It looks like the reason these jobs arn't processing is because
abandoned jobs are being considered as part of the acquired jobs
count, so if a job gets abandoned it keeps taking up a slot in
our job pressure calculation. These shouldn't count as pressure
because they are not running anymore.
Change-Id: I44fbce2b7dc47345ab0e3745d1653f418d75943d
Trying to run this script in the cluster fatals out due to memory
problems somewhat regularly. The --start option helps to restart
it where it fell down, but when trying to run against hundreds of
wiki's that is a one-off solution that makes ensuring everything is
actually visited a pain.
To try and isolate errors add an option to push the parsing into the
job queue. There is still the possibility to miss pages, but job queue
retries should take care of us for the most part. Attempts to keep
load down on the databases by making sure no more than a specified
number of jobs are queued/processing at a given time.
Bug: T152155
Change-Id: I3a4e3a415b2f03de0bb36ac0515241e950130fde