It looks like the reason these jobs arn't processing is because
abandoned jobs are being considered as part of the acquired jobs
count, so if a job gets abandoned it keeps taking up a slot in
our job pressure calculation. These shouldn't count as pressure
because they are not running anymore.
Change-Id: I44fbce2b7dc47345ab0e3745d1653f418d75943d
When $wgPageImagesLeadSectionOnly is true restrict
the scoring of page images to images in the lead section.
This results in an additional parse of the lead section of the
content, but this should only happen once after an edit
has occurred and is deferred.
If false all images will be considered as candidates for
the page image choice.
Note that the choice between modes is per site not by
request. This is intentional to avoid having to store 4
different properties respecting license and
article position. As a result when enabling or disabling
this switch on existing setups, there will be a transitional
period where pages previously parsed will show pageimages
as calculated by the previous value of this config variable.
Bug: T87336
Change-Id: I09bdae82515f6e93f5606553259f10b3a10e9eaa
Very few of these jobs seem to be finishing, due to some
replicas being so far behind we get DBReplicationWaitError
thrown, which causes the job to be restarted. Catch and log,
but don't stop processing because of it.
There is also a problem with jobs that blow out the memory limits,
but not sure what to do with that.
Change-Id: Idffcfad76936f5e62e9018c58f2cb57db35af4b8
Trying to run this script in the cluster fatals out due to memory
problems somewhat regularly. The --start option helps to restart
it where it fell down, but when trying to run against hundreds of
wiki's that is a one-off solution that makes ensuring everything is
actually visited a pain.
To try and isolate errors add an option to push the parsing into the
job queue. There is still the possibility to miss pages, but job queue
retries should take care of us for the most part. Attempts to keep
load down on the databases by making sure no more than a specified
number of jobs are queued/processing at a given time.
Bug: T152155
Change-Id: I3a4e3a415b2f03de0bb36ac0515241e950130fde
The mobileview API directly accesses the page image,
bypassing the PageImages API.
I1d35e965dc37c8c4ecdcc43313b3198e951e1978 fixes the issue for
the PageImages API.
This patch fixes the issue for the mobileview API.
Change-Id: If6cbb82f01fa298945c615a2f25e972a9d767d58
Page images was updated to have a split between the 'best' page
image, and the best free page image. Unfortunately the deployment
plan didn't take into account that the default 'free' would be
pointing to an unpopulated page prop, which will not be populated
until LinksUpdate has run for every page on every wiki which could
take weeks or months.
To restore some semblance of order, make the default point at the
currently populated field. A followup will need to be done to
populate the appropriate field.
Bug: T152155
Change-Id: I1d35e965dc37c8c4ecdcc43313b3198e951e1978
The API accepts a new query parameter `license`, whose
value can either be `free` or `any`. `free` is the default value.
When the value of `licence` is:
* `free`, then only the best image whose copyright allows
reusing it will be returned;
* `any`, then the best image, regardless of its copyright
status, will be returned.
Bug: T131105
Change-Id: I83ac5266e382d2d121aff3f7d28711787251c03b
This fixes a fatal which is blocking deployment of 1.29.0-wmf.1
refs T149059 fixes T149849
Bug: 149849
Change-Id: Ieb61585af3aa60b7af58597091151d0b494b2fd6
* Do not "return true" in hook handlers. I was told it's good practice
to either return false on failure, or nothing/null.
* Remove "static" from method that does not need to be static.
* Make some type hints more specific.
* Add missing imports. I wonder how this can work without the imports.
My PHPStorm complains.
Change-Id: Ia0e980ff99f0e004d700b22dd07ff17f04bed4ec
Fetch the extended metadata for every image, and if the NonFree key
(provided by CommonsMetadata) is present, rank down the image.
The check is run on LinksUpdate so changes to image copyright
templates won't have effect until the page is reparsed.
Unlike other weights, the work of extracting the weight factor from
the file/page is wholly done on LinksUpdate, instead of doing most
of it on parse and storing as ParserOutput extension metadata.
This is to avoid getting the article page parsing and the file
description page parsing conflicting with each other.
Bug: T124225
Change-Id: I21111ecbc80ded864806a2a93002f3254c3c68a9