When $wgPageImagesLeadSectionOnly is true restrict
the scoring of page images to images in the lead section.
This results in an additional parse of the lead section of the
content, but this should only happen once after an edit
has occurred and is deferred.
If false all images will be considered as candidates for
the page image choice.
Note that the choice between modes is per site not by
request. This is intentional to avoid having to store 4
different properties respecting license and
article position. As a result when enabling or disabling
this switch on existing setups, there will be a transitional
period where pages previously parsed will show pageimages
as calculated by the previous value of this config variable.
Bug: T87336
Change-Id: I09bdae82515f6e93f5606553259f10b3a10e9eaa
Trying to run this script in the cluster fatals out due to memory
problems somewhat regularly. The --start option helps to restart
it where it fell down, but when trying to run against hundreds of
wiki's that is a one-off solution that makes ensuring everything is
actually visited a pain.
To try and isolate errors add an option to push the parsing into the
job queue. There is still the possibility to miss pages, but job queue
retries should take care of us for the most part. Attempts to keep
load down on the databases by making sure no more than a specified
number of jobs are queued/processing at a given time.
Bug: T152155
Change-Id: I3a4e3a415b2f03de0bb36ac0515241e950130fde
The API accepts a new query parameter `license`, whose
value can either be `free` or `any`. `free` is the default value.
When the value of `licence` is:
* `free`, then only the best image whose copyright allows
reusing it will be returned;
* `any`, then the best image, regardless of its copyright
status, will be returned.
Bug: T131105
Change-Id: I83ac5266e382d2d121aff3f7d28711787251c03b