The query itself is too expensive to be run on large Wikimedia wikis. So
put it behind WAN cache and touch the check keys for each category
whenever those have errors added or deleted from them.
If this happens to get out of sync, it will get fully refreshed
regularly when the totals are sent to statsd.
WANObjectCache's 'lockTSE' feature will help avoid cache stampedes that
made this query expensive in the past.
Change-Id: I3774103a29fa0f29d36283950f136259fa71bffe
This way we can track the progress of individual wikis in cleaning up
errors. The wiki name is at the end of the key so we can still use
e.g. "linter.category.$name.*" to see across all wikis at once.
Change-Id: I62463b9256e125d32d97396bd939334d71b46027
Move location to two separate columns in the database: linter_start and
linter_end. This allows us to have the database enforce the uniqueness
of those fields, instead of just relying upon the PHP code to do so,
which could be bypassed since we have multiple servers and concurrent
processes.
Change-Id: I3e67ce1b7cb3c93866a388ec3248af4cff2a81e0
It's possible to have duplicate, identical lint errors if the same exact
error is repeated in a template transclusion (e.g. {{1x|<b/> <b/>}})
since the position via dsr is the same. In this case, just de-duplicate
the errors since we can't differentiate them.
At the same time, trim excessive errors on the same page in the same
category. It's most likely that if a page has that many of the same
errors, the editor or bot will just fix all of them at the same time, so
we don't need to include all of them in the database. 20 is kind of a
low value, but we can always increase it later on as necessary.
Change-Id: I9cded720169870d0eea574e1a930ce4e9b190ac0
The job queue will allow us to have better flood control and rate
limiting instead of trying to do all the database writes as soon as
parsoid contacts MediaWiki.
On the downside, this means it may take longer for changes to be
reflected in the database and to users, but we already have no promise
for that, so it seems okay.
Note that if you don't have a job queue runner set up, you'll need to
run the runJobs.php script every time to have the jobs execute.
Change-Id: I25fd54734aca4dab09711e7f6aee027654931300