The EXPLAIN in P70205#281191 shows the PRIMARY index being used,
resulting in a long scan in newer categories, where all the linter_id
values are higher.
Bug: T200517
Change-Id: I373692942121ff555565c9c2c310087cd097ef21
Array filtering/manipulation is only useful if wpNamespaceRestrictions is
set to a usable value, so only manipulate the canonical namespace list if
we're going to use it.
Change-Id: Ib6d0884f396ca6e0b32817d9b4b90a0de36ba707
Add MigrateNamespace and MigrateTagTemplate as post-database-update
maintenance operations.
Bug: T367207
Change-Id: I7676f9ce4bef59febc463d897cb26d47347a3968
This avoids a duplicate parse with DiscussionTools (T376325) and also
reduces some redundancy by using the metrics-gathering code from
ParserOutput instead of having to clone it here. Finally, it allows
the parse to use the output of a previous parse for selective
update.
Bug: T376325
Follows-Up: I64a4556a74da4f735a5b562070c21310ecda36d1
Change-Id: I11386e307caaa9fce34870b08bd4dce4c5e6eb25
This ensures that all Parsoid parses are accounted for in our
statistics. In the future we might want to query the cache for
an existing 'dirty' parse in this codepath to potentially allow
for selective update, but for now assume that selective updates
are not possible here.
Bug: T371713
Depends-On: I5b8c7ab48d5a1d6c1e311149fcac6abdc523aa13
Change-Id: I391e928175f60a1ff2e5c181e20ed72efe4dfd66
We don't need to directly handle the ParsoidParserFactory in the
LintUpdate job; use the existing ContentHandler pathways to reduce
dependencies.
Change-Id: I64a4556a74da4f735a5b562070c21310ecda36d1
I015fbe2ab613619c8805d12bfd397cc08450ef24 falsely assumed these were
ending up in logstash already but, unless explicitly asked, logstash
doesn't go below "info", regardless of the channel's level.
Change-Id: I55884a2535e839ca92d5d679cc4dc7911050f298
* When the user does not specify any namespace in a category
subpage search it should return all namespaces. This duplicates
the behavior for when the category is first selected from the
main Linter page.
Bug: T361081
Change-Id: Iccb195bf1b679e6e0165e4b1dde6e8d84db4d5b0
* Removed the write and user interface config variables and
fixed the tests affected by their removal.
Bug: T331883
Change-Id: If44ceedae7278f498158b8cdd528dfa32be609eb
When RESTBase is turned off, Parsoid runs will no longer be triggered
on template changes. This creates a new mechanism to do that, based on
the RevisionDataUpdates hook called by DerivedPageDataUpdater. The new
behavior is controlled by a feature flag, LinterParseOnDerivedDataUpdate,
which is enabled by default. In WMF production, this should be
turned off as long as we are still triggering Parsoid parses through
the pregeneration mechanism in RESTBase.
Note that this will not write ParserOutput to the ParserCache. On edits,
pages will get parsed with Parsoid twice, once to trigger the lint data
update, and once by ParsoidCachePrewarmJob to populate the ParserCache.
Both parses will trigger the ParserLogLinterData hook; the lint data
from the second parse is redundant.
However, while ParsoidCachePrewarmJob and RevisionDataUpdates get
triggered together on edits, they also get triggered separately:
ParsoidCachePrewarmJob by page views with parser cache misses; and
RevisionDataUpdates when pages get invalidated due to template changes.
Because ParsoidCachePrewarmJob and RevisionDataUpdates generally get
triggered in different situations, it seems cleaner to keep the two
mechanisms independent of each other, and live with the duplicate parse
on edit.
Bug: T361013
Change-Id: If53841ee583ce240dd245d640b9ea9c97e1eaa55
Since linting is currently tied up with read views, don't let a failure
to enqueue block a parse.
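
The intent can be sketched as follows. This is a minimal Python
illustration, not the extension's PHP code; parse_and_lint and both
callables are hypothetical names chosen for the example.

```python
import logging

logger = logging.getLogger("linter-sketch")


def parse_and_lint(parse, enqueue_lint_job, page):
    """Parse a page and try to enqueue a lint job for the result.

    `parse` and `enqueue_lint_job` are hypothetical callables standing in
    for the real parser and job queue.
    """
    output = parse(page)
    try:
        enqueue_lint_job(page, output)
    except Exception:
        # A queue failure is logged, but the read view still gets its parse.
        logger.exception("Failed to enqueue lint job for %s", page)
    return output
```

The parse result is returned whether or not the enqueue succeeded, so a
job queue outage degrades lint freshness rather than page views.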
Bug: T364229
Change-Id: I9e8391d9f193aef72ca13ccda8ff6ab58ffc34da
Add support for a new Parsoid lint, missing-image-alt-text.
It matches images that don't have an alt attribute at all
(empty alt attributes count as present).
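
The matching rule can be illustrated with a small Python sketch over
plain HTML. It is an assumption-laden stand-in: the real lint runs
inside Parsoid on its own DOM, and images_missing_alt is a hypothetical
helper name.

```python
from html.parser import HTMLParser


class MissingAltFinder(HTMLParser):
    """Collect the src of every <img> that has no alt attribute at all."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        names = {name for name, _ in attrs}
        # alt="" and a bare `alt` both count as present; only a missing
        # attribute is reported.
        if "alt" not in names:
            self.missing.append(dict(attrs).get("src"))


def images_missing_alt(html):
    finder = MissingAltFinder()
    finder.feed(html)
    finder.close()
    return finder.missing
```
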
Intended to make it easier to put workflows around these images,
including streamlined workflows for "microcontributions" in the
mobile apps.
As this has some impedance mismatch with usage of Special:LintErrors,
it is marked as hidden (priority=none), so it will not be displayed by
default, but it has an enum value reserved for it and can be queried
explicitly.
Bug: T344378
Change-Id: I38cc1abbece3cca8155bec1f071b854027be0966
Additional debug logging allows us to verify the upcoming changes
in If53841ee583ce in production.
Bug: T361013
Change-Id: I261aacc1c9fa6483d88e94424d1f77d861f1a990
This had previously been fixed in
I2610b9b16d4032b0e18b3537cc9ed51bfdaff299 but a poor refactoring in
Ib3d3622144b670ebe1a4ce04e6db6811584d42c8 reintroduced it.
Bug: T363682
Change-Id: I378e802753c4284e7c5ec65148b43e0b41784cf3
Presumably it would be better if category priorities lived in their own
table so that we could do a join rather than an ever-growing WHERE ... IN
clause. That would help Quarry users as well.
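
To make the comparison concrete, here is a rough sketch using Python's
sqlite3 module. The category_priority table, its columns, and the sample
rows are hypothetical; only linter_id, linter_page and linter_cat are
meant to echo the existing linter table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE linter (
    linter_id INTEGER PRIMARY KEY,
    linter_page INTEGER,
    linter_cat INTEGER
);
-- Hypothetical table holding one priority per category id.
CREATE TABLE category_priority (
    cat_id INTEGER PRIMARY KEY,
    priority TEXT
);
INSERT INTO linter VALUES (1, 10, 1), (2, 10, 2), (3, 11, 3);
INSERT INTO category_priority VALUES (1, 'high'), (2, 'low'), (3, 'high');
""")

# Current shape: the caller must list every category id of the priority.
in_rows = conn.execute(
    "SELECT linter_id FROM linter WHERE linter_cat IN (1, 3) "
    "ORDER BY linter_id"
).fetchall()

# Proposed shape: the priority is looked up through a join instead.
join_rows = conn.execute(
    "SELECT l.linter_id FROM linter AS l "
    "JOIN category_priority AS p ON p.cat_id = l.linter_cat "
    "WHERE p.priority = 'high' ORDER BY l.linter_id"
).fetchall()
```

With the join, adding a category only requires a new row in the
priority table; the query text stays the same.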
Bug: T334527
Change-Id: Ibd535a54565f6f474346c44ad7597fa0532faf6c
Instead, pass the page id when using methods for a page. The change
avoids constructing Database with a dummy page id when those methods aren't
going to be used.
getFromId doesn't seem like it needs a page id, since the linter id is
the primary key.
Also, a namespace id should no longer be optional to setForPage. The
LinterWriteNamespaceColumnStage option already gates whether to include
it in the row.
Follows-Up: I9fd6e7724dcf33be0b1feb19ec8eb448738cab09
Change-Id: Ib3d3622144b670ebe1a4ce04e6db6811584d42c8
And addresses some other cleanup from review comments.
Follows-Up: I9fd6e7724dcf33be0b1feb19ec8eb448738cab09
Change-Id: If87b0bf91930f0f8d89ed046d18aadb8f346f9aa
Database::updateStats was moved to Database from RecordLintJob in
I2610b9b16d4032b0e18b3537cc9ed51bfdaff299 for reuse in Hooks, but it seems
better placed on TotalsLookup.
Change-Id: I600853e5cfc9e8abae9c6b07cee4c2adc37ef464
Invisible categories are permitted as categories, just not part of the
default set. 'invisible-categories' is removed, since it never worked.
Bug: T360064
Bug: T334527
Change-Id: Ie6b7a6d83349cbd2899e78bc18cc1629d710c6f0
Found usage of isset() on expression self::$allowedHtmlTags that appears
to be always set. isset() should only be used to suppress errors. Check
whether the expression is null instead.
See https://www.mediawiki.org/wiki/Manual:Coding_conventions/PHP#isset
Change-Id: I21483aab05292cfb802ff6a5e63013ecc02f5c13
When searching for a specific page title, it's necessary to specify
page_namespace, not just linter_namespace, so that the relevant index in
the page table can be used.
Submitting the form with an empty namespace box led to a search for
namespace zero, because getCheck() returns true for an empty string.
It's not easy to search for a title part in all namespaces. So drop
that hidden feature and interpret a title part with a missing namespace
as being a search for namespace 0.
It's possible to search for a category with an empty title and zero or
more namespaces. Implement the namespace filter in this case using the
linter_namespace field. But ignore the namespace filter if there is no
category, since there is no index on linter_namespace alone.
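
The resulting rules can be summarised in a short Python sketch. It is
illustrative only: build_conditions, the returned dict shape, and the
page_title_prefix key are hypothetical, and it assumes an explicitly
chosen namespace list is kept as-is for title searches.

```python
from typing import Optional, Sequence


def build_conditions(title: str,
                     namespaces: Sequence[int],
                     category: Optional[str]) -> dict:
    """Return column -> value conditions for a lint error search."""
    conds = {}
    if category is not None:
        conds["linter_cat"] = category
    if title:
        # Title search: always constrain page_namespace so the page table
        # index can be used; a missing namespace means namespace 0.
        conds["page_namespace"] = list(namespaces) if namespaces else [0]
        conds["page_title_prefix"] = title
    elif category is not None and namespaces:
        # Category search with an empty title: filter on linter_namespace.
        conds["linter_namespace"] = list(namespaces)
    # No title and no category: the namespace filter is ignored, since
    # there is no index on linter_namespace alone.
    return conds
```
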
Bug: T360865
Change-Id: I00934eaaf1a99e4098f177166b43069d33d9f137