235 commits

Author SHA1 Message Date
jenkins-bot f3e9a38c96 Merge "Force the use of the category index when paging by category" 2024-10-21 21:38:03 +00:00
Umherirrender ad1e665d5b Use namespaced classes
Changes to the use statements done automatically via script

Change-Id: I2883f6ccb1b3019cddcc2c957d022acf2d5c0ff7
2024-10-20 09:25:08 +02:00
jenkins-bot a6f9c364bf Merge "SpecialLintErrors: Reduce code always run in findNamespace" 2024-10-18 23:47:44 +00:00
Arlo Breault be1f60b6ea Force the use of the category index when paging by category
The EXPLAIN in P70205#281191 shows the PRIMARY index being used,
resulting in a long scan in newer categories where all the linter_id are
higher.

Bug: T200517
Change-Id: I373692942121ff555565c9c2c310087cd097ef21
2024-10-17 22:10:24 -04:00
jenkins-bot 724f836b1b Merge "Wire migration scripts to SchemaHooks" 2024-10-16 23:33:51 +00:00
Reedy e352f39fc3 SpecialLintErrors: Reduce code always run in findNamespace
Array filtering/manipulation is only useful if wpNamespaceRestrictions is set and a useable value,
so only manipulate canonical namespace list if we're going to use it

Change-Id: Ib6d0884f396ca6e0b32817d9b4b90a0de36ba707
2024-10-14 20:28:16 +00:00
Reedy f80076f5aa SpecialLintErrors: Fix phan failure due to MW core change
Caused by 7be5a303d162ad525e69fa57820ac6108ce175b2

Bug: T377145
Change-Id: Id9cd01a392ef620aecac6254f643c53a8f94a8c3
2024-10-14 15:43:39 +01:00
Isabelle Hurbain-Palatin 165354ab91 Wire migration scripts to SchemaHooks
Add MigrateNamespace and MigrateTagTemplate as post database update
maintenance operations.

Bug: T367207
Change-Id: I7676f9ce4bef59febc463d897cb26d47347a3968
2024-10-10 13:52:30 +02:00
C. Scott Ananian e6a510fbed Use ParserOutputAccess for LintUpdate job
This avoids a duplicate parse with DiscussionTools (T376325) and also
reduces some redundancy by using the metrics-gathering code from
ParserOutput instead of having to clone it here.  Finally, it allows
the parse to use the output of a previous parse for selective
update.

Bug: T376325
Follows-Up: I64a4556a74da4f735a5b562070c21310ecda36d1
Change-Id: I11386e307caaa9fce34870b08bd4dce4c5e6eb25
2024-10-02 20:06:15 -04:00
jenkins-bot 27b5eaeaf8 Merge "Collect selective update statistics from LintUpdate job" 2024-09-27 19:01:52 +00:00
jenkins-bot 75cd94bc4f Merge "LintUpdate: use content handler instead of directly invoking ParsoidParser" 2024-09-27 19:01:51 +00:00
C. Scott Ananian 0937838f1e Collect selective update statistics from LintUpdate job
This ensures that all parsoid parses are accounted for in our
statistics.  In the future we might want to query the cache for
an existing 'dirty' parse in this codepath to potentially allow
for selective update, but for now assume that selective updates
are not possible here.

Bug: T371713
Depends-On: I5b8c7ab48d5a1d6c1e311149fcac6abdc523aa13
Change-Id: I391e928175f60a1ff2e5c181e20ed72efe4dfd66
2024-09-19 14:00:48 -04:00
Arlo Breault 0dad8f46b0 Add a "duplicate-ids" lint category
Bug: T200517
Change-Id: Ifc3aeb167de8ef1c9a686919408d1d6fbd85c581
2024-09-17 19:43:44 -04:00
C. Scott Ananian ba41d323f9 LintUpdate: use content handler instead of directly invoking ParsoidParser
We don't need to directly handle the ParsoidParserFactory in the
LintUpdate job; use the existing ContentHandler pathways to reduce
dependencies.

Change-Id: I64a4556a74da4f735a5b562070c21310ecda36d1
2024-09-17 16:44:37 -04:00
Bartosz Dziewoński 539946a92a Remove some redundant checks
$sleep must already be an integer (due to type hints).

Change-Id: I468ea23cafd2706bdeb23676d8aab85daa09a599
2024-08-12 23:04:28 +02:00
Arlo Breault ed8e449e13 Drop disabled lints
Covered by RecordLintJobTest::testDropInlineMediaCaptionLints

Change-Id: I564389ec9bd20cf36ec7a9bf96b1aebf7777cbbc
2024-07-25 11:29:36 -04:00
Amir Sarabadani 388106b0a4 Stop storing missing-image-alt-text lints
Bug: T370304
Change-Id: Ib473ddcc09c6c02e450a8664d21acbddaa7b2505
2024-07-22 02:59:21 +02:00
jenkins-bot 08fbe80da9 Merge "Fix the Linter category subpage search when namespace field is blank" 2024-07-19 06:50:34 +00:00
Arlo Breault 489fe5a912 Revert changes in log levels
I015fbe2ab613619c8805d12bfd397cc08450ef24 falsely assumed these were
ending up in logstash already but, unless explicitly asked, logstash
doesn't go below "info", regardless of the channel's level.

Change-Id: I55884a2535e839ca92d5d679cc4dc7911050f298
2024-07-11 18:59:58 -04:00
sbailey 85ea579c97 Fix the Linter category subpage search when namespace field is blank
* When the user does not specify any namespace in a category
  subpage search it should return all namespaces. This duplicates
  the behavior for when the category is first selected from the
  main Linter page.

Bug: T361081
Change-Id: Iccb195bf1b679e6e0165e4b1dde6e8d84db4d5b0
2024-07-10 09:22:31 -07:00
Arlo Breault 054abb7915 Change some log levels to debug so logs can be suppressed from Logstash
Failing to inject is redundant with EventBus logs.

Change-Id: I015fbe2ab613619c8805d12bfd397cc08450ef24
2024-07-02 20:13:08 -04:00
sbailey 0dfaa5523e Remove linter tag and template dual mode config and code
* Removed the write and user interface config variables and
  fixed the tests affected by their removal.

Bug: T331883
Change-Id: If44ceedae7278f498158b8cdd528dfa32be609eb
2024-06-14 15:40:47 -04:00
sbailey 72653441b2 Remove linter namespace field dual mode config and code
* Manual tests completed and query code reviewed

Bug: T331883
Change-Id: Ie1628799bb40ad74a24ab57a27a4176c2364fb82
2024-06-14 09:29:07 -07:00
Umherirrender 2f18de6366 Use namespaced classes
Changes to the use statements done automatically via script

Change-Id: I1ff7952946b8795b443f97896d557bbbb5ebe2dc
2024-06-09 18:38:49 +02:00
jenkins-bot 10a9c5be5a Merge "Trigger Parsoid run when page metadata is being updated" 2024-06-04 16:31:06 +00:00
daniel 8b22ad5d78 Trigger Parsoid run when page metadata is being updated
When RESTBase is turned off, Parsoid runs will no longer be triggered
on template changes. This creates a new mechanism to do that, based on
the RevisionDataUpdates hook called by DerivedPageDataUpdater. The new
behavior is controlled by a feature flag, LinterParseOnDerivedDataUpdate,
which is enabled by default. In WMF production, this should be
turned off as long as we are still triggering Parsoid parses through
the pregeneration mechanism in RESTBase.

Note that this will not write ParserOutput to the ParserCache. On edits,
pages will get parsed with Parsoid twice, once to trigger the lint data
update, and once by ParsoidCachePrewarmJob to populate the ParserCache.
Both parses will trigger the ParserLogLinterData hook; the lint data
from the second parse is redundant.

However, while ParsoidCachePrewarmJob and RevisionDataUpdates get
triggered together on edits, they also get triggered separately:
ParsoidCachePrewarmJob by page views with parser cache misses; and
RevisionDataUpdates when pages get invalidated due to template changes.

Because ParsoidCachePrewarmJob and RevisionDataUpdates generally get
triggered in different situations, it seems cleaner to keep the two
mechanisms independent of each other, and live with the duplicate parse
on edit.

Bug: T361013
Change-Id: If53841ee583ce240dd245d640b9ea9c97e1eaa55
2024-06-03 16:50:17 -05:00
Arlo Breault b6ad29e86b Catch jobqueue errors when recording a lint job
Since linting is currently tied up with read views, don't let a failure
to enqueue block a parse.

Bug: T364229
Change-Id: I9e8391d9f193aef72ca13ccda8ff6ab58ffc34da
2024-06-03 13:31:40 -04:00
Brooke Vibber ea186c1cee Add hidden lint missing-image-alt-text
Add support for new parsoid lint, missing-image-alt-text
Matches on images that don't have an alt text attribute at all
(empty alt attributes count as present).

Intended to make it easier to put workflows around these images,
including streamlined workflows for "microcontributions" in the
mobile apps.

As this has some impedance mismatch with usage of Special:LintErrors
this is marked as hidden (priority=none) so will not be displayed by
default, but has an enum value reserved for it and can be queried
explicitly.

Bug: T344378
Change-Id: I38cc1abbece3cca8155bec1f071b854027be0966
2024-05-29 19:03:41 -04:00
jenkins-bot cdd68ce599 Merge "Logging: add debug messages in Hooks and RecordLintJob" 2024-05-22 20:40:14 +00:00
daniel 6c07b92097 Logging: add debug messages in Hooks and RecordLintJob
Additional debug logging allows us to verify the upcoming changes
in If53841ee583ce in production.

Bug: T361013
Change-Id: I261aacc1c9fa6483d88e94424d1f77d861f1a990
2024-05-22 21:06:50 +02:00
Arlo Breault 96d3f6814c Fix regression clearing lints on page deletion
This had previously been fixed in
I2610b9b16d4032b0e18b3537cc9ed51bfdaff299 but a poor refactoring in
Ib3d3622144b670ebe1a4ce04e6db6811584d42c8 reintroduced it.

Bug: T363682
Change-Id: I378e802753c4284e7c5ec65148b43e0b41784cf3
2024-05-04 14:53:17 +03:00
Arlo Breault 68dd2651bf Suppress hidden categories from subpage prefix searches
i.e., when you type "Special:LintErrors/" in the search box.

Bug: T334527
Change-Id: I0bb478086d22b65ce8c5ad48db7f522ac974d95d
2024-04-18 20:23:37 -04:00
Arlo Breault 22c1bfb865 Omit lints in hidden categories from search results
Presumably it would be better if category priorities lived in their own
table so that we could do a join rather than an ever-growing WHERE IN
clause.  That would help Quarry users as well.

Bug: T334527
Change-Id: Ibd535a54565f6f474346c44ad7597fa0532faf6c
2024-04-18 20:23:32 -04:00
Arlo Breault 261339c2a3 Inject Database into TotalsLookup
Change-Id: I01e6b89b4ce9b1cea241bba9cad7ef6673803166
2024-04-11 12:24:42 -04:00
Arlo Breault ffc266eae6 Drop DatabaseFactory, just have Database as the service
Change-Id: Id25271c82bc7ba833d32dff3fb11d3dfe15a3f02
2024-04-10 21:21:40 -04:00
Arlo Breault c04b075858 Stop constructing Database with a page id
Instead, pass the page id when using methods for a page.  The change
avoids constructing Database with a dummy page id when those methods
aren't going to be used.

getFromId doesn't seem like it needs a page id, since the linter id is
the primary key.

Also, a namespace id should no longer be optional for setForPage.  The
LinterWriteNamespaceColumnStage option already gates whether to include
it in the row.

Follows-Up: I9fd6e7724dcf33be0b1feb19ec8eb448738cab09
Change-Id: Ib3d3622144b670ebe1a4ce04e6db6811584d42c8
2024-04-10 21:07:08 -04:00
Arlo Breault 1c53684200 Construct services with ServiceOptions
And addresses some other cleanup from review comments.

Follows-Up: I9fd6e7724dcf33be0b1feb19ec8eb448738cab09
Change-Id: If87b0bf91930f0f8d89ed046d18aadb8f346f9aa
2024-04-10 12:34:05 -04:00
C. Scott Ananian 4f991b5d0c [DI] Clean up LintErrorsPager
Inject the services required by LintErrorsPager from the SpecialLintErrors
class.

Change-Id: Ie20e00cccef895fbad8536a94dfc1978f20c4220
2024-04-09 18:35:34 -04:00
C. Scott Ananian 633d6024a4 [DI] Make TotalsLookup an injectable service
Change-Id: I71d41ca5b0a901afd59950b3539d8e19c4cead5f
2024-04-09 18:35:32 -04:00
C. Scott Ananian 24f771a6a3 [DI] Make CategoryManager and Database injectable services
Change-Id: I9fd6e7724dcf33be0b1feb19ec8eb448738cab09
2024-04-09 18:33:13 -04:00
C. Scott Ananian fde916fff5 [DI] Use dependency injection for RecordLintJob
Change-Id: I3b8cd95e075af92c77a7dec4f12a0a81eab3ae4b
2024-04-04 21:42:10 -04:00
C. Scott Ananian c983a822e3 [DI] Use dependency injection for Hooks
Change-Id: I23f56b0a3df1ef206ec160453294349d2482435f
2024-04-04 18:43:13 -04:00
C. Scott Ananian d71a297781 [DI] Use dependency injection for ApiQueryLinterStats
Change-Id: I5f5d3a226a9f7b733a6f07200216a1192115b102
2024-04-04 18:43:13 -04:00
C. Scott Ananian d8970278d1 [DI] Use dependency injection for SpecialLintErrors
Change-Id: I211d70d5fb4a321cf302cc10f6e160480468a347
2024-04-04 18:43:10 -04:00
Arlo Breault 6304fc5e08 Stop exposing hidden categories in siteinfo
Suppresses them from ?action=query&meta=siteinfo

Bug: T334527
Change-Id: I325e78e438a8385948071d2b4ba8a8c4407d5fc4
2024-04-04 16:04:39 -04:00
Arlo Breault 8d49b68ba5 Move Database::updateStats to TotalsLookup
Database::updateStats moved to Database from RecordLintJob in
I2610b9b16d4032b0e18b3537cc9ed51bfdaff299 for reuse in Hooks but seems
better placed on TotalsLookup.

Change-Id: I600853e5cfc9e8abae9c6b07cee4c2adc37ef464
2024-04-02 17:12:24 -04:00
Arlo Breault 397b36e8e3 Don't include hidden category counts in page info
Bug: T337275
Bug: T334527
Change-Id: I6439df894c06fc5592422e72dac04150591f4033
2024-04-02 15:12:19 -04:00
Arlo Breault d6514cfa3b Fix invisible categories in ApiQueryLintErrors
Invisible categories are permitted as categories, just not part of the
default set.  'invisible-categories' is removed, since it never worked.

Bug: T360064
Bug: T334527
Change-Id: Ie6b7a6d83349cbd2899e78bc18cc1629d710c6f0
2024-04-01 21:43:24 -04:00
Umherirrender 91848725e7 Replace isset() with null check in HtmlTags
Found usage of isset() on expression self::$allowedHtmlTags that appears
to be always set. isset() should only be used to suppress errors. Check
whether the expression is null instead.
See https://www.mediawiki.org/wiki/Manual:Coding_conventions/PHP#isset

Change-Id: I21483aab05292cfb802ff6a5e63013ecc02f5c13
2024-04-01 13:47:54 +02:00
Tim Starling 4dd75df2e8 Fix index usage when searching for page titles
When searching for a specific page title, it's necessary to specify
page_namespace, not just linter_namespace, so that the relevant index in
the page table can be used.

Submitting the form with an empty namespace box led to a search for
namespace zero, because getCheck() returns true for an empty string.
It's not easy to search for a title part in all namespaces. So drop
that hidden feature and interpret a title part with a missing namespace
as being a search for namespace 0.

It's possible to search for a category with an empty title and zero or
more namespaces. Implement the namespace filter in this case using the
linter_namespace field. But ignore the namespace filter if there is no
category, since there is no index on linter_namespace alone.

Bug: T360865
Change-Id: I00934eaaf1a99e4098f177166b43069d33d9f137
2024-03-27 11:44:59 +11:00
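
The empty-namespace behaviour described in the last entry above (a presence-only check such as getCheck() treating a blank form field as set, so the search silently targets namespace 0) can be illustrated with a small sketch. This is not the extension's PHP code: the function names and the form field names "namespace" and "titlesearch" are hypothetical, chosen only to show the pitfall and the interpretation the commit message settles on.

```python
from typing import Mapping, Optional


def get_check(params: Mapping[str, str], name: str) -> bool:
    # Hypothetical stand-in for a presence-only check:
    # an empty submitted field still counts as "set".
    return name in params


def naive_namespace(params: Mapping[str, str]) -> Optional[int]:
    # Pitfall: a blank field passes the presence check and is coerced to 0,
    # so an empty namespace box becomes a search of namespace 0.
    if get_check(params, "namespace"):
        return int(params["namespace"] or 0)
    return None


def resolve_namespace(params: Mapping[str, str]) -> Optional[int]:
    # Treat a blank field as missing. Following the commit message, a title
    # search with no namespace is then read explicitly as namespace 0;
    # with no title either, no namespace filter applies (None).
    raw = params.get("namespace", "").strip()
    if raw:
        return int(raw)
    return 0 if params.get("titlesearch", "").strip() else None
```

With these definitions, `naive_namespace({"namespace": ""})` yields 0 by accident, whereas `resolve_namespace` only returns 0 when a title part is actually present.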