This avoids a duplicate parse with DiscussionTools (T376325) and also
reduces some redundancy by using the metrics-gathering code from
ParserOutput instead of having to clone it here. Finally, it allows
the parse to use the output of a previous parse for selective
update.
Bug: T376325
Follows-Up: I64a4556a74da4f735a5b562070c21310ecda36d1
Change-Id: I11386e307caaa9fce34870b08bd4dce4c5e6eb25
This ensures that all Parsoid parses are accounted for in our
statistics. In the future we might want to query the cache for
an existing 'dirty' parse in this codepath to potentially allow
for selective update, but for now assume that selective updates
are not possible here.
Bug: T371713
Depends-On: I5b8c7ab48d5a1d6c1e311149fcac6abdc523aa13
Change-Id: I391e928175f60a1ff2e5c181e20ed72efe4dfd66
We don't need to directly handle the ParsoidParserFactory in the
LintUpdate job; use the existing ContentHandler pathways to reduce
dependencies.
Change-Id: I64a4556a74da4f735a5b562070c21310ecda36d1
Historically, Parsoid would suppress emitting lints for fostered content
that was purely rendering-transparent, since it's common to put include
directives, categories, etc. in fosterable position and it made no
difference to rendering.
However, clients like DiscussionTools can benefit from this knowledge,
especially outside of templated content where it could result in edit
corruptions.
A separate category is used to avoid disrupting the work of editors
cleaning up lints in the "fostered" category, as in T369317.
Bug: T371142
Bug: T290936
Bug: T369317
Change-Id: I3519d86898df262eaea1a3303130453497ff27aa
It's not actively used at this time and it's causing a lot of writes,
affecting production. Disabling it should be harmless and reduce load.
Bug: T370304
Change-Id: I2170f657088993dd3fb81a9601284d3af7fc1883
* Removed the write and user interface config variables and
fixed the tests affected by their removal.
Bug: T331883
Change-Id: If44ceedae7278f498158b8cdd528dfa32be609eb
When RESTBase is turned off, Parsoid runs will no longer be triggered
on template changes. This creates a new mechanism to do that, based on
the RevisionDataUpdates hook called by DerivedPageDataUpdater. The new
behavior is controlled by a feature flag, LinterParseOnDerivedDataUpdate,
which is enabled by default. In WMF production, this should be
turned off as long as we are still triggering Parsoid parses through
the pregeneration mechanism in RESTBase.
Note that this will not write ParserOutput to the ParserCache. On edits,
pages will get parsed with Parsoid twice, once to trigger the lint data
update, and once by ParsoidCachePrewarmJob to populate the ParserCache.
Both parses will trigger the ParserLogLinterData hook, so the lint data
from the second parse is redundant.
However, while ParsoidCachePrewarmJob and RevisionDataUpdates get
triggered together on edits, they also get triggered separately:
ParsoidCachePrewarmJob by page views with parser cache misses; and
RevisionDataUpdates when pages get invalidated due to template changes.
Because ParsoidCachePrewarmJob and RevisionDataUpdates generally get
triggered in different situations, it seems cleaner to keep the two
mechanisms independent of each other, and live with the duplicate parse
on edit.
Bug: T361013
Change-Id: If53841ee583ce240dd245d640b9ea9c97e1eaa55
Add support for new Parsoid lint, missing-image-alt-text.
Matches images that don't have an alt attribute at all
(empty alt attributes count as present).
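To illustrate the matching rule only (absent alt attribute versus empty alt attribute), here is a minimal sketch in Python. It is not Parsoid's implementation; the class and function names are hypothetical and it works on plain HTML with the standard library parser.

```python
from html.parser import HTMLParser


class MissingAltFinder(HTMLParser):
    """Collects the src of every <img> that has no alt attribute at all."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attr_names = {name for name, _value in attrs}
        # An empty alt (alt="" or a bare alt) still counts as present.
        if "alt" not in attr_names:
            self.missing.append(dict(attrs).get("src"))


def find_images_missing_alt(html):
    """Return the src values of images that lack an alt attribute."""
    finder = MissingAltFinder()
    finder.feed(html)
    finder.close()
    return finder.missing
```

Under this rule, only images with no alt attribute are reported; images with alt="" are treated as deliberately decorative and skipped.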
Intended to make it easier to put workflows around these images,
including streamlined workflows for "microcontributions" in the
mobile apps.
As this has some impedance mismatch with the usage of Special:LintErrors,
this is marked as hidden (priority=none), so it will not be displayed by
default, but it has an enum value reserved for it and can be queried
explicitly.
Bug: T344378
Change-Id: I38cc1abbece3cca8155bec1f071b854027be0966
This reverts commit e2c7746818.
Reason for revert: Culprit is actually on the Parsoid side (reporting a new linter type before the Linter extension knows about it, transiently during deploy) and this revert makes that problem worse not better.
Change-Id: Ib0c1fab8b8e9536e90591a58da673931f5bddf4c
* Seems reasonable to enable this category; it has been in the
database and accessible through Quarry reports for months.
Bug: T341369
Change-Id: Id80a0c02b5948ba9bdc56e762b781d764844afcb
- Divide 'missing-end-tag' into 2 categories:
  * 'missing-end-tag': kept the same (minus the new category's results)
  * 'missing-end-tag-in-heading': high priority
Depends-On: I8397a24e85ca9f5a9ce6413dec5efa8c401a9960
Bug: T308398
Change-Id: I5738abd522bf5248e4b7b1255920055182e6261f
Add support for categories with priority 'none', which allows access to the category page without showing the category in the listing on the special page.
Bug: T334527
Change-Id: I8397a24e85ca9f5a9ce6413dec5efa8c401a9960
* Tag and Template search is enabled using config variable
'LinterUserInterfaceTagAndTemplateStage' and also checks for
the linter table column 'linter_tag' to exist to protect the
report code from error if the column is absent. As the linter
table alter maintenance added both the linter_tag and
linter_template at the same time, there is no reason to check
both. The user interface code does not check for the field's
presence, only the config variable.
* This code depends on the recordLintJob code writing the tag
and template data which is enabled by the config variable
'LinterWriteTagAndTemplateColumnsStage' and also assumes the
data migration maintenance script migrateTagTemplate.php has
been run to populate linter error records created prior to
the table alter and the write code being enabled.
Bug: T175177
Change-Id: I2f951dfcd34e3dc6ca17e8754cfaeba8baa3e835
* This performance improvement patch uses the namespace from the
new field 'linter_namespace' in the linter table instead of
the 'page_namespace' in the page table. It checks for and
requires the presence of the linter_namespace field in the
linter table, as well as the config variable
'LinterUseNamespaceColumnStage' being set to true.
* If the linter_namespace field is present and aforementioned
config variable is true, the code assumes that the config
variable 'LinterWriteNamespaceColumnStage' is set true and
recording the linter_namespace for new lint errors is
active and the migrateNamespace.php migrate code has been
run to migrate the page_namespace data into existing linter
records that were created before the linter_namespace column
existed and were left NULL during the table alter maintenance
operation.
* A follow-up patch should remove the configuration variables and
conditional code, producing the final, refactored code dependent
on the new namespace column index.
Bug: T299612
Change-Id: I4a1497d9e4dcd6a9a7befdaccf3e34c61694365d
The action=record-lint was a hack that allowed Parsoid/JS to send data
to MediaWiki to be stored in the linter database. Thankfully we no
longer need it in the glorious Parsoid/PHP world because it can write
directly to the database.
The API module, i18n messages and $wgLinterSubmitterWhitelist are all
removed.
Bug: T329992
Change-Id: Iba70e05a2e28f4ecd02101cff51993ebe65f19d0
* Having separate config variables to enable the maintenance
migrateNamespace and migrateTagTemplate scripts is duplicative;
the scripts should share the write enable config variables.
Bug: T329342
Change-Id: I4cb453fc0678b065cb42a2ca59863da1ab9cdbe4
action=info has a summary table of number of lint errors by category,
but we have richer information available via Special:LintErrors. If
there is a "Lint errors" section, provide a link below the table to
Special:LintErrors for the errors on this page.
Update ApiRecordLint for the new Hooks constructor and leave a FIXME
to eliminate the coupling.
Bug: T301374
Change-Id: Ic1fcf42b50d1392ac53201ceb256691133cf62ff
This hook is not allowed to have a service, so before we can add
services to the main Hooks class, it needs to be split out.
Change-Id: Ia7b4b8bf7c91ebb851c5de9f0f54f56b0993bf83
* The migrate code is designed to perform a one-time update of
linter_params JSON encoded template and tag information into
the new discrete template and tag text fields for use as
additional search criteria. The function can be restarted if
it is interrupted.
* It now uses configurable batching and sleep times between
batches to allow the database to do other work and replication
to occur without stressing infrastructure.
* The migrate code is currently only called by test code; it still
needs to be invoked once from a maintenance script.
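
The batching approach described above can be sketched roughly as follows. This is an illustration in Python with sqlite3, not the extension's PHP code: the simplified table schema, the function name, and the JSON keys ("name" for the tag, "templateInfo.name" for the template) are assumptions made for the example.

```python
import json
import sqlite3
import time


def migrate_tag_template(conn, batch_size=100, sleep_seconds=0.0):
    """Copy tag/template names from JSON linter_params into discrete columns.

    Rows are walked in primary-key order, one batch at a time. Rows that
    already have a tag or template are left alone, so an interrupted run
    can simply be started again.
    """
    last_id = 0
    updated = 0
    while True:
        rows = conn.execute(
            "SELECT linter_id, linter_params FROM linter"
            " WHERE linter_id > ? ORDER BY linter_id LIMIT ?",
            (last_id, batch_size),
        ).fetchall()
        if not rows:
            return updated
        for linter_id, raw_params in rows:
            params = json.loads(raw_params) if raw_params else {}
            tag = params.get("name")  # assumed key
            template = (params.get("templateInfo") or {}).get("name")  # assumed key
            if tag is None and template is None:
                continue
            cursor = conn.execute(
                "UPDATE linter SET linter_tag = ?, linter_template = ?"
                " WHERE linter_id = ?"
                " AND linter_tag IS NULL AND linter_template IS NULL",
                (tag, template, linter_id),
            )
            updated += cursor.rowcount
        conn.commit()
        last_id = rows[-1][0]
        # Pause between batches so replication and other work can proceed.
        time.sleep(sleep_seconds)
```

Committing per batch keeps transactions short, and the sleep between batches is the knob that trades migration speed for database headroom.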
Bug: T175177
Change-Id: Idc4ca88d4762bc7a3bcbc4e66c0f275562083867
* Migrates namespace info from the page table's page_namespace field
to the new linter table field linter_namespace. This duplication
of the namespace value was requested to greatly reduce the amount
of database activity required by the linter search and reporting
code.
* This patch has been prepared as a dark launch patch enabled with
config value LinterMigrateNamespaceStage and assumes that the
Linter table has had the linter_namespace column added to it,
and recording of the namespace field is already enabled and is
populating the namespace column.
* The migrate code is now runnable from the Linter/maintenance directory
using migrateNamespace.php, which will be deployed in a separate
patch. The maintenance code creates an appropriate environment
to call migrateNamespace() in Database.php.
Bug: T299612
Change-Id: I73cb80729d6a5a8716fe93164ad1e42e6958d672