* UI and functionality working, SQL injection risk is handled using
the database wrapper function buildLike()
Bug: T185685
Change-Id: I1f630a83f01a1c93f29643e4fc8baf55391b758d
* Add "mw-blank" as another tag value that erases all lint errors
for a page as a blank page cannot have any lint errors.
Bug: T280193
Change-Id: Iaad8ce75950588b2676de5dfb5f5221d64231f0e
* Determines if new content type is not wikitext and if so
deletes all existing lint error records for that pageID.
Bug: T298343
Change-Id: I20fac9a0c901f3e7a5cc898566a4487fbe70798f
Use WebRequest::getText, which discards array values since
the `pagename` param requires string value.
Bug: T301360
Change-Id: Ifc23a7020e410603ac86b71d7aee2d28f06b9ab2
* Adds template and tag name fields to enable better search
capability to the Linter extension
Bug: T175177
Change-Id: Iac22d99109bb1253f450a64254f50677e3cdefeb
* This patch contains the code needed to update the database with
the addition Linter table column linter_namespace <int/null>
and add an index with category and namespace for faster
selection of linter records.
Bug: T299612
Change-Id: I05da381b9a74294b44d4aef968614277d601a176
Html::openElement checks for this and logs a warning.
Html::closeElement does not, and this makes it return
</span class="error">.
Change-Id: I31c2809d2fd5421606fa877021d1636ac0eb4d26
The article id of the title is set to 0 when the page is deleted so,
although the lint job from the hook runs, it doesn't remove anything.
Reverts most of I06b821b65f65609ddac8ed4e7c662336082d8266
Bug: T298782
Bug: T170313
Change-Id: I2610b9b16d4032b0e18b3537cc9ed51bfdaff299
* Added error message when namespace and/or pagename is not
found or malformed which now displays a bold red message.
Bug: T151362
Change-Id: I70fde2676ded6a112f7f2b07f94f6f4b616f0e39
The global function wfWikiID() is deprecated since 1.35 and it's usages
should be replaced with WikiMap::getCurrentWikiId().
Bug: T298059
Change-Id: I695c20bff266f869f740baf7f3e335b357546fb4
Currently we select 20 rows, and return the accurate count if it's less
than that, so up to 19 rows. Since we want to return an accurate count
if it's 20 rows or less, select one more row, 21, so we can differentiate
between only having 20 result rows or hitting the limit. This is the same
technique used in MediaWiki's Pager system.
Change-Id: I50fa96238eb4c7178414ee92c53799fd69520926
* The code now produces an accurate count if the number of
errors for a category is below the threshold set by a
public constant MAX_ACCURATE_COUNT (currently 20).
The database record count limit was originally set to 1,
to determine accurately, if there were actually 0 errors
in a category as the estimate code would never report 0.
If not 0, it would use the estimated count which does not
produce an accurate count for any other number of errors.
For low error counts this is annoying to editors and
unnecessary. The additional CPU/disk activity to accurately
check for low error counts is not significantly more than
checking for 0 or 1, as checking for 0 likely requires
a complete table scan which is probably expensive compared
to a low count that early outs when it hits to record limit.
* An improvement to consider is recording the accurate count in
a separate tiny table, and maintaining an accurate count there
which is used in preference to doing the select with row limit
based on say a 30 second TTL, to prevent a stampede of requests
from doing extraneous database operations.
* Added unit test coverage for accurately counting low error
conditions that are lower than the threshold and also verify
that the estimate is inaccurate beyond the error count
threshold.
Bug: T194872
Change-Id: I4f74cfe3bf9601baa0dc8fa6464a68030ac2bc4b
Adding a new parameter 'lnttitle'.
The SQL queries should work the same as the ones when using the
existing parameter 'lntnamespace'.
Bug: T254930
Change-Id: Ic34617e2f56d1055388ea6e8a93ff641f0342240
The following sniffs are failing and were disabled:
* MediaWiki.Commenting.FunctionComment.MissingDocumentationPrivate
Additional changes:
* Also sorted "composer fix" command to run phpcbf last.
Change-Id: Icdd0d0e60dd543921a5757162548ae149c3316ea
This eases deployment dependencies by allowing Parsoid to supply an
appropriate database category ID so that new lint categories can be
appropriately stored during the interval between adding a new lint
category to Parsoid and deploying an Extension:Linter patch to
describe it.
Change-Id: Ib7b2342168fa53ca2abac7d5f54fe313be341eb7
We will crash trying to set `templateInfo` if it is present and `params`
is not.
On line 49 of RecordLintJob we're going to use `$errorInfo['params']`,
which will crash if `params` is not present.
Change-Id: I505c676cc0ccd8d54e44e65b04b10c2de03ee37c
If `"has-name": false` is set, then CategoryManager should interpret that
as false, and not treat the existence of the key as making it true.
Same with "parser-migration".
Change-Id: I6b8d5b23fb33707b7596d9172da26fd84d652da8
Originally we were dropping excess events inside the job queue, but that means
all of the events need to be passed into the job queue...which can cause
problems.
So drop them in the API module. The only other place we construct
RecordLintJob is when an article has been deleted, and those jobs have
no errors since they're all being deleted.
Bug: T202179
Change-Id: I61940280e0dfb99398d9f047d0e66007d91a0241
On large wikis with lots of lint errors, counting the entire table can
be problematic from a performance perspective, sometimes taking minutes.
Instead, use Database::estimateRowCount(), which uses EXPLAIN SELECT
COUNT(*) to get an approximate value for the number of rows. We do make
sure that if the category actually has no rows, that it will return 0.
This should be considered a temporary solution, and we should look into
doing something like the SiteStats incremental updates table in the long
run.
Bug: T184280
Change-Id: I2d4dcc615477fd60e41dfed4a3d1a3ad52a9f4af
This reduces the number of different files that need to be modified
in order to add a new linter category.
Change-Id: Id095317d6d761c57e2ce632d34ebd962bf85e785
* multiple-unclosed-formatting-tags: this is a subset of the unclosed
formatting tag lint, but is higher priority because unclosed tags
like <small> and <big> compound their effects
* unclosed-quotes-in-heading: unclosed wikitext i/b tag with a heading
ancestor (this causes breakage which leaks out of the table of
content to affect the rest of the page)
* multiline-html-table-in-list: html table with newline breaks nested
in a list
* misc-tidy-replacement-issues: this is a catchall category for
infrequent long-tail issues, used to enable speedy deployment of new
linter categories during tidy replacement as wikis get RemexHtml
enabled. These will have a subtype property to identify the
specific issue.
Change-Id: Ic2c965132f7a09679574489865bdc81df9e43845
Linter calls the deprecated ApiBase::dieUsage() function in
ApiRecordLint.php, replacing it with dieWithError().
Bug: T181758
Change-Id: I0cbf784f591b86b206b032fccbc0e32564a3e9e8
The pageid parameter limits lint errors to specific pages. Users can get
detailed lint error information of a certain page.
Bug: T181303
Change-Id: I164449254649caff29fcffa3bc7c923c20b8e837
If the user does not have permission to edit the page, still show a
"view source" and history link so they are able to figure out what the
error is.
Bug: T177289
Change-Id: I049d27d37073e452dc0c11128dab5204d110d81f
If a newer version of MediaWiki gets rolled back, it's possible for
there to be lint entries in the database that don't exist according to
CategoryManager.
Instead of showing an error to the user, just silently hide those rows.
All callers to CategoryManager::getCategoryId() already check the
category exists. The callers for CategoryManager::getCategoryName() will
catch the MissingCategoryException, and log it if necessary. Notably
LinterError::makeLintError() will return false on invalid rows, and all
callers have been updated to handle that.
Bug: T179423
Change-Id: Ia5f56f18a51fa871511b02410222a6079efbfff6
* <font> tags with color attribute that wrap links and images
will have different behavior and hence rendering compared to Tidy.
Change-Id: I7a551ef9b7e8f57d7a43c823832f0e3add6b1367
Page deletions were bypassing the logic in RecordLintJob that
ensured the right category totals cache was cleared and the
statsd updates. Fix that by just using RecordLintJob directly.
Bug: T170313
Change-Id: I06b821b65f65609ddac8ed4e7c662336082d8266
CategoryManager is a member variable. Also explicitly define the
haveParserMigrationExt member variable too.
Change-Id: I2aa77e5c1819a80ba18f67c0f65b2d47dbaa0303
* Since the high priority categories are being used to effect
Tidy migration, we need editors to be able to compare their fixes
and ensure that the pages look identical after the fix.
* This patch adds a new property for linter categories and displays
the requested edit link in the pager.
* This patch enables the property for high priority categories.
Change-Id: Ia9b23d79da686f0a6c0203e2dba58a876a4a3d4a
The following sniffs are failing and were disabled:
* MediaWiki.FunctionComment.Missing.Protected
* MediaWiki.FunctionComment.Missing.Public
Change-Id: I96e32df48d13040893bfd1be6d90d0db4f7c7d0a
This module just exposes the number of lint errors in each category for
tools that want to collect statistics and stuff. It's currently under a
'totals' key to give us more flexibility if we want to add other
information in the future.
Change-Id: Iad5136a6a5989ce5bcb1a00a4f82f49a397e0170
The query itself is too expensive to be run on large Wikimedia wikis. So
put it behind WAN cache and touch the check keys for each category
whenever those have errors added or deleted from them.
If this happens to get out of sync, it will get fully refreshed
regularly when the totals are sent to statsd.
WANObjectCache's 'lockTSE' feature will help avoid cache stampedes that
made this query expensive in the past.
Change-Id: I3774103a29fa0f29d36283950f136259fa71bffe
The messaging is a bit tricky. In some cases, the flag is set
because the output involves a single template and some top-level
content -- not multiple templates always.
Ex: '{{1x|<div>}}foo</span>'
In this case, the entire content is considered template-affected
because of DOM structuring reasons. This wikitext will generate
lint errors with multi-part-template-block template info.
In production usage, however, the probability is quite high that
multiple templates will be involved. This is usually because of
tables that are constructed with multiple templates, for example.
So, if we want to fudge and hand-wave a bit, the i18n message can
say that the output came from multiple templates.
Bug: T162920
Change-Id: I35cee6787800b03724856775fdf53991ae8e8125
Displaying categories by priority provides editors with better
guidance about what to spend time on. The Linter help page provides
more information about why the specific priorities have been chosen.
Change-Id: If6f28570189e24a67b4380f666f4cd64a2296989
Register a hook for when VE is finished initializing to select the error
section, just like the textbox-based editor.
Use the BeforePageDisplay hook so it runs on VE page loads too.
Bug: T160102
Change-Id: I59a7e0a3e8be32e4689cbf41c4904970902c4dff
API meta=siteinfo is a frequently called API query
and the category totals database query is not
sufficiently fast enough to be part of this by default.
This reverts commit 45b4bf6382.
Change-Id: Ia5bf97855d48955ce53c3679c4d138fabafae8b7
This way we can track the progress of individual wikis in cleaning up
errors. The wiki name is at the end of the key so we can still use
e.g. "linter.category.$name.*" to see across all wikis at once.
Change-Id: I62463b9256e125d32d97396bd939334d71b46027
Instead of just including the error category names, include the number
of errors each category has, to make it easier to collect aggregate
stats.
This changes the structure from an array to an object, in JSON, but I'm
not aware of any clients using these specific fields yet.
Change-Id: Iaf942b923a0f4047721055ad9cb48aacc5aa6784
Categories can now have a severity level of "error" or "warning"
designated, which places them in a different heading on
Special:LintErrors.
Bug: T152822
Change-Id: I1276b9502d90765e88dcb8ea78569dee910c5d88
The most likely scenario of duplicate key errors is that it's the exact
same lint error and there's just a race condition when calculating which
new errors need to be inserted, so just ignore them.
Follows-up 419610bcdb.
Change-Id: I84749ab221bbd517b474be8875bb6a59e4f3258e
These tests insert variations of fake lint errors into the database, and
then read out of the database to check they round-trip properly.
And while we're at it, improve the setForPage() return value.
These tests can be run with something like:
php tests/phpunit/phpunit.php extensions/Linter/tests/phpunit/
Change-Id: Ifdba8a8a104d218a822f909bc5d7b3512aca499d
Move location to two separate columns in the database: linter_start and
linter_end. This allows us to have the database enforce the uniqueness
of those fields, instead of just relying upon the PHP code to do so,
which could be bypassed since we have multiple servers and concurrent
processes.
Change-Id: I3e67ce1b7cb3c93866a388ec3248af4cff2a81e0
If no errors existed for the page, inserting new ones would fail with a
database error since $errors was key-ed by unique id, which the database
wrapper interpreted as field names, causing issues. Using array_values()
gets rid of the keys, fixing the issue.
Change-Id: I7645de4e5d3ac4462d7980374c8ef8be6280442b
It's possible to have duplicate, identical lint errors if the same exact
error is repeated in a template transclusion (e.g. {{1x|<b/> <b/>}})
since the position via dsr is the same. In this case, just de-duplicate
the errors since we can't differentiate them.
At the same time, trim excessive errors on the same page in the same
category. It's most likely that if a page has that many of the same
errors, the editor or bot will just fix all of them at the same time, so
we don't need to include all of them in the database. 20 is kind of a
low value, but we can always increase it later on as necessary.
Change-Id: I9cded720169870d0eea574e1a930ce4e9b190ac0
* Treat it as an array in all cases
* Make the default value all categories
* Rename from category to categories
Bug: T151288
Change-Id: I5c0c341112894c5a7ec3aaebb6ac9085353f55bd
Since this module is internal, it doesn't really matter, but it cleans
up the output on api.php?modules=record-lint.
Bug: T151285
Change-Id: I859a2780d6ed1918cc81101e9e2c2cd348a2390a
Test plan:
Type "Special:LintErrors/" into the search box, and see that the
autocomplete dropdowns include the subpages for individual error
categories.
Bug: T151289
Change-Id: I919e0e51a3b956f275f9a372b2f2844002972ea7
Instead of having a complex auto-increment category manager that had
additional caching, just hardcode the (currently 6) ids for each
category, which allows us to simplify a lot of code.
If Parsoid sends a lint error in a category that we don't know about, it
is silently dropped.
Bug: T151287
Change-Id: Ice6edf1b7985390aa0c1c410d357bc565bb69108
The job queue will allow us to have better flood control and rate
limiting instead of trying to do all the database writes as soon as
parsoid contacts MediaWiki.
On the downside, this means it may take longer for changes to be
reflected in the database and to users, but we already have no promise
for that, so it seems okay.
Note that if you don't have a job queue runner set up, you'll need to
run the runJobs.php script every time to have the jobs execute.
Change-Id: I25fd54734aca4dab09711e7f6aee027654931300
linter_cat is now an int ID that points to a row in the new
lint_categories table.
The mapping between category names and ids is all handled in PHP, and
cached in APC.
Note that you will need to drop the `linter` table manually and re-run
update.php for this to take effect.
Change-Id: I369d9b4d8d08289b4a20d1cd29a2e327bad28ef8