Commit graph

1454 commits

Author SHA1 Message Date
Daimona Eaytoy b2dc2c4dd8 Refactor ParserStatus
ParserStatus is now more lightweight, and doesn't know about "result"
and "from cache". Instead, it has an isValid() method which is merely a
shorthand for checking whether getException() is null.

Introduce a child class, RuleCheckerStatus, which knows about result and
cache and can be (un)serialized.

This removes the ambiguity of the $result field, and helps the
transition to a new RuleChecker class.

Change-Id: I0dac7ab4febbfdabe72596631db630411d967ab5
2021-09-17 11:25:54 +00:00
Daimona Eaytoy 742cc865ad Bump EditStashCache version
I0a30e044877c6c858af3ff73f819d5ec7c4cc769 added a new param to
ParserStatus.

Bug: T291123
Change-Id: Ie82d01d85a189081b45a1d34a0f5390536163ee4
2021-09-15 21:17:16 +02:00
Daimona Eaytoy 7c26c4b8d5 More cleanup for parser-related classes
Change-Id: I6a2bbf519e1d5c6fe2778f69624bd80b9ea1ef86
2021-09-10 12:50:20 +00:00
Daimona Eaytoy a722dfe1a4 Rename ParserFactory -> RuleCheckerFactory
The old parser now has the correct name "Evaluator", so the
ParserFactory name was outdated. Additionally, the plan is to create a
new RuleChecker class, acting as a facade for the different
parsing-related stages (lexer, parser, evaluator, etc.), which is what
most if not all callers should use. The RuleCheckerFactory still returns
a FilterEvaluator for now.
Also, "Parser" is a specific term defining *how* things happen
internally, whereas "RuleChecker" describes *what* callers should expect
from the new class.

Change-Id: I25b47a162d933c1e385175aae715ca38872b1442
2021-09-08 21:59:34 +02:00
Daimona Eaytoy 357ddd498c Clean up / simplify parser-related classes
Remove unnecessary setters, injecting everything in the constructor.
These were leftovers from before the introduction of ParserFactory.
Remove public access to the conds used, include the information inside
the returned ParserStatus instead, and consequently simplify callers.

Change-Id: I0a30e044877c6c858af3ff73f819d5ec7c4cc769
2021-09-08 13:41:52 +02:00
Daimona Eaytoy f8e9ac7e2a Rename AbuseFilterCachingParser -> FilterEvaluator
It's an evaluator, not a parser.

Change-Id: Ib6d33e8423ea72709cf5a33f4397ba33e352ea80
2021-09-08 13:40:47 +02:00
libraryupgrader 2a4860e322 build: Updating mediawiki/mediawiki-phan-config to 0.11.0
Change-Id: I097d051e3c30e61d74a8e329b6110b219c72ec1a
2021-09-07 19:30:42 -07:00
Daimona Eaytoy 6684ea6450 Remove AFPTransitionBase
Also cleanup the mPos hack in the CachingParser.

Change-Id: Ib5693802a3ceb80cb736880ed65e27340abef689
2021-09-06 19:33:48 +00:00
jenkins-bot 199cf1edf8 Merge "Add a static analyzer for the filter language" 2021-09-03 19:51:58 +00:00
Matěj Suchánek 0af21948fc Replace WikiPage::factory in non-test code
Change-Id: I1442ca6603ce5151b98fc88cd84c25af0f34e4f6
2021-09-01 04:55:25 +00:00
Daimona Eaytoy 86257d825c tests: Use DBConnRef, not IDatabase, as retval of getConnectionRef
So that the method can be typehinted in core.

Also add phan-var to fix broken master build due to typehint additions
in core.

Change-Id: I4a072e00ffeeb437753fc3d3c1f15de9929df510
2021-08-31 21:45:10 +02:00
Sorawee Porncharoenwase 320e3d696f Add a static analyzer for the filter language
This commit adds a class AFPSyntaxChecker which can statically analyze
a filter code to detect the following errors:

- unbound variables (which comes in two modes: conservative and liberal,
  default to conservative)
- unused variables (disabled by default for compatibilty)
- assignment on built-in identifiers
- function application's arity mismatch
- function application's invalid function name
- non-string literal in the first argument of set / set_var

The existing parser and evaluator are modified as follows:

- The new (caching) evaluator no longer needs to perform variable
  hoisting at runtime.
  - Note that for array assignment, this changes the semantics.
- The new parser is more lenient, reducing parsing errors.
  The static analyzer will catch these errors instead, allowing us
  to give a much better error message and reduces the complexity of
  the parser.
  * The parser now allows function name to be any identifier.
  * The parser now allows arity mismatch to occur.
  * The parser now allows the first argument of set to be any expression.

Concretely, obvious changes that users will see are:

1. a := [1]; false & (a[] := 2); a[0] === 1

   would evaluate to true, while it used to evaluate to the undefined value
   due to hoisting

2. f(1)

   will now error with 'f is not a valid function' as opposed to
   'Unexpected "T_BRACE"'

3. length

   will now error with 'Illegal use of built-in identifier "length"'
   as opposed to 'Expected a ('

Appendix: conservative and liberal mode

The conservative mode is completely compatible with the current evaluator.
That is,

false & (a := 1); a

will not deem `a` as unbound, though this is actually undesirable because
`a` would then be bound to the troublesome undefined value.

The liberal mode rejects the above pattern by deeming `a` as unbound.
However, it also rejects

true & (a := 1); a

even though (a := 1) is always executed. Since there are several filters
in Wikimedia projects that rely on this behavior, we default the mode
to conservative for now.

Note that even the liberal mode doesn't really respect lexical scope
appeared in some other programming languages (see also T234690).
For instance:

(if true then (a := 1) else (a := 2) end); a

would be accepted by the liberal checker, even though under lexical scope,
`a` would be unbound. However, it is unlikely that lexical scope
will be suitable for the filter language, as most filters in
Wikimedia projects that have user-defined variable do violate lexical scope.

Bug: T260903
Bug: T238709
Bug: T237610
Bug: T234690
Bug: T231536
Change-Id: Ic6d030503e554933f8d220c6f87b680505918ae2
2021-08-31 03:28:24 +02:00
Daimona Eaytoy 704364a5e7 Move parser exceptions to specific namespace and rename them
Create a dedicated "Exception" sub-namespace and remove the "AFP"
prefix, a leftover from the pre-namespace era.

Change-Id: I7e5fded9316d8b7d1628bc1a6ba8b1879ac901e1
2021-08-29 23:38:31 +00:00
Matěj Suchánek 3630bb0a3f Use array_fill_keys() instead of array_flip() if that reflects the developer's intention
Do what Tim Starling did in core: If8d340a8bc816a15afec37e64f00106ae45e10ed.

Change-Id: Ic68e167e51ff8d289a0dab68874191b9b1a20665
2021-08-24 01:08:13 +00:00
jenkins-bot 9b93b0256a Merge "Avoid passing invalid offset to mb_strpos" 2021-08-18 18:45:12 +00:00
Daimona Eaytoy e9795468c4 Switch filterable actions hooks to the new system
Bug: T261067
Bug: T211680
Change-Id: I0e7e4a48b56c3e5fde56f50693fd0cdc19c30dd0
2021-08-16 14:18:56 +00:00
Alexander Vorwerk 8e7d389029 Disallow interwiki on Special:AbuseLog
Bug: T288155
Depends-On: Ic00f4a0f27747b5ff0893b4c01f42f68a99771ab
Change-Id: I62574460bfaea04af2f617ca0929246c784cb4e8
2021-08-05 11:15:39 +02:00
jenkins-bot ca31a12be4 Merge "Clean up Throttle::throttleIdentifier" 2021-07-30 01:37:24 +00:00
Matěj Suchánek 83794d7cb4 Clean up Throttle::throttleIdentifier
In 1.37, UserEditTracker was changed to allow anonymous users
as well.

Change-Id: I70d9e6db13416b7c017319ecac3e7e604aacd586
2021-07-22 16:56:12 +02:00
Lucas Werkmeister a2e42d5050 Don’t generate current content text twice
Previously, for non-newly-created pages, AbuseFilter would get the text
for filtering twice: once in AbuseFilterHooks::filterEdit(), and then
again in RunVariableGenerator::getEditTextForFiltering(). (Plus another
call for the text of the previous revision.) The first copy of the text
is only passed into RunVariableGenerator::getEditVars(), and there only
used if the title doesn’t exist, otherwise it’s overwritten with the
second copy. Instead, let’s make AbuseFilterHooks not get the text at
all, and only get the text from the content when we actually need it
(the content is new).

Change-Id: Id12430fa6ba4643113b945e0d0c01b9c0ee1742f
2021-07-22 13:45:32 +02:00
libraryupgrader 5377ebe819 build: Updating dependencies
composer:
* mediawiki/mediawiki-codesniffer: 36.0.0 → 37.0.0

npm:
* postcss: 7.0.35 → 7.0.36
  * https://npmjs.com/advisories/1693 (CVE-2021-23368)

Change-Id: I2b382f3bb236fb44eb24c6a257b13b8fd886541c
2021-07-21 18:51:18 +00:00
jenkins-bot edaf650151 Merge "Revert "Replace depricating method IContextSource::getWikiPage to WikiPageFactory usage"" 2021-07-04 06:17:09 +00:00
DannyS712 3f4430473e Revert "Replace depricating method IContextSource::getWikiPage to WikiPageFactory usage"
This reverts commit 15fc159cb1.

Reason for revert: this is breaking the addition of rev ids to filter
hits after edits are saved. I suspect this is because the context wikipage
is for a different title than the one being edited, though I'm not sure
way - regardless, testing on patchdemo shows that with this revert
is applied, rev ids are once again added to filter hits.

Bug: T286140
Change-Id: I3ab6324a73050154cef1c20a2bf8307eb11eea2d
2021-07-04 05:54:30 +00:00
jenkins-bot db15b13396 Merge "SECURITY: Avoid database for MediaWiki:Abusefilter-blocker fallback" 2021-07-02 15:46:53 +00:00
Daimona Eaytoy 069fa064f5 Avoid passing invalid offset to mb_strpos
Bug: T285978
Change-Id: I3d100fd05f34fe3b01ecbbce5361badc613f9406
2021-07-02 14:07:46 +00:00
jenkins-bot 39dfd40abc Merge "ViewImport/ViewList: Use setTitle instead of addHiddenField/setAction" 2021-07-02 12:27:55 +00:00
Kosta Harlan 833aa70f10 ViewImport/ViewList: Use setTitle instead of addHiddenField/setAction
Bug: T285464
Change-Id: I3845f3261373d2aa3318ab39d125210f64f65447
2021-07-02 13:18:01 +02:00
DannyS712 71bf9faf49 SECURITY: Avoid database for MediaWiki:Abusefilter-blocker fallback
If the content language is English and the message is invalid as
a username, or the content language is not English and both the
content language version and the English version are invalid, the
user in FilterUser would not be created - now, avoid the onwiki
version of the English message in the fallback, so it could only
be invalid if the default in the i18n files was invalid.

Bug: T284364
Change-Id: I9e9f44b7663e810de70fb9ac7f6760f83dd4895b
2021-07-01 17:35:54 -05:00
jenkins-bot 2deac909ad Merge "Pass a user to WikiPage::prepareContentForEdit()" 2021-06-28 22:50:38 +00:00
Roman Stolar 15fc159cb1 Replace depricating method IContextSource::getWikiPage to WikiPageFactory usage
Bug: T275710
Change-Id: I7fe24059e9909352e95aaa82fb48688f9260b207
2021-06-28 16:12:48 +03:00
jenkins-bot 97f805b67c Merge "Bump MW requirement to 1.37" 2021-06-26 14:19:01 +00:00
jenkins-bot eb24f02c25 Merge "Handle EditFilterMergedContent hook properly to break hook chains and display error message" 2021-06-26 12:21:55 +00:00
Daimona Eaytoy e56dcc7cb1 Bump MW requirement to 1.37
The master version of the extension is only meant to support the most
recent version of MediaWiki.

Change-Id: I33612e69fc37bf5eb70133c8f0e95199dd7fcb65
2021-06-26 14:18:43 +02:00
DannyS712 47f861b6f6 Pass a user to WikiPage::prepareContentForEdit()
Bug: T285447
Change-Id: I4d277419106c3af5222377a863c80dd866ba188b
2021-06-24 04:01:33 +00:00
jenkins-bot 4dd9644bf6 Merge "Make phan not complain about Throttle::throttleIdentifier" 2021-06-22 12:07:22 +00:00
Matěj Suchánek d7ec0b992c Make phan not complain about Throttle::throttleIdentifier
UserEditTracker::getUserEditCount now allows anonymous users,
but it returns null and phan is aware of this. Suppress this
warning until at least 1.37 is required.

Change-Id: I9962abe08fa31d55421d8bdda23ea0a1c0471a86
2021-06-22 11:37:58 +02:00
jenkins-bot 52024847e8 Merge "Pass a valid regexp to preg_match in checkRegexMatchesEmpty" 2021-06-04 09:11:50 +00:00
jenkins-bot 997e665530 Merge "Don't use p class="success" for success messages" 2021-06-04 08:59:58 +00:00
Daimona Eaytoy 57f11631ba Pass a valid regexp to preg_match in checkRegexMatchesEmpty
Bug: T283966
Change-Id: I99688aa8f3e62e410392a9142df56b1a3c708987
2021-05-29 11:38:07 +00:00
Umherirrender 360d41c8ec Replace uses of DB_MASTER with DB_PRIMARY
Change-Id: I60719654b2062bbe52d2eadef8b942cea477e522
2021-05-13 01:43:37 +02:00
Tim Starling 2c939e28a9 Move onUserMergeAccountFields to its own file
Sharing a handler class with UserRenameHandler means that attempting to
merge users fails due to a missing interface if AbuseFilter and MergeUser
are installed but Renameuser is not installed.

Change-Id: I1244ab1c446840ff2648248f943d7fc784b889a7
2021-05-06 11:33:24 +10:00
libraryupgrader 06cdddc9d0 build: Updating composer dependencies
* mediawiki/mediawiki-codesniffer: 35.0.0 → 36.0.0
* php-parallel-lint/php-parallel-lint: 1.2.0 → 1.3.0

Change-Id: I92d6f6d6f817765df24f845103a489624f4290f2
2021-05-02 06:41:54 +00:00
Umherirrender 1fa7a83f60 Use static closures where safe to use
Created by I25a17fb22b6b669e817317a0f45051ae9c608208

Change-Id: I533690311ca559685de8a4bf123348c9bcfa5931
2021-04-30 20:55:35 +02:00
mainframe98 a32d483ef4 Don't use p class="success" for success messages
These are part of legacy styles and aren't provided by all skins.
Using Html::successbox abstracts the classes away.
Internally that uses div class="successbox" instead.

Bug: T280766
Change-Id: I0cca59e2f391510095c2c6fb187ace5e91fdde8b
2021-04-30 18:19:31 +00:00
Ammarpad 6a799ec9c5 Check forcing of page_timestamp revision index
Bug: T270033
Change-Id: I16fc273b14e7f4b00e8c31ec1ed7712149aafe37
2021-04-30 13:06:43 +01:00
Daimona Eaytoy c091a2f749 Fix MySQL db patches compatibility
Follow-up I574bda15f0f5c92a7d97a6e3150981b8f97ee7fc
Apologies for not noticing before:

If somebody hadn't already added the afl_filter_id column, the
rename-indexes patch would try to rename a non-existing index
(filter_timestamp_full and fail). So put rename-indexes after the other
patch.
Then, for the afl_filter_id patch, check the column and not the index.
We were checking the index because it's the last thing that the DB patch
does (so if the index is found, we can be certain that the patch was
fully applied). However, now that renaming the index happens afterwards,
if somebody had already added afl_filter_id (with the old index name),
running the updater would try adding it again, because the new index
name isn't found (as it's renamed later).

Change-Id: I0250a7c187202facd932c160ace57930db510f64
2021-04-25 11:28:35 +02:00
jenkins-bot 4e7e2f6c64 Merge "Give MySQL indexes explicit names, align MySQL and SQLite" 2021-04-25 08:50:08 +00:00
Func 351f9f02bc Handle EditFilterMergedContent hook properly to break hook chains and display error message
Extensions are supposed to return false to break hook chains when failed, which can avoid unnecessary call of later handlers in other extensions and work around with problems caused by difference betwen multiple triggers.

On mediawiki version 1.36 and before, just returning false in this hook can't display error message by default.
Set $status->value manually still to provide backward compatibility.

Bug: T280312
Change-Id: I78888247063c726ebcd18ba54a21d6c7891481fc
2021-04-24 02:02:01 +00:00
jenkins-bot ffe3b0cbc4 Merge "Clean up AbuseFilterViewHistory and AbuseFilterHistoryPager" 2021-04-19 14:37:00 +00:00
jenkins-bot ec804600c6 Merge "Stop using legacy ActorMigration fields" 2021-04-19 14:36:58 +00:00