ParserStatus is now more lightweight, and doesn't know about "result"
and "from cache". Instead, it has an isValid() method which is merely a
shorthand for checking whether getException() is null.
Introduce a child class, RuleCheckerStatus, which knows about result and
cache and can be (un)serialized.
This removes the ambiguity of the $result field, and helps the
transition to a new RuleChecker class.
Change-Id: I0dac7ab4febbfdabe72596631db630411d967ab5
The old parser now has the correct name "Evaluator", so the
ParserFactory name was outdated. Additionally, the plan is to create a
new RuleChecker class, acting as a facade for the different
parsing-related stages (lexer, parser, evaluator, etc.), which is what
most if not all callers should use. The RuleCheckerFactory still returns
a FilterEvaluator for now.
Also, "Parser" is a specific term defining *how* things happen
internally, whereas "RuleChecker" describes *what* callers should expect
from the new class.
Change-Id: I25b47a162d933c1e385175aae715ca38872b1442
Remove unnecessary setters, injecting everything in the constructor.
These were leftovers from before the introduction of ParserFactory.
Remove public access to the conds used, include the information inside
the returned ParserStatus instead, and consequently simplify callers.
Change-Id: I0a30e044877c6c858af3ff73f819d5ec7c4cc769
So that the method can be typehinted in core.
Also add phan-var to fix broken master build due to typehint additions
in core.
Change-Id: I4a072e00ffeeb437753fc3d3c1f15de9929df510
This commit adds a class AFPSyntaxChecker which can statically analyze
a filter code to detect the following errors:
- unbound variables (which comes in two modes: conservative and liberal,
default to conservative)
- unused variables (disabled by default for compatibilty)
- assignment on built-in identifiers
- function application's arity mismatch
- function application's invalid function name
- non-string literal in the first argument of set / set_var
The existing parser and evaluator are modified as follows:
- The new (caching) evaluator no longer needs to perform variable
hoisting at runtime.
- Note that for array assignment, this changes the semantics.
- The new parser is more lenient, reducing parsing errors.
The static analyzer will catch these errors instead, allowing us
to give a much better error message and reduces the complexity of
the parser.
* The parser now allows function name to be any identifier.
* The parser now allows arity mismatch to occur.
* The parser now allows the first argument of set to be any expression.
Concretely, obvious changes that users will see are:
1. a := [1]; false & (a[] := 2); a[0] === 1
would evaluate to true, while it used to evaluate to the undefined value
due to hoisting
2. f(1)
will now error with 'f is not a valid function' as opposed to
'Unexpected "T_BRACE"'
3. length
will now error with 'Illegal use of built-in identifier "length"'
as opposed to 'Expected a ('
Appendix: conservative and liberal mode
The conservative mode is completely compatible with the current evaluator.
That is,
false & (a := 1); a
will not deem `a` as unbound, though this is actually undesirable because
`a` would then be bound to the troublesome undefined value.
The liberal mode rejects the above pattern by deeming `a` as unbound.
However, it also rejects
true & (a := 1); a
even though (a := 1) is always executed. Since there are several filters
in Wikimedia projects that rely on this behavior, we default the mode
to conservative for now.
Note that even the liberal mode doesn't really respect lexical scope
appeared in some other programming languages (see also T234690).
For instance:
(if true then (a := 1) else (a := 2) end); a
would be accepted by the liberal checker, even though under lexical scope,
`a` would be unbound. However, it is unlikely that lexical scope
will be suitable for the filter language, as most filters in
Wikimedia projects that have user-defined variable do violate lexical scope.
Bug: T260903
Bug: T238709
Bug: T237610
Bug: T234690
Bug: T231536
Change-Id: Ic6d030503e554933f8d220c6f87b680505918ae2
Create a dedicated "Exception" sub-namespace and remove the "AFP"
prefix, a leftover from the pre-namespace era.
Change-Id: I7e5fded9316d8b7d1628bc1a6ba8b1879ac901e1
Previously, for non-newly-created pages, AbuseFilter would get the text
for filtering twice: once in AbuseFilterHooks::filterEdit(), and then
again in RunVariableGenerator::getEditTextForFiltering(). (Plus another
call for the text of the previous revision.) The first copy of the text
is only passed into RunVariableGenerator::getEditVars(), and there only
used if the title doesn’t exist, otherwise it’s overwritten with the
second copy. Instead, let’s make AbuseFilterHooks not get the text at
all, and only get the text from the content when we actually need it
(the content is new).
Change-Id: Id12430fa6ba4643113b945e0d0c01b9c0ee1742f
This reverts commit 15fc159cb1.
Reason for revert: this is breaking the addition of rev ids to filter
hits after edits are saved. I suspect this is because the context wikipage
is for a different title than the one being edited, though I'm not sure
way - regardless, testing on patchdemo shows that with this revert
is applied, rev ids are once again added to filter hits.
Bug: T286140
Change-Id: I3ab6324a73050154cef1c20a2bf8307eb11eea2d
If the content language is English and the message is invalid as
a username, or the content language is not English and both the
content language version and the English version are invalid, the
user in FilterUser would not be created - now, avoid the onwiki
version of the English message in the fallback, so it could only
be invalid if the default in the i18n files was invalid.
Bug: T284364
Change-Id: I9e9f44b7663e810de70fb9ac7f6760f83dd4895b
The master version of the extension is only meant to support the most
recent version of MediaWiki.
Change-Id: I33612e69fc37bf5eb70133c8f0e95199dd7fcb65
UserEditTracker::getUserEditCount now allows anonymous users,
but it returns null and phan is aware of this. Suppress this
warning until at least 1.37 is required.
Change-Id: I9962abe08fa31d55421d8bdda23ea0a1c0471a86
Sharing a handler class with UserRenameHandler means that attempting to
merge users fails due to a missing interface if AbuseFilter and MergeUser
are installed but Renameuser is not installed.
Change-Id: I1244ab1c446840ff2648248f943d7fc784b889a7
These are part of legacy styles and aren't provided by all skins.
Using Html::successbox abstracts the classes away.
Internally that uses div class="successbox" instead.
Bug: T280766
Change-Id: I0cca59e2f391510095c2c6fb187ace5e91fdde8b
Follow-up I574bda15f0f5c92a7d97a6e3150981b8f97ee7fc
Apologies for not noticing before:
If somebody hadn't already added the afl_filter_id column, the
rename-indexes patch would try to rename a non-existing index
(filter_timestamp_full and fail). So put rename-indexes after the other
patch.
Then, for the afl_filter_id patch, check the column and not the index.
We were checking the index because it's the last thing that the DB patch
does (so if the index is found, we can be certain that the patch was
fully applied). However, now that renaming the index happens afterwards,
if somebody had already added afl_filter_id (with the old index name),
running the updater would try adding it again, because the new index
name isn't found (as it's renamed later).
Change-Id: I0250a7c187202facd932c160ace57930db510f64
Extensions are supposed to return false to break hook chains when failed, which can avoid unnecessary call of later handlers in other extensions and work around with problems caused by difference betwen multiple triggers.
On mediawiki version 1.36 and before, just returning false in this hook can't display error message by default.
Set $status->value manually still to provide backward compatibility.
Bug: T280312
Change-Id: I78888247063c726ebcd18ba54a21d6c7891481fc