This will decouple a bit the huge and chaotic tangle of AF classes. Some
boilerplate code for AbuseFilter services is also added with this patch.
Note that this requires injecting a KeywordsManager in
AbuseFilterVariableHolder, or unit tests would fail. This is still
incomplete, and the Manager is only injected in tests, because
VariableHolder still has to be refactored.
The test for the UpdateVarDumps script had to be updated, because
serializing VHs in there was a bad choice. As pointed out in a comment,
the test is likely going to break again once we remove the BC code, but
I hope that we'll be able to remove the test at that point.
Change-Id: I12a656a310adb8c5f75cab63f6db9e121e109717
These methods had no reals reason to be static and belong to the
AbuseFilter class. Most of them were moved to Parser class as common
variations of the existing entry points. One was specific to the
EvalExpression API module and was moved there.
This change comes at no cost, and will make it possible to inject a
parser where needed.
Change-Id: Ifd169cfc99df8a5eb4ca94ac330f301ca28a2442
Also fix the test for _first_contributor vars to cover all variables,
use builtin methods to compute the var (rather than calling
setLazyLoadVar), and to be an integration test.
Change-Id: I2594439acc786e31bce1cd4373d3cbf434204eda
While checking a filter, if a variable is not set (e.g. added_lines for
an account creation), the VariableHolder will return a DNULL, rather
than a DUNDEFINED. This means that some filters will resume working, and
the WMF servers will stop getting AF warnings at a rate of 4 millions per
day. This also requires adjusting some tests to reflect the new
behaviour (which is actually the OLD behaviour, that filters had until
last year when we introduced the DUNDEFINED data type). It also requires
adjusting a check in the old parser, but that's not really relevant
because the plan is to remove the old parser before 1.36 is released
(see I0e75f334c7e0dfc1239f2e5f5f7d7452b0bbf29e).
Bug: T230256
Change-Id: I4d06303047397674c1edbfc32628f1bc83ac3340
The problem is explained at T250570#6068702; basically, the previous
check didn't account for DUNDEFINED nested deep inside arrays.
Bug: T250570
Change-Id: Iacee2db54ca00108de6339bb3dae70af7e2eeb56
Using var_export for better visual effect, especially for arrays.
The result from /tools is much clearer and the 'wrong syntax' message is
a bit more explicative than before.
Bug: T190653
Bug: T239972
Change-Id: I79a17305c7f19f7900f896f895e9365bb5f2fd58
This was used to dynamically generate *_restriction_* variables.
However, it had two big problems:
- We only have i18n for 'create', 'move', 'edit', and 'upload' (the
default value of the global); other restrictions would show missing
messages in various pages.
- We had to access the global state in various points.
This change also makes some code in AbuseFilterVariableHolder simpler,
and also allows us to make AbuseFilterTest a unit test.
Change-Id: I321ad6e07f8243200af67a581b6e485970efd3ce
Also fix a couple of broken tests in Consequences:
- For createaccount, $user->addToDatabase must be called before
testForAccountCreation, or it will throw a CannotCreateActorException.
- In testThrottleLimit, also set wgAbuseFilterEmergencyDisableThreshold
to avoid relying on the local config.
Bug: T201193
Change-Id: If1a50b0a729e4d554485f2e2225d5877510966b6
Even if the array is DUNDEFINED, we need to check the offset to ensure
that it's valid.
Bug: T237351
Change-Id: Ibfa360c4ae1d80abe14d9fdf66991b76cb5954df
For the new parser, xhgui shows that AbuseFilterParser::getVarValue is
taking up a lot of time; in turn, most of the time spent inside
getVarValue is used to log the use of deprecated variables. Hence, given
that:
- We should keep the new parser performant
- There are tons of deprecated variables out there and they likely
won't be replaced
- Having gazillions of debugLog entries doesn't help
log them only in the cached phase.
Bug: T234427
Change-Id: I2bfc692c829c3cbe889e5076f5205e2c99097087
This is especially useful for old patches, created before the
introduction of FUNC_ARG_COUNT, where a rebase may break the parser.
Change-Id: Ib142438626a7305f102dc3e4cc9cb07ad33902b8
This bumps the level to WARN, and makes it very clear that people should
fix the affected filters. It also removes the calling method, which was
mostly meant for debugging purposes, and changes the type to 'op_type'
to avoid conflicting with type:mediawiki in logstash.
Bug: T156096
Change-Id: Ie73f1604e8ed82bc2e1be9fc90fa065be37889a3
Always run the keyword/function handler, even if there are DUNDEFINED
arguments, so that the handler can perform further validation on the
input and report any error to the user. However, replace DUNDEFINED with
DNULL before running the handler, to avoid special-casing DUNDEFINED in
every handler. If any argument was a DUNDEFINED, we will return
DUNDEFINED anyway.
Also centralize the keyword handling logic to a new method, like it
happens for functions.
Bug: T234339
Change-Id: I875cb77418a39790e91fe5867c49917bfe406ed4
This emits its own error because:
1- It's clearer to understand
2- It's easier to find where we're dealing with negative offsets, if
we'll ever want to allow that.
Note that trying to use a negative index already results in a hard PHP
error being thrown.
Bug: T237219
Change-Id: Ib11eaaca5e21f740269141c75e62bac48093e8d0
This is similar to the old parser: when discarding a node, actually
evaluate it if short-circuit is not allowed.
Add a whole lot of tests for all possible exceptions.
Move the logic to extract a message from an AFPUserVisibleException away
from the parser, to keep unit tests working.
Bug: T232498
Change-Id: I31ee4e255c6a87dd693b9bcd582539fdf57acd45
This implements T230982#5475400, and it should speed up the CachingParser by roughly 40%.
Bug: T230982
Change-Id: I803cc58637d50eb90e57decf243f5ca78075d63d
This is because there are many filters using this feature. Moreover, it
could make it a little easier to add new arguments, just like dangling
commas in PHP arrays do.
Also re-align the CachingParser code of doLevelFunctions to the one in
the old Parser.
Bug: T153251
Change-Id: Ie4325159f47310788da57415a5e36e62aa4efad0
This will help mitigating problems like T230256 by enforcing that the
requested variables must exist. For now, it will only log bad usages,
thus providing a way to identify affected filters and fix them.
Bug: T230256
Change-Id: I7a61916576e444a56f0e07da7b6e5033346226bd
Using `new LanguageEn()` involved a global, so use a MockObject instead.
Also fix LoggerFactory usage in Tokenizer to use DI instead.
Change-Id: I94d03f9459ab6444e239386eb96a0c2434bfe3dc
IMHO these can be considered unit tests; they were already fast, but now
they're executed in an instant.
This requires several changes: 1 - delay retrieving messages in
AFPUserVisibleException, to avoid having to deal with i18n whenever we
want to test exceptions; 2 - Use some DI for Parser and Tokenizer.
Equivset-dependend tests are also moved to a new class, thus helping to
fix the AF part of T189560.
Change-Id: If4585bf9bb696857005cf40a0d6985c36ac7e7a8
Some of these are transformed into real unit tests, while the
AbuseFilterSaveTest class is refactored to avoid using the DB and to use
a lot more of mocks and DI.
Depends-On: I22743557e162fd23b3b4e52951a649d8c21109c8
Change-Id: Id8412e2b8a4e873fd4821ecc1a3c95710be9a870