This allows a little bit more of abstraction: we can store other data in the
tree, without having to store it in a specific node (e.g. the variables map,
which is still unused). It also adds a few typehints, and specializes
the return value of eval'ing the AST: previously, it was the one of
evalNode, which wasn't guaranteed to be an AFPData. Now we have this
guarantee. Last but not least, we can now measure runtime metrics for
evalTree, which doesn't recurse.
Bonus: fix a check in the old parser, which used the wrong variable when
reporting outofbounds errors.
Change-Id: Iff806793b1d968e9bb6220f1459f3d0ac587c7da
And fix a couple of minor bugs.
Bug: T156096
Depends-On: I3b85087677607573f4fa68681735dc35348dcd87
Change-Id: Ia4c713a1d45827f6a8bc5566a8d8835c49f8108a
Ensure that the variable isn't set before marking it as DUNDEFINED:
that's only for when we cannot use a default, but if the variable is set
we already have one. Most notably, this fixes conditionals handling: right
now, if you have a conditional with an assignment in both
branches, the variable will be undefined. That's obviously wrong, so
it's fixed in this patch.
Plus: catch only AFPExceptions in a test to avoid unintentionally
catching the assert exception; simplify some assignments using wfSetVar.
Depends-On: I446a307e5395ea8cc8ec5ca5d5390b074bea2f24
Change-Id: I8e7f7710b8cb37ada8531b631456a3ce7b27ee45
This patch includes various fixes to how func arguments are handled in
CachingParser:
- Add a comment about a future improvement of checkSyntax, which we
could limit to try building the AST.
- Having enough args for each function is now also checked when
building the AST. This allows implementing the previous point without
stopping to report notenoughargs at syntaxcheck-time (otherwise it'd be
a runtime error). And it also ensure that we check for the params count
inside skipped branches, e.g. inside if/else: these were already only
discovered at runtime in CachingParser. The old parser is not affected
by this change, because when checking syntax it will always execute
all branches, and at runtime it will skip braces altogether.
- Fix arg count for CachingParser, which previously added a bogus param
in case of a function called without parameters. This was fixed for
the other parser in I484fe2994292970276150d2e417801453339e540, and I
just ported the updated fix. Also note that the CachingParser was
already failing for e.g. `count()`, but instead of complaining about
missing arguments, it failed hard when trying to pass NULL to
evalNode.
- Fixed some tests not to use setExpectedException, which caused the
previous point to remain unnoticed: calling that method prevents the
loop from continuing, and thus only the AbuseFilterParser part was
being executed. The new implementation checks the exception ID and is
thus more future-proof if the i18n message changes.
- Fixed some function names in error reporting for the old parser.
- The arg count is now checked outside of the function handlers, thus
it's no more necessary to call checkEnoughArguments at the beginning
of each handler. This also produces clearer error messages in case of
aliases (e.g. set/set_var).
- Check the args count even if some of the args are DUNDEFINED. This is
much easier now that the check is outside of the handler. This will
make syntax check fail for e.g. `contains_any(added_lines)`.
Bug: T156095
Change-Id: I446a307e5395ea8cc8ec5ca5d5390b074bea2f24
The regression itself was fixed in
I980aec3481a52ecc35f1811a366014a5581a7cdb, so this patch only adds a
test for it.
Also remove a comment about CachingParser failures: we don't want to
encourage people to remove it from tests anymore.
Bug: T152281
Change-Id: I3ad49050ea49bf45d3226878e091da3c8dbefdb1
Just like we do for functions, it doesn't really make sense to have
keywords separately, in AFPData.
Change-Id: I208a9b1ce2bd12038e9fbcc515c48d604ec80eb8
This is more complicated than the := operator, because the var name
could be a complicated expression, and we have to handle a function
call. This patch only covers the case where the variable name is a
literal, which is enough for WMF production.
Bug: T214674
Change-Id: I6c0f8e95663919a0235b5ccf0c88ad0a539315a7
As for all mostly unused consequences, blockautopromote has a couple of
major problems: first, it blocked the status for a random time between 3
and 7 days, which to me makes no sense at all (is it some sort of
casino?), and this patch fixes it to 5 days. Second, nothing was logged,
not the blocking nor the unblocking. Here I'm adding a LogHandler for
two new sub-actions of 'rights' to keep track of both action.
Bug: T49412
Change-Id: If48a48f5b8baaf9e77c0826466f5d03bb7f691d0
The last step of the profiling overhaul. See T53294 for the original description by Dragons flight.
Note: Here I'm adding a FixMe for a problem which already exists in the code
and the child patch will fix it.
Bug: T53294
Depends-On: I2d8c8f8278073a9420e3eb373fb89a655925618a
Change-Id: Ib12e072a245fcad93c6c6bd452041f3441f68bb7
I5ec4ab44c4e88aaf18c0d7b73355d27050beeda7 almost fixed this bug, but we
also have to make it possible to access builtin variables as arrays.
This will only make sense for a few variables (e.g. added_lines and
removed_lines), but I don't think we should validate it when checking
syntax.
Bug: T198531
Change-Id: I417e1b8d4802bbfccd091ce5c7617659cfd1e4ea
Instead of seconds, and round the average condition at 1dp instead of 0.
Split from child patch by Dragons flight.
Depends-On: I2d8c8f8278073a9420e3eb373fb89a655925618a
Change-Id: I339aed5f8c1d49714e7927ce49286f9ce6c839f5
They're currently stored separately, so move matches count together with
other per-filter data to keep it consistent. This also removes a
parameter from filterMatchesKey, as it's not needed anymore.
Split from child patch by Dragons flight.
Bug: T53294
Depends-On: I8f47beb73cfc1b63c4b3c809fc6d65a1e66ee334
Change-Id: I2d8c8f8278073a9420e3eb373fb89a655925618a
This patch includes:
* Making it possible to access offsets of a DNONE (returning a DNONE)
* Initializing user-defined variables as DNONE inside short-circuited branches
* Make DNONE propagate with other operators
* Make DNONE count as false for logic operators
* Remove a now-outaded bit in doLevelAtom. In case of shortcircuit,
$result is now DNONE instead of DNULL, and thus it's possible to
access offsets of it. Performance++!
* Don't allow modifying or adding an element of a DNONE as if it were an
array (to avoid inconsistencies)
This re-applies Id85c673337fa90a3782fd22eb9690cd996967111 with several fixes.
NOTE: Haven't tested locally, although I'm pretty confident thanks to
the amount of tests added.
Bug: T214674
Bug: T228677
Change-Id: I5ec4ab44c4e88aaf18c0d7b73355d27050beeda7
Currently we strongly abuse (pardon the pun) the AbuseFilter class: its
purpose should be to hold static functions intended as generic utility
functions (e.g. to format messages, determine whether a filter is global
etc.), but we actually use it for all methods related to running filters.
This patch creates a new class, AbuseFilterRunner, containing all such
methods, which have been made non-static. This leads to several
improvements (also for related methods and the parser), and opens the
way to further improve the code.
Aside from making the code prettier, less global and easier to test,
this patch could also produce a performance improvement, although I
don't have tools to measure that.
Also note that many public methods have been removed, and almost any of
them has been made protected; a couple of them (the ones used from outside)
are left for back-compat, and will be removed in the future.
Change-Id: I2eab2e50356eeb5224446ee2d0df9c787ae95b80
Added in I5a14d4b2bc3ffd9caaaa095f16f36b9b6009db05, but .r files aren't
used anymore since I6c06e596587750c4ebaabafbd277bc75eeb436a5, and I
forgot to remove the file upon rebasing.
Change-Id: Id688d215b1136bd0a04b8c0d8d8d16de5da1295e
This should allow more flexibility when checking syntax, and a saner
behaviour overall.
Aside from not throwing exception in certain cases, the results should
be almost equal to the ones you would get without this patch. However,
there are still a few things to improve (which for convenience I wrote
inside the parser test) and many to test.
Bug: T204654
Depends-On: I69bfec45c76509fb1112641393f78e8d8834adcd
Change-Id: I5a14d4b2bc3ffd9caaaa095f16f36b9b6009db05
Aside from the 14 thingy reported in the task, this syntax is awful! The
fix to the regex should only be intended as a temporary stopgap. A
proper fix would be to introduce a new syntax, like for instance the one
used in PHP.
Bug: T212726
Change-Id: Idc37a17ce539e6c63d67fc07d47d812569debe0e
This property is meant to be private, since it has all kinds of
getters/setters, aside from one which is introduced in this patch.
Change-Id: I217b1e22cabd3c0468c84b1d6a69a6ed3c6fa8e6
The current form is awkward. They're all like
[ actionname => [ 'action' => actionname, 'parameters' => params ] ]
This is greatly confusing since adds a nesting level, and just
duplicates the actionname information (also, we actually never retrieve
it from the internal array). Instead, change all of them to be
[ actionname => params ]
which is a lot shorter and clearer (and easier to handle).
A similar case is handled in I8134ecc41fbecdbed99faf406e9e3ca91b6123b9
(see PS 8..10).
Change-Id: I34c040dbeb3ab01158fb3db22496def6ccaf72d9
As explained on phabricator, they don't work with shortcircuit, so they
already fail for all filters using them. Plus IMHO it's an unnecessary
deviation from PHP's behaviour, given that this syntax doesn't do what
users may expect.
Bug: T218906
Change-Id: If9e7545e14044c8dc3b4163bb6fca8ab0683b9fa
Using the new PageEditStash class allows to simplify a bit the
integration tests for edit stashing. As I wrote in a ToDo, it may be
enough to manually run the hook, but that's left to do as a follow-up.
Change-Id: I3389a6961b4f39ecd980be2f429c23f8b7706a15
Instead of relying on static methods and members in the AbuseFilter
class, move everything related to conditions inside the Parser, as the
amount of used conditions is something pertaining a single
AbuseFilter(Caching)Parser instance.
This change requires changing some signatures and adding parameters,
but will make introducing the new AbuseFilterRunner class easier (and
that will clean signatures, too).
Depends-On: I5b29ff556eca45fe59d15e2e3df4d06f1f6b3934
Change-Id: I7c1ea17adf7f42cf9260d416906bfbf3b8a20688
Now it returns an array with a bit more info, and has a different name
to reflect the fact that its input is now split in two parts. Plus, make
it throw whenever it gets an unexpected input, and add a bunch of test
cases for it.
Depends-On: Ib5fdeb75c1324f672b4ded39681f006fde34b4d1
Change-Id: Ie550889495232b534c0f9aec31039cf21b2135b1
Partial revert of I4dd81a723e2bdb828b90594ad66a3918d8ec5b6c.
Thinking again of it, I think it's not worth it to have this data over
the network. Plus, given that it's not-that-slow to be computed, I think
there can only be a performance gain in using APC (as opposed to e.g.
memcached/redis) for 99.9% of the filters.
Change-Id: I8c6a4a95ec12c18ede8e6419540f7a2ac943457c
Added cachingParser back to *all* the parser tests, fixed a couple of
differences with the normal parser, and added a couple of tests so that
any cachingParser-related file has 100% coverage. Also move the remaining
get_matches tests inside parserTests, and specify the parser used in case of failure.
This also adds a new base class for parser-related tests with a couple
of util methods.
Bug: T201193
Change-Id: I980aec3481a52ecc35f1811a366014a5581a7cdb
Another crucial part to have covered. Also clarify that
AbuseFilterCentralDB can be of the form "dbname-prefix".
Remove a filter used for profiling and replace it with a global one:
we're still fine, and the list is kept shorter.
Bug: T201193
Depends-On: I5ee7ba44a6cd82a5ddb24fb4127af04d96e647f4
Change-Id: If6b91711534c0d60e1aa27bd5748c3023e29f376
Yet another important part to have covered. While for normal edits it
already works, for stashed ones it doesn't. That's why we need the patch
for checkAllFilters. Since for stashed edits profiling stats are all
zeros, this may explain T201334.
Changed the timestamp variable to use wfTimestamp instead of time() so
that we can fake it inside unit tests.
In a subsequent patch we should add average runtime conditions to tests
(really tricky).
Bug: T201193
Depends-On: Ib17821240b25c972a187e6b5eae42c5ada6c65e7
Change-Id: I5ee7ba44a6cd82a5ddb24fb4127af04d96e647f4
This is an important part to cover, and should be further expanded.
Also, fix a couple of minor things around, including making some methods
non-static.
Bug: T201193
Depends-On: I5e35d773904a62105767ce6d7d962ab5525c2d12
Change-Id: Ib17821240b25c972a187e6b5eae42c5ada6c65e7
This is fixing potential bugs where invalid strings with more than one
comma have silently been accepted.
Change-Id: Ib1e7d0c99973f243ef6faad6389bab688187c1cf
I find it obvious that a file called "AbuseFilterTokenizerTest" is a
"test for the AbuseFilterTokenizer class". A comment that is just
repeating this information is typicalls not helpful, but distracting
and a potential source of mistakes, e.g. when stuff is copy-pasted,
but the comment not adjusted.
Change-Id: I1d4cc06e9e5631955ff73bf675090cf9c33c9390
Split a method, use WAN cache so that we're enabled to use
getWithSetCallback, pass the "version" option there and adapt the test
to it.
Follow-up of I9b3bc36b552901bc6ca7609ee51e80be2979a9c4
Change-Id: I4dd81a723e2bdb828b90594ad66a3918d8ec5b6c
I didn't fix every case where this happens, just what blocks
I6ddcc9f34a48f997ae39b79cd2df40dd2cc10197 from landing.
Change-Id: I971e619eb76c4474fe037fad258f9c496717bf41
Caching the result of the tokenization is pretty important
performance-wise, so this test ensures that caching works as expected.
I have also extracted the method used to generate the cache key for
easier testing, and moved the cache instance to a class member because
otherwise that piece of code can't be tested...
Bug: T201193
Change-Id: I9b3bc36b552901bc6ca7609ee51e80be2979a9c4
These are the ones which other tests don't cover, mostly because no
filter syntax can trigger those cases. This patch should bring coverage
for AFPData to 100%.
Bug: T201193
Change-Id: I997576141943959d4602a9f839311108928ec766
Follow-up of Ic30883f7d261d974a2be46308d023e2714119e95, with two files
that I forgot to git-add and a repositioning of comments to avoid the
last bracket to be reported as uncovered.
Bug: T201193
Change-Id: I6bf7e5892a0f49f6a138792f0aedf230a70c18a8
This patch mostly adds coverageIgnore comments for intendedly
unreachable code etc. Some of them could be made testable by adding a new
filter function (e.g. array cast), but this patch is meant to be
comment-only (aside from the parser test).
Ignoring coverage for these lines makes some methods reach 100%
coverage, which in turn makes it easier to look at the coverage chart
and identify at a glance which parts of the code *really* need to be
covered.
Bug: T201193
Change-Id: Ic30883f7d261d974a2be46308d023e2714119e95
These are specific tests for storeVarDump and loadVarDump, both alone
and in the context of running filters.
Also, include disabled variables in the VariableHolder object if they're
saved in the DB.
Bug: T201193
Depends-On: Ia5c477edc8733bb1994cb6d01e1371ed496c8bcb
Change-Id: I5e35d773904a62105767ce6d7d962ab5525c2d12
If the User passed to $logEntry->setPerformer() represents a creatable
username, then it has to actually exist so the actor row can be created.
Bug: T188327
Change-Id: Iab2fc9593a020ffacd219d644103d685028e3336
Mostly delete result files and assume the result is always true. The few
exceptions were either moved to standalone test, or inverted.
Change-Id: I6c06e596587750c4ebaabafbd277bc75eeb436a5
The reasoning is similar to the one of the parent patch (Ia5c477edc8733bb1994cb6d01e1371ed496c8bcb). Plus, it records runtime metrics on action different than edits, as there's no reason not to do it.
No performance issues in production.
Bug: T191039
Depends-On: Ia5c477edc8733bb1994cb6d01e1371ed496c8bcb
Change-Id: Ib1112e2fefd0631550d386ba87e5f87db84c3036
This variable was introduced to selectively enable profiling because
stats recording was bad for performance. Nowadays, stats are recorded in
a deferredupdate and don't harm performance anymore. Thus, this variable
can be removed and profiling be enabled by default.
Bug: T191039
Depends-On: Ib5fdeb75c1324f672b4ded39681f006fde34b4d1
Change-Id: Ia5c477edc8733bb1994cb6d01e1371ed496c8bcb
Follow-up of I1721a3ba532d481e3ecf35f51099c1438b6b73b2. This is the only
wrong replacement: strict checking will let 5 / 0.0 pass, with
unexpected results. Adding a regression test for it, too.
Change-Id: I25dbe9fafa92fd9a11bd8bc6ab8e66f305b8d48e
Since double-equals are evil. I left some of them in place where I
wasn't sure, but I may be changed some which were intended to be
doubles. It could be a good idea to delay merging this patch until we'll
have more code coverage.
Change-Id: I1721a3ba532d481e3ecf35f51099c1438b6b73b2
If "tag" option is selected and the form is submitted without adding any
tag, just show it blank instead of adding an empty tag to the topbar.
Separately validate the empty tag case (and added a test for it).
Bug: T203353
Depends-On: I3b2e763bd8835207dc5df1db43d3e1881e6961c3
Change-Id: I8884b739fd17fa2eace5aac8775d3524aa606f1f
Adding PHPdocs to every class members, in every file. This patch only
touches comments, and moved properties on their own lines. Note that
some of these properties would need to be moved, somehow changed, or
just removed (either because they're old, unused leftovers, or just
because we can move them to local scope), but I wanted to keep this
patch doc-only.
Change-Id: I9fe701445bea8f09d82783789ff1ec537ac6704b
Remove all globals, make methods non-static, improve assertions and
computing some variables, add names to the tests and other minor
improvements.
Change-Id: Ifbcd9adf34d173d0da0aa568fc6f91fdc2d61609
Where prevents is used as a setter, use the new setter methods;
where it is used to determine whether a block blocks the target
from editing their talk page, use appliesToUsertalk.
Block::prevents was deprecated and replaced by several other
methods in I0e131696419211.
Bug: T211578
Change-Id: I166cc6f64c0f895ff8c631d2655c1c3208131371
The @expectedException annotation got deprecated in PHPUnit 7.5, and
removed in PHPUnit 8.0. This was done because the annotation does have
two disadvantages:
* The class name is encoded in string, where it is not easy to find for
all IDEs and tools.
* it did not allow to say exactly *when* the exception is expected.
Change-Id: I85f0b5f44b2f400a121115d402b64827ea534c32
Using break could halt parsing between operations, instead use continue
to parse all operations.
Bug: T214642
Change-Id: If67ddaffef280c2448c55ae536013758617bba68
For wgLang, there's a Language object available in the proximity, so just pass it.
For wgContLang, use MediaWikiServices.
Change-Id: Ic492007f2d5eeb8048d0919a4b9b7dd98c15c350