Commit graph

180 commits

Author SHA1 Message Date
rarohde d022377578 Merge global profiling keys
The last step of the profiling overhaul. See T53294 for the original description by Dragons flight.

Note: Here I'm adding a FixMe for a problem which already exists in the code
and the child patch will fix it.

Bug: T53294
Depends-On: I2d8c8f8278073a9420e3eb373fb89a655925618a
Change-Id: Ib12e072a245fcad93c6c6bd452041f3441f68bb7
2019-08-04 17:59:58 +00:00
Daimona Eaytoy c3db63714e Use milliseconds for time profiling
Instead of seconds, and round the average condition at 1dp instead of 0.
Split from child patch by Dragons flight.

Depends-On: I2d8c8f8278073a9420e3eb373fb89a655925618a
Change-Id: I339aed5f8c1d49714e7927ce49286f9ce6c839f5
2019-08-03 23:24:46 +00:00
Daimona Eaytoy 0b7902fe6e Move per-filter matches profiling to per-filter data
They're currently stored separately, so move matches count together with
other per-filter data to keep it consistent. This also removes a
parameter from filterMatchesKey, as it's not needed anymore.
Split from child patch by Dragons flight.

Bug: T53294
Depends-On: I8f47beb73cfc1b63c4b3c809fc6d65a1e66ee334
Change-Id: I2d8c8f8278073a9420e3eb373fb89a655925618a
2019-08-03 23:22:20 +00:00
Daimona Eaytoy a85e1ccc59 Make AbuseFilterParser::$funcCache non-static
Change-Id: I312efe3ce4d1f06e697aa4564aeec1bacbaf97d3
2019-08-03 09:19:49 +00:00
Daimona Eaytoy 09d0254172 Better handling of DNONE
This patch includes:
 * Making it possible to access offsets of a DNONE (returning a DNONE)
 * Initializing user-defined variables as DNONE inside short-circuited branches
 * Make DNONE propagate with other operators
 * Make DNONE count as false for logic operators
 * Remove a now-outaded bit in doLevelAtom. In case of shortcircuit,
   $result is now DNONE instead of DNULL, and thus it's possible to
   access offsets of it. Performance++!
 * Don't allow modifying or adding an element of a DNONE as if it were an
    array (to avoid inconsistencies)

This re-applies Id85c673337fa90a3782fd22eb9690cd996967111 with several fixes.

NOTE: Haven't tested locally, although I'm pretty confident thanks to
the amount of tests added.

Bug: T214674
Bug: T228677
Change-Id: I5ec4ab44c4e88aaf18c0d7b73355d27050beeda7
2019-08-02 21:05:08 +00:00
jenkins-bot e3e157361d Merge "Revert "Initialize user-defined variables during shortcircuit"" 2019-07-29 23:30:50 +00:00
Daimona Eaytoy 13cdb86dd2 Revert "Initialize user-defined variables during shortcircuit"
Reason for revert: T214674#5374806

This reverts commit 56e6117afd.

Bug: T214674
Change-Id: Iccce248d2693cd9877a740b74e72a577e730435e
2019-07-29 23:06:23 +00:00
Daimona Eaytoy 4720c97530 Add a new class for methods related to running filters
Currently we strongly abuse (pardon the pun) the AbuseFilter class: its
purpose should be to hold static functions intended as generic utility
functions (e.g. to format messages, determine whether a filter is global
etc.), but we actually use it for all methods related to running filters.
This patch creates a new class, AbuseFilterRunner, containing all such
methods, which have been made non-static. This leads to several
improvements (also for related methods and the parser), and opens the
way to further improve the code.
Aside from making the code prettier, less global and easier to test,
this patch could also produce a performance improvement, although I
don't have tools to measure that.
Also note that many public methods have been removed, and almost any of
them has been made protected; a couple of them (the ones used from outside)
are left for back-compat, and will be removed in the future.

Change-Id: I2eab2e50356eeb5224446ee2d0df9c787ae95b80
2019-07-23 19:06:27 +00:00
Daimona Eaytoy 56e6117afd Initialize user-defined variables during shortcircuit
Bug: T214674
Depends-On: I5a14d4b2bc3ffd9caaaa095f16f36b9b6009db05
Change-Id: Id85c673337fa90a3782fd22eb9690cd996967111
2019-07-23 12:20:53 +00:00
Daimona Eaytoy 9937f8b050 Remove extra file from parser tests
Added in I5a14d4b2bc3ffd9caaaa095f16f36b9b6009db05, but .r files aren't
used anymore since I6c06e596587750c4ebaabafbd277bc75eeb436a5, and I
forgot to remove the file upon rebasing.

Change-Id: Id688d215b1136bd0a04b8c0d8d8d16de5da1295e
2019-07-15 12:22:09 +02:00
Daimona Eaytoy 18d7d2ed62 Start using AFPData::DNONE
This should allow more flexibility when checking syntax, and a saner
behaviour overall.
Aside from not throwing exception in certain cases, the results should
be almost equal to the ones you would get without this patch. However,
there are still a few things to improve (which for convenience I wrote
inside the parser test) and many to test.

Bug: T204654
Depends-On: I69bfec45c76509fb1112641393f78e8d8834adcd
Change-Id: I5a14d4b2bc3ffd9caaaa095f16f36b9b6009db05
2019-07-14 08:48:47 +00:00
Daimona Eaytoy 7bc566e635 Fix the regex for numbers, start deprecation of non-decimal numbers
Aside from the 14 thingy reported in the task, this syntax is awful! The
fix to the regex should only be intended as a temporary stopgap. A
proper fix would be to introduce a new syntax, like for instance the one
used in PHP.

Bug: T212726
Change-Id: Idc37a17ce539e6c63d67fc07d47d812569debe0e
2019-07-10 13:26:36 +00:00
jenkins-bot 6f0905541a Merge "Make AbuseFilterVariableHolder::mVars private" 2019-07-09 08:42:16 +00:00
jenkins-bot 69bebbb4ff Merge "Simplify action arrays" 2019-07-08 23:07:26 +00:00
Daimona Eaytoy 304b58d46a Make AbuseFilterVariableHolder::mVars private
This property is meant to be private, since it has all kinds of
getters/setters, aside from one which is introduced in this patch.

Change-Id: I217b1e22cabd3c0468c84b1d6a69a6ed3c6fa8e6
2019-07-08 16:25:10 +02:00
Daimona Eaytoy d8d4750e6a Simplify action arrays
The current form is awkward. They're all like
[ actionname => [ 'action' => actionname, 'parameters' => params ] ]
This is greatly confusing since adds a nesting level, and just
duplicates the actionname information (also, we actually never retrieve
it from the internal array). Instead, change all of them to be
[ actionname => params ]
which is a lot shorter and clearer (and easier to handle).
A similar case is handled in I8134ecc41fbecdbed99faf406e9e3ca91b6123b9
(see PS 8..10).

Change-Id: I34c040dbeb3ab01158fb3db22496def6ccaf72d9
2019-07-05 10:00:48 +02:00
Aaron Schulz 2cf7b58434 Convert wfGetDB() calls to using getConnectionRef()
This handles the logic of calling reuseConnection() automatically

Change-Id: I9328e709fe5d81099338a31deef24d34db22d784
2019-07-04 15:09:32 -07:00
Daimona Eaytoy 7398730563 Disallow consecutive comparisons
As explained on phabricator, they don't work with shortcircuit, so they
already fail for all filters using them. Plus IMHO it's an unnecessary
deviation from PHP's behaviour, given that this syntax doesn't do what
users may expect.

Bug: T218906
Change-Id: If9e7545e14044c8dc3b4163bb6fca8ab0683b9fa
2019-07-04 19:15:07 +02:00
Daimona Eaytoy e86d4bc124 Simplify code for stashedEdits tests
Using the new PageEditStash class allows to simplify a bit the
integration tests for edit stashing. As I wrote in a ToDo, it may be
enough to manually run the hook, but that's left to do as a follow-up.

Change-Id: I3389a6961b4f39ecd980be2f429c23f8b7706a15
2019-06-24 11:13:59 +02:00
Daimona Eaytoy 382751a707 Move conditions-related stuff inside AbuseFilterParser
Instead of relying on static methods and members in the AbuseFilter
class, move everything related to conditions inside the Parser, as the
amount of used conditions is something pertaining a single
AbuseFilter(Caching)Parser instance.
This change requires changing some signatures and adding parameters,
but will make introducing the new AbuseFilterRunner class easier (and
that will clean signatures, too).

Depends-On: I5b29ff556eca45fe59d15e2e3df4d06f1f6b3934
Change-Id: I7c1ea17adf7f42cf9260d416906bfbf3b8a20688
2019-06-19 15:14:17 +00:00
petarpetkovic c02590f555 Fix "succesful" typo
Change-Id: Ibd92f6de8b03098e7bdc8c4fc5e3f6cfaba95bdf
2019-06-14 03:08:41 +03:00
Daimona Eaytoy e7cd4b2a98 Rewrite AbuseFilter::decodeGlobalName
Now it returns an array with a bit more info, and has a different name
to reflect the fact that its input is now split in two parts. Plus, make
it throw whenever it gets an unexpected input, and add a bunch of test
cases for it.

Depends-On: Ib5fdeb75c1324f672b4ded39681f006fde34b4d1
Change-Id: Ie550889495232b534c0f9aec31039cf21b2135b1
2019-06-12 23:56:25 +00:00
Thalia 22ceae7e23 Use MediaWiki\Block\DatabaseBlock instead of Block
This follows the rename of the Block class in I6d96b63ca0.

Change-Id: I44cf9eb68c23a8299316effa4dee7f732486dd84
2019-05-31 16:08:19 +01:00
jenkins-bot 369ce36be7 Merge "Tokenizer caching back to APC" 2019-05-28 19:10:22 +00:00
Daimona Eaytoy 53f03e5301 Tokenizer caching back to APC
Partial revert of I4dd81a723e2bdb828b90594ad66a3918d8ec5b6c.
Thinking again of it, I think it's not worth it to have this data over
the network. Plus, given that it's not-that-slow to be computed, I think
there can only be a performance gain in using APC (as opposed to e.g.
memcached/redis) for 99.9% of the filters.

Change-Id: I8c6a4a95ec12c18ede8e6419540f7a2ac943457c
2019-05-28 19:48:26 +02:00
jenkins-bot 112787020d Merge "Support for PermissionManager changes at https://gerrit.wikimedia.org/r/c/mediawiki/core/+/502484" 2019-05-28 11:24:53 +00:00
Vedmaka f293b3b7be Support for PermissionManager changes at
https://gerrit.wikimedia.org/r/c/mediawiki/core/+/502484

Change-Id: I258f02e286b6ba0387e1bff540a744fafb03dc55
2019-05-28 09:33:28 +00:00
Daimona Eaytoy 4b5c7c0198 Add missing covers/group tags
Including CachingParser as follow-up of
I980aec3481a52ecc35f1811a366014a5581a7cdb.

Bug: T201193
Change-Id: I9905efb2b2e61b330c42275c9ccfab2f24750bd4
2019-05-25 14:10:07 +02:00
Daimona Eaytoy 39fc7c12af Restore unit tests for CachingParser and fix it
Added cachingParser back to *all* the parser tests, fixed a couple of
differences with the normal parser, and added a couple of tests so that
any cachingParser-related file has 100% coverage. Also move the remaining
get_matches tests inside parserTests, and specify the parser used in case of failure.
This also adds a new base class for parser-related tests with a couple
of util methods.

Bug: T201193
Change-Id: I980aec3481a52ecc35f1811a366014a5581a7cdb
2019-05-25 10:55:24 +02:00
jenkins-bot 1cb80be0ad Merge "Add tests for various data type casts" 2019-05-24 19:19:20 +00:00
jenkins-bot 058e215882 Merge "Refactor tokenizer caching" 2019-05-24 19:09:03 +00:00
Daimona Eaytoy f56562f583 Add tests for global filters
Another crucial part to have covered. Also clarify that
AbuseFilterCentralDB can be of the form "dbname-prefix".

Remove a filter used for profiling and replace it with a global one:
we're still fine, and the list is kept shorter.

Bug: T201193
Depends-On: I5ee7ba44a6cd82a5ddb24fb4127af04d96e647f4
Change-Id: If6b91711534c0d60e1aa27bd5748c3023e29f376
2019-05-24 16:58:23 +02:00
Daimona Eaytoy b3707106e9 Reset MWTimestamp in tearDown
Follow-up of I5ee7ba44a6cd82a5ddb24fb4127af04d96e647f4.

Change-Id: Icf288d7c4a9d087e7e1cd8a6e8c8cc9dac20e532
2019-05-24 16:54:29 +02:00
Daimona Eaytoy a766e39ade Add unit tests for profiling
Yet another important part to have covered. While for normal edits it
already works, for stashed ones it doesn't. That's why we need the patch
for checkAllFilters. Since for stashed edits profiling stats are all
zeros, this may explain T201334.
Changed the timestamp variable to use wfTimestamp instead of time() so
that we can fake it inside unit tests.
In a subsequent patch we should add average runtime conditions to tests
(really tricky).

Bug: T201193
Depends-On: Ib17821240b25c972a187e6b5eae42c5ada6c65e7
Change-Id: I5ee7ba44a6cd82a5ddb24fb4127af04d96e647f4
2019-05-23 08:47:40 +00:00
Daimona Eaytoy 00b9791349 Add unit tests for stashed edits
This is an important part to cover, and should be further expanded.
Also, fix a couple of minor things around, including making some methods
non-static.

Bug: T201193
Depends-On: I5e35d773904a62105767ce6d7d962ab5525c2d12
Change-Id: Ib17821240b25c972a187e6b5eae42c5ada6c65e7
2019-05-23 08:47:25 +00:00
jenkins-bot c52850aae7 Merge "Add missing limits to explode() calls" 2019-05-15 15:06:18 +00:00
Thiemo Kreuz c6f20a64dd Add missing limits to explode() calls
This is fixing potential bugs where invalid strings with more than one
comma have silently been accepted.

Change-Id: Ib1e7d0c99973f243ef6faad6389bab688187c1cf
2019-05-15 16:14:12 +02:00
Thiemo Kreuz fa3ce90851 Remove comments literally repeating what the code says
I find it obvious that a file called "AbuseFilterTokenizerTest" is a
"test for the AbuseFilterTokenizer class". A comment that is just
repeating this information is typicalls not helpful, but distracting
and a potential source of mistakes, e.g. when stuff is copy-pasted,
but the comment not adjusted.

Change-Id: I1d4cc06e9e5631955ff73bf675090cf9c33c9390
2019-05-15 16:04:32 +02:00
Thalia f23905c402 Remove call to deprecated User::isBlocked
Change-Id: Ibb7412f8aa08a745a211b9b0581ccb6b0ca9eff5
2019-05-14 13:14:57 +01:00
Daimona Eaytoy 2276d8ed2a Refactor tokenizer caching
Split a method, use WAN cache so that we're enabled to use
getWithSetCallback, pass the "version" option there and adapt the test
to it.
Follow-up of I9b3bc36b552901bc6ca7609ee51e80be2979a9c4

Change-Id: I4dd81a723e2bdb828b90594ad66a3918d8ec5b6c
2019-04-23 19:38:10 +02:00
jenkins-bot 968bd9b817 Merge "Add tests for tokenizer caching" 2019-04-17 23:27:19 +00:00
Aryeh Gregor b222330a61 Don't try to move onto an existing page in tests
I didn't fix every case where this happens, just what blocks
I6ddcc9f34a48f997ae39b79cd2df40dd2cc10197 from landing.

Change-Id: I971e619eb76c4474fe037fad258f9c496717bf41
2019-04-17 17:23:23 +03:00
Daimona Eaytoy 4b10a544ab Add tests for tokenizer caching
Caching the result of the tokenization is pretty important
performance-wise, so this test ensures that caching works as expected.
I have also extracted the method used to generate the cache key for
easier testing, and moved the cache instance to a class member because
otherwise that piece of code can't be tested...

Bug: T201193
Change-Id: I9b3bc36b552901bc6ca7609ee51e80be2979a9c4
2019-04-15 16:59:55 +02:00
Daimona Eaytoy ec110c657b Add tests for various data type casts
These are the ones which other tests don't cover, mostly because no
filter syntax can trigger those cases. This patch should bring coverage
for AFPData to 100%.

Bug: T201193
Change-Id: I997576141943959d4602a9f839311108928ec766
2019-04-14 14:08:57 +02:00
Daimona Eaytoy 909eec6716 Tweak coverage part 2
Follow-up of Ic30883f7d261d974a2be46308d023e2714119e95, with two files
that I forgot to git-add and a repositioning of comments to avoid the
last bracket to be reported as uncovered.

Bug: T201193
Change-Id: I6bf7e5892a0f49f6a138792f0aedf230a70c18a8
2019-04-13 19:26:01 +02:00
Daimona Eaytoy 4bcb64b01a Increase code coverage a bit
This patch mostly adds coverageIgnore comments for intendedly
unreachable code etc. Some of them could be made testable by adding a new
filter function (e.g. array cast), but this patch is meant to be
comment-only (aside from the parser test).
Ignoring coverage for these lines makes some methods reach 100%
coverage, which in turn makes it easier to look at the coverage chart
and identify at a glance which parts of the code *really* need to be
covered.

Bug: T201193
Change-Id: Ic30883f7d261d974a2be46308d023e2714119e95
2019-04-13 18:30:14 +02:00
Daimona Eaytoy 8293ec176f Add tests for storing and loading the variables dump
These are specific tests for storeVarDump and loadVarDump, both alone
and in the context of running filters.
Also, include disabled variables in the VariableHolder object if they're
saved in the DB.

Bug: T201193
Depends-On: Ia5c477edc8733bb1994cb6d01e1371ed496c8bcb
Change-Id: I5e35d773904a62105767ce6d7d962ab5525c2d12
2019-04-12 08:03:33 +00:00
jenkins-bot c0da9ff3ac Merge "Clean AbuseFilterParserTests" 2019-04-11 21:46:50 +00:00
Brad Jorsch b59f19d675 AbuseFilterTest: Don't use $wgUser when creating pages
Which means we have to pass a user to WikiPage::doEditContent().

Follows up Ifbcd9adf3.

Change-Id: I1bd0288cc132627d75b4001219522ec5e952eda7
2019-04-09 12:25:34 -04:00
jenkins-bot cc670f0a07 Merge "Clean the AbuseFilterTest class" 2019-04-06 14:47:52 +00:00