601 commits

Author SHA1 Message Date
jenkins-bot 1cb80be0ad Merge "Add tests for various data type casts" 2019-05-24 19:19:20 +00:00
jenkins-bot 058e215882 Merge "Refactor tokenizer caching" 2019-05-24 19:09:03 +00:00
Daimona Eaytoy f56562f583 Add tests for global filters
Another crucial part to have covered. Also clarify that
AbuseFilterCentralDB can be of the form "dbname-prefix".

Remove a filter used for profiling and replace it with a global one:
we're still fine, and the list is kept shorter.

Bug: T201193
Depends-On: I5ee7ba44a6cd82a5ddb24fb4127af04d96e647f4
Change-Id: If6b91711534c0d60e1aa27bd5748c3023e29f376
2019-05-24 16:58:23 +02:00
Daimona Eaytoy b3707106e9 Reset MWTimestamp in tearDown
Follow-up of I5ee7ba44a6cd82a5ddb24fb4127af04d96e647f4.

Change-Id: Icf288d7c4a9d087e7e1cd8a6e8c8cc9dac20e532
2019-05-24 16:54:29 +02:00
Daimona Eaytoy a766e39ade Add unit tests for profiling
Yet another important part to have covered. While for normal edits it
already works, for stashed ones it doesn't. That's why we need the patch
for checkAllFilters. Since for stashed edits profiling stats are all
zeros, this may explain T201334.
Changed the timestamp variable to use wfTimestamp instead of time() so
that we can fake it inside unit tests.
In a subsequent patch we should add average runtime conditions to tests
(really tricky).

Bug: T201193
Depends-On: Ib17821240b25c972a187e6b5eae42c5ada6c65e7
Change-Id: I5ee7ba44a6cd82a5ddb24fb4127af04d96e647f4
2019-05-23 08:47:40 +00:00
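To illustrate why a766e39ade swaps time() for wfTimestamp, here is a minimal Python sketch of a clock that tests can override. The names set_fake_time and current_timestamp are hypothetical and only model the idea; they are not MediaWiki APIs.

```python
import time

# Module-level override; None means "use the real clock".
_fake_time = None


def set_fake_time(timestamp):
    """Pin the clock to a fixed Unix timestamp, or pass None to reset it."""
    global _fake_time
    _fake_time = timestamp


def current_timestamp():
    """Return the pinned time if one is set, otherwise the real time."""
    if _fake_time is not None:
        return _fake_time
    return int(time.time())
```

A test would pin the clock during setup and call set_fake_time(None) in its teardown, which is the same concern b3707106e9 addresses for MWTimestamp.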
Daimona Eaytoy 00b9791349 Add unit tests for stashed edits
This is an important part to cover, and should be further expanded.
Also, fix a couple of minor things around, including making some methods
non-static.

Bug: T201193
Depends-On: I5e35d773904a62105767ce6d7d962ab5525c2d12
Change-Id: Ib17821240b25c972a187e6b5eae42c5ada6c65e7
2019-05-23 08:47:25 +00:00
jenkins-bot c52850aae7 Merge "Add missing limits to explode() calls" 2019-05-15 15:06:18 +00:00
Thiemo Kreuz c6f20a64dd Add missing limits to explode() calls
This is fixing potential bugs where invalid strings with more than one
comma have silently been accepted.

Change-Id: Ib1e7d0c99973f243ef6faad6389bab688187c1cf
2019-05-15 16:14:12 +02:00
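The effect of the missing limit in c6f20a64dd can be sketched in Python, where the maxsplit argument of str.split plays roughly the role of the third argument to PHP's explode(). Both helpers and the "count,period" input are hypothetical; this only illustrates how an unbounded split can silently accept extra commas.

```python
def parse_pair_unbounded(raw):
    """Split on every comma and keep only the first two parts."""
    parts = raw.split(",")
    # Anything after the second part is silently dropped.
    return parts[0], parts[1]


def parse_pair_limited(raw):
    """Split at the first comma only, so extra commas stay visible."""
    parts = raw.split(",", 1)  # at most two parts
    if len(parts) != 2 or "," in parts[1]:
        raise ValueError(f"expected exactly one comma in {raw!r}")
    return parts[0], parts[1]
```

With the limit in place, a stray second comma ends up inside the last part, where validation can reject it instead of ignoring it.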
Thiemo Kreuz fa3ce90851 Remove comments literally repeating what the code says
I find it obvious that a file called "AbuseFilterTokenizerTest" is a
"test for the AbuseFilterTokenizer class". A comment that is just
repeating this information is typically not helpful, but distracting
and a potential source of mistakes, e.g. when stuff is copy-pasted,
but the comment not adjusted.

Change-Id: I1d4cc06e9e5631955ff73bf675090cf9c33c9390
2019-05-15 16:04:32 +02:00
Thalia f23905c402 Remove call to deprecated User::isBlocked
Change-Id: Ibb7412f8aa08a745a211b9b0581ccb6b0ca9eff5
2019-05-14 13:14:57 +01:00
Daimona Eaytoy 2276d8ed2a Refactor tokenizer caching
Split a method, use WAN cache so that we can use
getWithSetCallback, pass the "version" option there and adapt the test
to it.
Follow-up of I9b3bc36b552901bc6ca7609ee51e80be2979a9c4

Change-Id: I4dd81a723e2bdb828b90594ad66a3918d8ec5b6c
2019-04-23 19:38:10 +02:00
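The get-or-compute pattern with a "version" option, as mentioned in 2276d8ed2a, can be modeled in a few lines of Python. This is a simplified in-memory stand-in written for illustration, not the WAN cache implementation; expiry and everything else a real cache handles are left out, and the class and method names are assumptions.

```python
class VersionedCache:
    """Tiny in-memory stand-in for a cache with versioned entries."""

    def __init__(self):
        self._entries = {}  # key -> (version, value)

    def get_with_set_callback(self, key, callback, version):
        """Return the cached value, recomputing on a miss or version change."""
        entry = self._entries.get(key)
        if entry is not None and entry[0] == version:
            return entry[1]
        value = callback()
        self._entries[key] = (version, value)
        return value
```

Bumping the version makes entries written under the old version count as misses, so a change in the token format does not serve stale results.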
jenkins-bot 968bd9b817 Merge "Add tests for tokenizer caching" 2019-04-17 23:27:19 +00:00
Aryeh Gregor b222330a61 Don't try to move onto an existing page in tests
I didn't fix every case where this happens, just what blocks
I6ddcc9f34a48f997ae39b79cd2df40dd2cc10197 from landing.

Change-Id: I971e619eb76c4474fe037fad258f9c496717bf41
2019-04-17 17:23:23 +03:00
Daimona Eaytoy 4b10a544ab Add tests for tokenizer caching
Caching the result of the tokenization is pretty important
performance-wise, so this test ensures that caching works as expected.
I have also extracted the method used to generate the cache key for
easier testing, and moved the cache instance to a class member because
otherwise that piece of code can't be tested...

Bug: T201193
Change-Id: I9b3bc36b552901bc6ca7609ee51e80be2979a9c4
2019-04-15 16:59:55 +02:00
Daimona Eaytoy ec110c657b Add tests for various data type casts
These are the ones which other tests don't cover, mostly because no
filter syntax can trigger those cases. This patch should bring coverage
for AFPData to 100%.

Bug: T201193
Change-Id: I997576141943959d4602a9f839311108928ec766
2019-04-14 14:08:57 +02:00
Daimona Eaytoy 909eec6716 Tweak coverage part 2
Follow-up of Ic30883f7d261d974a2be46308d023e2714119e95, with two files
that I forgot to git-add and a repositioning of comments to keep the
last bracket from being reported as uncovered.

Bug: T201193
Change-Id: I6bf7e5892a0f49f6a138792f0aedf230a70c18a8
2019-04-13 19:26:01 +02:00
Daimona Eaytoy 4bcb64b01a Increase code coverage a bit
This patch mostly adds coverageIgnore comments for intentionally
unreachable code etc. Some of them could be made testable by adding a new
filter function (e.g. array cast), but this patch is meant to be
comment-only (aside from the parser test).
Ignoring coverage for these lines makes some methods reach 100%
coverage, which in turn makes it easier to look at the coverage chart
and identify at a glance which parts of the code *really* need to be
covered.

Bug: T201193
Change-Id: Ic30883f7d261d974a2be46308d023e2714119e95
2019-04-13 18:30:14 +02:00
Daimona Eaytoy 8293ec176f Add tests for storing and loading the variables dump
These are specific tests for storeVarDump and loadVarDump, both alone
and in the context of running filters.
Also, include disabled variables in the VariableHolder object if they're
saved in the DB.

Bug: T201193
Depends-On: Ia5c477edc8733bb1994cb6d01e1371ed496c8bcb
Change-Id: I5e35d773904a62105767ce6d7d962ab5525c2d12
2019-04-12 08:03:33 +00:00
jenkins-bot c0da9ff3ac Merge "Clean AbuseFilterParserTests" 2019-04-11 21:46:50 +00:00
Brad Jorsch b59f19d675 AbuseFilterTest: Don't use $wgUser when creating pages
Which means we have to pass a user to WikiPage::doEditContent().

Follows up Ifbcd9adf3.

Change-Id: I1bd0288cc132627d75b4001219522ec5e952eda7
2019-04-09 12:25:34 -04:00
jenkins-bot cc670f0a07 Merge "Clean the AbuseFilterTest class" 2019-04-06 14:47:52 +00:00
jenkins-bot efe32b7c93 Merge "Add doc for every class member" 2019-04-06 14:37:19 +00:00
jenkins-bot d53c84da36 Merge "Restore check for dividebyzero" 2019-04-06 12:35:23 +00:00
jenkins-bot e03488b66a Merge "Overhaul tag selector" 2019-04-06 12:35:20 +00:00
Brad Jorsch 5ace1121b0 Actually create user in AbuseFilterConsequencesTest
If the User passed to $logEntry->setPerformer() represents a creatable
username, then it has to actually exist so the actor row can be created.

Bug: T188327
Change-Id: Iab2fc9593a020ffacd219d644103d685028e3336
2019-04-05 12:35:25 -04:00
Daimona Eaytoy 0ff581e246 Clean AbuseFilterParserTests
Mostly delete result files and assume the result is always true. The few
exceptions were either moved to standalone test, or inverted.

Change-Id: I6c06e596587750c4ebaabafbd277bc75eeb436a5
2019-03-23 12:59:03 +01:00
Daimona Eaytoy 72c2be7a18 Remove $wgAbuseFilterRuntimeProfiling
The reasoning is similar to that of the parent patch (Ia5c477edc8733bb1994cb6d01e1371ed496c8bcb). Plus, it records runtime metrics on actions other than edits, as there's no reason not to do it.
No performance issues in production.

Bug: T191039
Depends-On: Ia5c477edc8733bb1994cb6d01e1371ed496c8bcb
Change-Id: Ib1112e2fefd0631550d386ba87e5f87db84c3036
2019-03-23 11:31:18 +00:00
Daimona Eaytoy 89520e2353 Remove $wgAbuseFilterProfiling
This variable was introduced to selectively enable profiling because
stats recording was bad for performance. Nowadays, stats are recorded in
a deferredupdate and don't harm performance anymore. Thus, this variable
can be removed and profiling be enabled by default.

Bug: T191039
Depends-On: Ib5fdeb75c1324f672b4ded39681f006fde34b4d1
Change-Id: Ia5c477edc8733bb1994cb6d01e1371ed496c8bcb
2019-03-23 11:31:11 +00:00
Daimona Eaytoy 9144f20245 Restore check for dividebyzero
Follow-up of I1721a3ba532d481e3ecf35f51099c1438b6b73b2. This is the only
wrong replacement: strict checking will let 5 / 0.0 pass, with
unexpected results. Adding a regression test for it, too.

Change-Id: I25dbe9fafa92fd9a11bd8bc6ab8e66f305b8d48e
2019-03-23 11:38:39 +01:00
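Why a strict comparison lets 5 / 0.0 through, as 9144f20245 describes, can be shown with a small Python model. The two predicates imitate a type-strict and a value-only comparison against integer zero; all names, including the DivideByZero exception, are hypothetical.

```python
class DivideByZero(Exception):
    """Raised when the divisor is numerically zero."""


def is_zero_strict(value):
    """Model of a type-strict check: only the integer 0 matches."""
    return type(value) is int and value == 0


def is_zero_loose(value):
    """Model of a value-only check: 0 and 0.0 both match."""
    return value == 0


def safe_divide(a, b):
    """Divide a by b, rejecting any zero divisor regardless of its type."""
    if is_zero_loose(b):
        raise DivideByZero("division by zero")
    return a / b
```

The strict predicate returns False for 0.0, so a guard built on it would not catch a float divisor of zero.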
Daimona Eaytoy f2c1beec44 Replace double-equals with triple-equals
Since double-equals are evil. I left some of them in place where I
wasn't sure, but I may have changed some which were intended to be
doubles. It could be a good idea to delay merging this patch until we'll
have more code coverage.

Change-Id: I1721a3ba532d481e3ecf35f51099c1438b6b73b2
2019-03-22 16:12:13 +01:00
Daimona Eaytoy d6c649bb0d Overhaul tag selector
If "tag" option is selected and the form is submitted without adding any
tag, just show it blank instead of adding an empty tag to the topbar.
Separately validate the empty tag case (and added a test for it).

Bug: T203353
Depends-On: I3b2e763bd8835207dc5df1db43d3e1881e6961c3
Change-Id: I8884b739fd17fa2eace5aac8775d3524aa606f1f
2019-03-17 14:04:50 +00:00
Daimona Eaytoy bedbe36744 Add doc for every class member
Adding PHPdocs to every class member, in every file. This patch only
touches comments, and moved properties on their own lines. Note that
some of these properties would need to be moved, somehow changed, or
just removed (either because they're old, unused leftovers, or just
because we can move them to local scope), but I wanted to keep this
patch doc-only.

Change-Id: I9fe701445bea8f09d82783789ff1ec537ac6704b
2019-03-17 11:40:24 +01:00
jenkins-bot 3f3e98fbc5 Merge "Fix shortcircuit for consecutive operations" 2019-03-17 10:04:14 +00:00
Daimona Eaytoy 683e94cdd3 Clean the AbuseFilterTest class
Remove all globals, make methods non-static, improve assertions and
computing some variables, add names to the tests and other minor
improvements.

Change-Id: Ifbcd9adf34d173d0da0aa568fc6f91fdc2d61609
2019-03-17 11:04:10 +01:00
jenkins-bot e2f1880922 Merge "Don't use wgLang and wgContLang" 2019-03-17 09:53:16 +00:00
jenkins-bot 65a4c26804 Merge "Remove exclusions for Generic.Files.LineLength" 2019-03-17 09:49:38 +00:00
Kunal Mehta 577f4dab93 Migrate to new phan
Bug: T216904
Change-Id: I30864bd3d7f9b9ab674bf6589cd9e5e3aed5bb8d
2019-03-16 09:41:23 +00:00
Daimona Eaytoy dd4b579695 Remove exclusions for Generic.Files.LineLength
Keep it only for filters definitions in ConsequencesTests.

Change-Id: I305c7f496a29b20a3ee1d34479d1e4cb9252060a
2019-02-23 10:12:07 +01:00
Thalia 540a557a59 Replace calls to deprecated Block::prevents
Where prevents is used as a setter, use the new setter methods;
where it is used to determine whether a block blocks the target
from editing their talk page, use appliesToUsertalk.

Block::prevents was deprecated and replaced by several other
methods in I0e131696419211.

Bug: T211578
Change-Id: I166cc6f64c0f895ff8c631d2655c1c3208131371
2019-02-22 19:29:02 +00:00
Thiemo Kreuz 3993a7ea15 Replace @expectedException with $this->expectException()
The @expectedException annotation got deprecated in PHPUnit 7.5, and
removed in PHPUnit 8.0. This was done because the annotation does have
two disadvantages:
* The class name is encoded in string, where it is not easy to find for
  all IDEs and tools.
* It did not allow saying exactly *when* the exception is expected.

Change-Id: I85f0b5f44b2f400a121115d402b64827ea534c32
2019-02-19 10:58:16 +01:00
Daimona Eaytoy 6f4bfc9597 Fix shortcircuit for consecutive operations
Using break could halt parsing between operations, instead use continue
to parse all operations.

Bug: T214642
Change-Id: If67ddaffef280c2448c55ae536013758617bba68
2019-02-08 17:55:59 +00:00
Tim Starling c889c2990c In tests that create users, add 'user' to $this->tablesUsed
Change-Id: I7d2c6b304974d487e1b7727f594d0843ff080a7d
2019-02-08 16:40:17 +11:00
Daimona Eaytoy 51120e51c5 Don't use wgLang and wgContLang
For wgLang, there's a Language object available in the proximity, so just pass it.
For wgContLang, use MediaWikiServices.

Change-Id: Ic492007f2d5eeb8048d0919a4b9b7dd98c15c350
2019-02-06 12:00:44 +01:00
jenkins-bot 15a8340ee1 Merge "Reject empty warning and disallow messages when validating a filter" 2019-01-31 21:28:17 +00:00
Daimona Eaytoy 0f041e8282 Split AbuseFilterConsequencesTest tests in several methods
This makes the code easier to maintain and more flexible, plus adds
several tests. Some flaky tests are also improved.

Depends-On: I57ce67c5202c8574fcf1957999a6999fec264cb7
Change-Id: Ibb5322bca93b464e9014b53644c04f2bc1141e72
2019-01-23 21:26:25 +00:00
Daimona Eaytoy 26b783f062 Use data provider's array keys to specify test description
We just passed the description as a parameter, but it's much quicker to
use it as the key in the data provider: PHPUnit will automatically
display it in case of failure, so that we don't have to do that
manually (and still get messages like "failed with data set #7").

Depends-On: I8edcca17ecdcf71397cc9b0d101e8b13ac112047
Change-Id: I57ce67c5202c8574fcf1957999a6999fec264cb7
2019-01-23 21:26:17 +00:00
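The idea in 26b783f062 of using the data set's key as its description can be mirrored in plain Python. This is only an analogy to a PHPUnit data provider, not PHPUnit itself; check_cases and the sample cases are made up for illustration.

```python
def check_cases(func, cases):
    """Run func on each named case and return the names of the failures."""
    failures = []
    for name, (argument, expected) in cases.items():
        if func(argument) != expected:
            failures.append(name)
    return failures


# Keys double as human-readable descriptions of each data set.
CASES = {
    "empty string": ("", 0),
    "single word": ("filter", 1),
    "two words": ("abuse filter", 2),
}
```

A failure is then reported as "two words" rather than by position, which is the readability gain the commit is after.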
Daimona Eaytoy 0e6b783ed4 Reject empty warning and disallow messages when validating a filter
Right now, we allow empty messages, and when the "warn" action is
executed we use "abusefilter-warning" if no message is specified.
However, this also produces a PHP notice while editing a filter with
empty message (see Phab). With this patch, empty messages will be
rejected, and a follow-up will be discussed on Phab.

Update: added disallow message as follow-up of
Ic1de03a6944c43a346fa317ee0a217551f0d284a.

Bug: T203353
Depends-On: I8df247f61d9f3769e9580544f324dd174811e939
Change-Id: I71b1f81d10c02de4de141b1ab9b630d05cf4619c
2019-01-21 14:06:54 +01:00
jenkins-bot df2da23d29 Merge "Add unit tests for custom disallow messages" 2019-01-19 12:21:02 +00:00
jenkins-bot b44984c50a Merge "Remove unused stuff" 2019-01-19 12:18:22 +00:00
jenkins-bot 575646393b Merge "Improve code readability" 2019-01-19 12:11:06 +00:00
jenkins-bot a2bee3bcf3 Merge "Simplify parser methods" 2019-01-19 12:11:04 +00:00
jenkins-bot 0d4e982069 Merge "Reduce code duplication" 2019-01-19 12:00:47 +00:00
Daimona Eaytoy 6217ffb928 Remove unused stuff
Variables declared but never used, redundant code, and old leftovers.

Change-Id: Ic51044a45a1b49ad6c7af06c646b11893411a7cd
2019-01-18 17:04:19 +01:00
Daimona Eaytoy 93e8cb5ac5 Tune logging channel
As follow-up of I10b1fd2d9bdfe518089c053d77fef568170ecb65, use
'AbuseFilter' instead of 'AbuseFilterDeprecatedVars' as channel name.
Raise level for null-title filtering. Since with a null title
several things are likely to break, a warning is more appropriate here.
Tweaked the message as well, to include the bug number and to avoid
pointlessly including the title (which is null).
Lower the level for stashedit hit/miss (as it's really spammy and not
that useful right now).
Use 'abusefilter' instead of 'AbuseFilter' for statsd so that everything
has the same prefix.
Also raise the level for parser exceptions and unrecognized
consequences.

Change-Id: I1f9988155e924232b201281795cd322636da8082
2019-01-16 08:56:22 +00:00
Daimona Eaytoy f12fdb4a32 Add unit tests for custom disallow messages
Follow-up of Ic1de03a6944c43a346fa317ee0a217551f0d284a, adding some unit
tests for this newly introduced feature, plus a couple of tweaks for
both tests themselves and i18n.

Change-Id: I8df247f61d9f3769e9580544f324dd174811e939
2019-01-05 10:58:47 +00:00
Thiemo Kreuz 8ccb9839e5 Add test to guarantee tag uniqueness
This is a direct follow up for the bug fixed in Iebbdeac.

Change-Id: I5cc5618aa6161460534804e46a8a3568d1af9af3
2018-12-31 18:26:47 +01:00
daniel 688eccea47 Expose text from all slots to AbuseFilter
This is a first step towards MCR support in AbuseFilter. The textual
representation of all slots is concatenated. Since AbuseFilter uses
getTextForSearchIndex to determine the textual representation of
content, blind concatenation should not break any assumptions
made by AbuseFilter rules: this naive approach is no worse than
AbuseFilter's handling of non-textual content in general, and should
work fine for textual content.

Bug: T209291
Change-Id: Ic141085cad2e11bfe106fe83dafcb35ac31206ba
2018-12-05 09:24:08 -08:00
Daimona Eaytoy 206bdc1f6a Use the updated TitleMove hook to filter move actions
For several reasons:
* We're not really checking permissions (and the hook previously used is
  meant to be used in such case)
* We'll show a cleaner error message (i.e. without the "You do not have
  permission..." part)
* Filtering will happen closer to the actual move

Bug: T208907
Depends-On: I4733724075b7514e9db59e7be772d9409aa9da87
Change-Id: If88f736a446247f8b4b13c055c641d56f544d1ea
2018-12-04 18:58:04 +01:00
Amir Sarabadani fd3e3e78cb Migrate AbuseFilterConsequencesTest from tag_summary to change_tag
Bug: T209525
Change-Id: I6ab0b29800d7654164e8d23fb24b81529b0d2c88
2018-11-28 08:04:51 +01:00
Daimona Eaytoy 7427333ed5 Improve code readability
Simplify some logic constructs, reduce the amount of return statements
inside methods, explicitly declare variables before using them, reduce
code duplication, add names to JS anonymous function to produce clearer
stack traces.

Change-Id: Ife4546a91c30d4c519d09a712ba56a2f33abe579
2018-11-19 16:01:37 +01:00
Daimona Eaytoy e055ecc7c6 Reduce code duplication
Change-Id: I03bd56e4bf455865b27338ac39b3dcef20a88447
2018-11-19 15:50:36 +01:00
Daimona Eaytoy 4480c9493a Remove wgParser and wgRequest
As part of the deprecation process of non-config globals.

Change-Id: Ia84ddc20adbfda72347cf256601050b055b87ecf
2018-11-19 13:40:58 +01:00
jenkins-bot 213c2aa011 Merge "Change throttle selector to restore old functionality, overall improvement" 2018-11-15 00:58:11 +00:00
Daimona Eaytoy d3a8491c3f Change throttle selector to restore old functionality, overall improvement
Long (sigh) explanation in T203587#4569698. Also, simplified the way
TagMultiselect are generated, this one and the one for change tags.
This new selector is back-compat both with the old textarea and the OOUI
checkboxMultiselect; actually, this one is //fully// compatible with the
old textarea.
Add validation for throttle parameters and unit tests for validation
(split from I976c95658cddb2585910b6f8a5f047aadc4e4d47).
Added a trim when retrieving throttle identifier to allow syntax like
'ip, user'.
Improved the message shown on history.
Re-added the maintenance script to clean DB.

As I wrote in the task, a review by two other people would be great, at
least for the maintenance script (it could potentially break the DB).

Bug: T203587
Bug: T203336
Bug: T203584
Bug: T203585
Depends-On: I3b2e763bd8835207dc5df1db43d3e1881e6961c3
Change-Id: I7831dbb0bab55807392ac1f7915d6cb0cb713593
2018-11-14 12:51:36 +01:00
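The trimming mentioned in d3a8491c3f, which allows syntax like 'ip, user', amounts to stripping whitespace around each identifier. A minimal Python sketch follows; the function name is hypothetical and the actual validation of identifiers is not modeled.

```python
def parse_throttle_groups(raw):
    """Split a comma-separated group and trim whitespace around each part."""
    return [part.strip() for part in raw.split(",")]
```

Both 'ip,user' and 'ip, user' then yield the same two identifiers.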
Brad Jorsch f6349e7a32 Update tests that fail with comment/actor migration
* AbuseFilterConsequencesTest is somehow leaving blocks behind. Mark
  ipblocks as being used to avoid that.
* AFComputedVariable::getLastPageAuthors() uses indeterminate order for
  multiple revisions with the same timestamp. Fall back to rev_id
  ordering like MySQL accidentally did before.
* AbuseFilterTest tries to create revisions attributed to users that
  don't exist. Switch to interwiki usernames.

Change-Id: I30f7cdcc3875f3f7af116c1e41e88f62ab9e91d0
2018-11-09 17:03:36 -05:00
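The ordering fix in f6349e7a32 for revisions that share a timestamp can be sketched as a sort with a tie-breaker. The field names and the newest-first direction are assumptions made for this Python example, not details taken from the patch.

```python
def order_revisions(revisions):
    """Sort newest first, breaking timestamp ties by revision ID."""
    return sorted(
        revisions,
        key=lambda rev: (rev["timestamp"], rev["rev_id"]),
        reverse=True,
    )
```

With the revision ID as secondary key, the result no longer depends on the order in which equal-timestamp rows happen to be returned.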
jenkins-bot 108ec1117f Merge "Reload the test user instance before checking the edit count" 2018-10-23 18:53:33 +00:00
Aaron Schulz 3191c2adc4 Reload the test user instance before checking the edit count
These are updated in deferred updates and should not rely on the same
User instance being used in those updates. This also avoids convoluted
logic in User to set the new edit count for various cases.

Change-Id: I6d239a5ea286afb10d9e317b2ee1436de60f7e4f
2018-10-23 18:06:12 +00:00
jenkins-bot 7e151f5edc Merge "Unbreak short circuit for arrays" 2018-10-18 04:04:31 +00:00
jenkins-bot c2f5540928 Merge "Use fake timestamps for time-related tests" 2018-10-13 22:19:31 +00:00
Daimona Eaytoy cbd57fe7a1 Simplify test parameters
Instead of having lots of huge arrays, use a fixed one and only
overwrite the needed parameters.

Change-Id: I3b2e763bd8835207dc5df1db43d3e1881e6961c3
2018-10-12 16:40:44 +02:00
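The approach in cbd57fe7a1, one fixed set of parameters with only the needed ones overwritten, looks roughly like this in Python. The field names and default values are invented for the example.

```python
# Hypothetical baseline shared by all test cases.
DEFAULT_FILTER = {
    "description": "Test filter",
    "rules": "1 == 1",
    "enabled": True,
    "actions": [],
}


def make_filter(**overrides):
    """Return a new dict (shallow copy) with the given fields replaced."""
    return {**DEFAULT_FILTER, **overrides}
```

Each test then states only what differs from the baseline, for example make_filter(enabled=False).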
Daimona Eaytoy 2ad63c95ef Use fake timestamps for time-related tests
With the hope to finally unbreak such tests, making them much more
stable and clean.

Bug: T206501
Change-Id: I275a088b9b21f47892b4e3c4cd11ef8680a9e6d9
2018-10-10 20:26:52 +02:00
Daimona Eaytoy 70b60e5906 Simplify user_age test
This simplifies the test for user_age, although I'm not totally sure it
will be fixed. AFAICS, there's nothing wrong in there, but we'll see on
future phpunit executions.

Bug: T206501
Change-Id: Iee1a2a65d08c2cffc7a0d655be1eadb018d8bf37
2018-10-09 12:46:47 +02:00
Daimona Eaytoy 6d54b83f2c Simplify parser methods
Use a single function to check parameters amount, avoid duplication
between keywordIn and keywordContains, use if...elseif instead of
if-else when statements have a return inside, simplify some other logic,
add typehinting, and change method visibility according to use of such
methods.

Change-Id: I22225a5cbbb93679a0e78bf6e15866829167fbf4
2018-10-03 17:19:40 +02:00
Daimona Eaytoy e60dacbbea Fix code comments
Fixed some comments adding explanations, fixing syntax, and parameter types
for docblocks. Also fixed some whitespace mess, and added a missing use
statement.

Change-Id: I3547c90bdaa2cab5443e8bf0c63b217fe6ba663f
2018-10-03 16:45:03 +02:00
Daimona Eaytoy d9d5af3890 Unbreak short circuit for arrays
This problem has been making filters potentially fail silently since
2009. Also add tests for arrays to make sure that no problems arise
when short circuit is used.

Bug: T204841
Change-Id: Ie4e2e06498c1202ba73afcc5d164a72427abbca5
2018-10-03 16:44:10 +02:00
jenkins-bot 121df619da Merge "Improve coverage for AbuseFilterTokenizer" 2018-09-09 12:30:49 +00:00
Daimona Eaytoy bffba28713 Add full tests for deprecated variables
This test checks every deprecated variable to be identical to the
newly-named one, and to emit a debug notice. It also changes that debug
notice to be emitted via a logger instead of wfDebug.

Bug: T201193
Bug: T173889
Change-Id: Ie55746bb7731062ae2d46d84857af2a05d78cf4c
2018-08-29 11:00:28 +02:00
Daimona Eaytoy 775c736512 Improve coverage for AbuseFilterTokenizer
This will make tokenizer almost fully covered. The only uncovered parts
are the one with cache and an else condition which I think won't ever be
executed, and thus added a comment for that. Also, remove an obsolete
xxx comment from ComputedVariable (fixed in
I8e420f0259ef6c9e579f7a00beb58f28af9da37d)

Bug: T201193
Change-Id: I6e9a73aa9e437f096f6a1e20d53a7cb50e5ed85d
2018-08-25 10:25:16 +02:00
jenkins-bot 1b5428b9c8 Merge "Improve tests for the AbuseFilter class" 2018-08-23 14:41:00 +00:00
jenkins-bot 826e600731 Merge "Add _age variables to tests" 2018-08-23 14:35:33 +00:00
jenkins-bot 97f98b029d Merge "Improve parser coverage" 2018-08-23 14:33:54 +00:00
Daimona Eaytoy 078e9a3d21 Improve tests for the AbuseFilter class
Add some test cases for conds limit, profiling and other minor things.

Bug: T201193
Change-Id: I9a3035459cafd6537111cf1dea1a2d9a4bd34036
2018-08-23 14:14:57 +00:00
jenkins-bot ad69ea648e Merge "Remove unused function and improve unit test" 2018-08-23 13:46:41 +00:00
Daimona Eaytoy 0e2ae113fb Improve parser coverage
On the way to 100%...

Bug: T201193
Change-Id: I5fd311f861acccb31f346da9acb379b0366488e7
2018-08-23 12:13:47 +02:00
Daimona Eaytoy 03b52c2b37 Remove unused function and improve unit test
AbuseFilterParser::setVars is only used in a parser test. In the past it
was also used in the actual code (see for instance
https://phabricator.wikimedia.org/diffusion/EABF/browse/master/;5cc8dac63ca585c288ca4c8605db810774e39666?grep=setVars), but at the moment it's pretty useless.
This patch removes such function and makes the unit test use literals
instead of variables to avoid calling it.

Change-Id: I80cbc4033ff96f2fe8c1da263b1877bfb4c7c0c4
2018-08-23 11:00:16 +02:00
Daimona Eaytoy 90260edad0 Add _age variables to tests
Tests for new variables introduced in
I0993cecc322806382a1b567b60c0a4af69054841.

Change-Id: Iadaa33c20eb26d6e76ac02e3e9c0066b904833bc
2018-08-23 10:50:52 +02:00
Daimona Eaytoy 447d434e2a Improve code coverage
Add some parser tests, improve existing ones, and add missing @covers.

Bug: T201193
Change-Id: I9c0d2d83560baa4a3e1d4465b7919a48c4e26ac1
2018-08-22 19:07:14 +02:00
Daimona Eaytoy d35c42757c Add missing @covers tag
This should help with tracking code coverage and also explains some
coverage discrepancies encountered while writing other tests.

Bug: T201193
Change-Id: I8b20abc46c2d6c6f582953139b9a9f3710b2e4ea
2018-08-22 17:00:38 +02:00
jenkins-bot d94cc34649 Merge "Add deprecated variables to PHPUnit tests" 2018-08-22 12:44:23 +00:00
jenkins-bot a762c82fe7 Merge "Add aliases for "_text" and "article_" variables" 2018-08-22 12:44:20 +00:00
jenkins-bot 777a86314e Merge "Improve code coverage for AbuseFilterParser" 2018-08-22 11:15:00 +00:00
Daimona Eaytoy cd30d5146f Add deprecated variables to PHPUnit tests
Check a bunch of them, they should be computed and be identical to the
ones with new syntax.

Bug: T173889
Depends-On: I5c370b54e6516889624088e27928ad3a1f48a821
Change-Id: I276913a98e06b5f2ff1c5f5f3ba5bcc7b1e8c997
2018-08-22 08:38:31 +00:00
Daimona Eaytoy c962203ad2 Raise tolerance for time-related unit tests to 10 seconds
This helps avoid failures with tests depending on execution time.

Bug: T202073
Change-Id: I4da859cfb3e49314ca20329e2ad4a3a7c4fae897
2018-08-21 17:18:24 +02:00
Daimona Eaytoy 6bc630cfef Add aliases for "_text" and "article_" variables
Variables regarding title (full list in task description) are quite
deceiving, since they use "text" instead of "title". As proposed in the
task, this is the first patch to add aliases for those variables and
slightly deprecate the old ones. In the future we may be able to replace
every occurrence (either with a search function or directly on the
database), but even a coexistence would be enough to avoid
confusion. A wfDebug log is generated whenever a deprecated variable is
parsed. The "article_" prefix is also changed to "title_", in the same
way as above.
Also, added a hook which other extensions may use to specify their
deprecated variables, which will be handled the same as core ones.

Bug: T173889
Change-Id: I5c370b54e6516889624088e27928ad3a1f48a821
2018-08-21 16:59:56 +02:00
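The aliasing described in 6bc630cfef, where old names keep working, resolve to the new ones, and leave a debug log entry, can be modeled as a lookup table. The variable names below are placeholders and not the extension's real list, and the logger name is arbitrary.

```python
import logging

logger = logging.getLogger("abusefilter-sketch")

# Hypothetical old-name -> new-name pairs, for illustration only.
DEPRECATED_VARS = {
    "old_text": "new_title",
    "old_prefixedtext": "new_prefixedtitle",
}


def resolve_variable(name, values):
    """Look up a variable, redirecting deprecated names to their replacement."""
    if name in DEPRECATED_VARS:
        logger.debug("Deprecated variable %s used", name)
        name = DEPRECATED_VARS[name]
    return values[name]
```

Since both names resolve to the same stored value, old and new filters can coexist, which is the outcome the commit message aims for.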
Daimona Eaytoy 4f3b020f5d Improve code coverage for AbuseFilterParser
Add some tests and improve others to raise coverage percentage. This
should lead to almost 100% for the AbuseFilterParser class. Aside from
this, a couple of changes:
* Remove an unused function
* Let equals_to_any return a genuine result with empty strings
* Remove an if which will never be true in skipOverBraces, since the
function is called after checking the same conditions.

Bug: T201193
Change-Id: I7020b2ed996236c38c5784d161ad98ec44163406
2018-08-20 14:38:40 +02:00
Umherirrender c954b412c6 Include CheckUser in phan config
Depends-On: I51421184485c3117bbab9ce3dd42f2dbb6c6180c
Change-Id: Ida17580b301ff4a6b0d3d0020c48f65eb1e21026
2018-08-17 17:38:01 +02:00
Daimona Eaytoy bb476e2c45 Fix wrong error message for PHPUnit
We're currently emitting the same error twice, but in one of those cases
it's completely wrong. Damned copy&pasting!

Bug: T202073
Change-Id: I7687826a85f3ef0abaf15d7cd973afc4e55758b2
2018-08-16 17:11:41 +00:00
Daimona Eaytoy 0026a68a8a Add PHPUnit tests for various generic functions
Adding tests for generic functions in AbuseFilter class, ranging from
simple utility functions to variable computation.

Bug: T42478
Change-Id: I903fb7ffbc436b27462e3e4611ab65ecb8a543ba
2018-08-09 19:20:46 +00:00
jenkins-bot 4b185b3749 Merge "Add phpunit tests for noparams and notenoughargs exceptions" 2018-08-02 08:41:46 +00:00
jenkins-bot 75729b6195 Merge "Add other phpunit test for AFPUserVisibleException" 2018-08-02 08:07:56 +00:00
Daimona Eaytoy 9440828d13 Add phpunit tests for creating and editing filters
Adding the template for unit tests and some tests. These should cover
all the validation failure cases.

Bug: T42478
Depends-On: Ib7a0335fa7fb3b8a21765438a720205656c1ea09
Change-Id: I3fd0d627295d680ed33b1cbc730435df0446277f
2018-07-18 12:30:55 +02:00
Daimona Eaytoy 5c6007e041 Add tests for filter consequences
The last one of what I think are the must-have tests. This patch
provides the basic tests and the framework, which may be further
expanded later on. Please note that the failures are due to an actual
problem in core, for which there is I7bb0e92b2906a2511fc4290bdc76fc39ec4617fe.

Bug: T42478
Change-Id: I28eb464c63fda7faa3ec7d1f6082f36154d66962
2018-07-15 15:43:18 +00:00
Daimona Eaytoy 4f037c29c2 Add phpunit tests for noparams and notenoughargs exceptions
We're really missing exception tests: in fact, 'noparams' not being
thrown was discovered only a few days ago and worked like that for
years. This patch adds phpunit tests for both noparams and notenoughargs
exception, also checking the returned message.

Depends-On: I484fe2994292970276150d2e417801453339e540
Change-Id: Ia0b9b8fd5c979be06879723b746f9356c628f5cd
2018-07-15 17:35:45 +02:00
Daimona Eaytoy e9921bcda7 Add other phpunit test for AFPUserVisibleException
Follow-up of Iacb8f7a361079e3e117dc6845597c7bd8473e54a for exceptions
thrown outside the parser. With this patch all uses of AFPUserVisibleException
will be covered.

Depends-On: Iacb8f7a361079e3e117dc6845597c7bd8473e54a
Change-Id: Ia7ef6eb832d5725a804a60cb58bc110b06c8abe2
2018-07-01 18:34:01 +02:00
Daimona Eaytoy 7a64280893 Add phpunit tests for all exception thrown in the parser
All uses of "throw" inside AbuseFilterParser are now covered.
Bonus: added a standard suppresswarning when checking regex validity.

Change-Id: Iacb8f7a361079e3e117dc6845597c7bd8473e54a
2018-07-01 18:31:11 +02:00
Daimona Eaytoy c75bc35f7d Rename lists to arrays
Arrays were introduced with the name "lists". While it **may** look
user-friendlier and so on, it actually uses a wrong name: lists are
different from arrays. I ran a grep and I should've replaced
every occurrence, plus everything seems to work, however a double check
wouldn't be bad.

Change-Id: I6a858f02f5dd9250ba7e1abf9c6422fd98758c9e
2018-06-26 14:42:23 +02:00
Huji Lee 2792fce41e Introduce sanitize() function
Normalizes HTML entities into unicode characters

Bug: T169122
Change-Id: Ic916a6f8976e486d62d65156fa2dab56a55cf22a
2018-06-03 16:37:23 -04:00
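What "normalizes HTML entities into unicode characters" means for 2792fce41e can be shown with Python's standard html module. This is an analogue for illustration; the extension's sanitize() is written in PHP and may differ in the details.

```python
import html


def sanitize(text):
    """Replace named and numeric HTML entities with their characters."""
    return html.unescape(text)
```

A filter can then match on the decoded text, so writing a character as an entity no longer hides it from a comparison.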
Daimona Eaytoy caa4b1c763 Add phan configuration
This is taken from I6a57a28f22600aafb2e529587ecce6083e9f7da4 and makes
all the needed changes to make phan pass. Seccheck will instead fail,
but since it's not clear how to fix it (and it is non-voting), for the
moment we may merge this and enable phan on CI.

Bug: T192325
Change-Id: I77648b6f8e146114fd43bb0f4dfccdb36b7ac1ac
2018-04-30 08:32:58 +00:00
Daimona Eaytoy 9eea111d9f Sync parser tests with examples on mediawiki
I added on MW an example of comparison with empty array, which we should
keep inside the dedicated test as well.

Change-Id: Ifa4bca85c8978ef24ed5bb26787730bb4521261f
2018-04-26 18:47:51 +02:00
jenkins-bot 6aa6b8fc13 Merge "Add the remaining equality checks" 2018-04-26 13:25:56 +00:00
Daimona Eaytoy 71f375f19a Add equals_to_any function
Introduce a new function which can be used to group multiple comparisons
in a single condition. In particular, equals_to_any(S, A, B) is the
equivalent of S === A || S === B. This is especially useful in checking
for multiple namespaces, as proposed in the Community health initiative.

Change-Id: I9dcfe303eb5e51e1882fe4a65fa876aa93db7686
2018-04-25 23:12:19 +00:00
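The equivalence stated in 71f375f19a, equals_to_any(S, A, B) being S === A || S === B, can be modeled in Python with a type-strict comparison helper. This is a sketch of the semantics described in the commit message, not the extension's PHP code.

```python
def strict_equal(a, b):
    """Model of a type-strict comparison: same type and same value."""
    return type(a) is type(b) and a == b


def equals_to_any(subject, *candidates):
    """Return True if subject strictly equals at least one candidate."""
    return any(strict_equal(subject, c) for c in candidates)
```

For the namespace use case in the message, a check such as equals_to_any(namespace, 0, 2, 4) replaces three separate comparisons.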
Daimona Eaytoy 24c8d7d54e Add the remaining equality checks
I left as ToDo the checks between an array and something else. With this
patch, it'll work like PHP: the result will be true iff the comparison
is loose, the array is empty and the other operand is either false or
null.

Change-Id: Idc5cadb697ed4fc7f4856967274169f77495ed9f
2018-04-25 10:16:50 +02:00
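The rule stated in 24c8d7d54e for comparing an array with a non-array translates directly into a predicate. The Python below follows the wording of the commit message; the function name and the strict flag are assumptions.

```python
def array_equals_other(array, other, strict):
    """True only for a loose comparison of an empty array with False or None."""
    return (not strict) and len(array) == 0 and (other is False or other is None)
```

Every other combination, including an empty array against 0 or an empty string, yields False under this rule.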
Daimona Eaytoy 3c3a521fec Fix coding conventions exclusion rules
This should fix every error with excluded rules, leaving only the one
for $wgTitle. A double check would be nice in order to avoid regressions
due to stupid mistakes.

Bug: T178007
Change-Id: I22c179f3a01d652640304b59e43fcb5b5a9abac3
2018-04-20 08:40:18 +00:00
Daimona Eaytoy 8cfd527f31 Reinforce parser tests
Some of them are actually too simple, and may not be useful in tricky
situations. This patch adds a lot of test cases to provide (almost)
bombproof safety against regressions in future patches.

Depends-On: I0bb1ed0109af66997e238b532d342d82d4c4ae19
Change-Id: I274ef306775c36be20acb662353f6537ff3f1a33
2018-04-09 16:25:54 +02:00
Daimona Eaytoy 2dda2e381c Convert division/multiplication/modulo results after calculation
So that the type and value are identical to PHP's.

Bug: T191688
Depends-On: I1140900cdda63eed292d9f20aefd721ef9247fcd
Change-Id: I398c9a972b7e9fcb27d055d23939be2b8bb68244
2018-04-09 16:16:04 +02:00
Daimona Eaytoy 284ab234fd Allow comparing two lists
This feature was never implemented. I'm not sure whether we need a way to compare an array with other types of variables (left as ToDo), since e.g. in PHP it's always false.

Bug: T179238
Change-Id: I5d2c33fd117e69cbc84c0b04b6cb82edbdcadf16
2018-04-06 11:44:28 +00:00
Max Semenik a5b92a90c0 Fix license header
Change-Id: Ifb6b2d39fab9375e09c22e87ec818d74bd22fb28
2018-04-03 02:16:33 +00:00
Max Semenik 5c89246fce Rename files to match class name
Change-Id: Ia19bfec6c2289912699b6c90261afda311afb56e
2018-04-02 22:08:13 -04:00
libraryupgrader df05002739 build: Updating mediawiki/mediawiki-codesniffer to 17.0.0
The following sniffs are failing and were disabled:
* MediaWiki.Commenting.LicenseComment.InvalidLicenseTag

The following sniffs now pass and were enabled:
* MediaWiki.Commenting.FunctionComment.MissingParamComment

Change-Id: I38c334ea6c6ff07dfcb64d551413a02dc8c5e51e
2018-03-28 23:38:50 +00:00
Umherirrender e01a06df7d Move @group from file comment to class comment
Phpunit is only looking at class comment for annotations

Change-Id: Ic98f5d995051c5fc2a41c3c31b2fdbd39af028b1
2018-03-16 22:00:56 +00:00
Daimona Eaytoy a0de056299 Add contains_all and ccnorm_contains_all functions
Added the contains_all function, with basically the same role as
contains_any but using logical AND instead of OR. Also added
ccnorm_contains_all, which is the same as ccnorm_contains_any but in
AND mode. Finally, fixed three wrong task IDs.
Co-authored with Valerio Bozzolan.

Bug: T21176
Change-Id: Ib0a8b783db6ce0d5db64771c8e0c70f0f8d13d36
2018-02-09 17:33:24 +01:00
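
The difference between the OR and AND modes described in the commit above can be sketched as follows. This is a Python illustration under the assumption that both functions take a haystack followed by any number of needle substrings; the ccnorm_ variants, which normalize the text first, are not modeled here.

```python
def contains_any(haystack, *needles):
    """True if at least one needle occurs in haystack (logical OR)."""
    return any(needle in haystack for needle in needles)


def contains_all(haystack, *needles):
    """True only if every needle occurs in haystack (logical AND)."""
    return all(needle in haystack for needle in needles)


text = "buy cheap pills"
print(contains_any(text, "cheap", "free"))  # one needle is enough
print(contains_all(text, "cheap", "free"))  # "free" is missing
```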
Kunal Mehta 5238c8e8b5 Improve @covers tags
Change-Id: I3df3698b5d3f3eae95db8c740c611f365ff9cb31
2018-01-23 14:08:52 -08:00
Daimona Eaytoy 4e20c933f4 Add get_matches function
Added the get_matches function to store a regex match.

Bug: T179957
Change-Id: I19366ebcaa4d0f007dd675a61c91457dde57f604
2017-11-13 17:32:45 +01:00
David Barratt 5335b6c811 Use Equivset library instead of AntiSpoof
Use the new equivset library instead of AntiSpoof.

Bug: T175413
Change-Id: I439387deeba99543e194c210953ac73ff98bc5b7
Depends-On: I977d3498b2084a426e2ab4d85c000d1b9dcfe824
2017-10-21 21:55:18 -07:00
Dayllan Maza 2bc8873c30 Add ccnorm_contains_any function
Normalize and search a string for multiple substrings

Bug: T65242
Change-Id: I4034c0054a6849babbf2d96ea13dc97d3660d5b4
2017-10-06 11:32:45 -04:00
Umherirrender 1a58507870 build: Updating mediawiki/mediawiki-codesniffer to 0.10.0
Change-Id: I5f37c45d748d5f0da21aceaef32cc89367e312ff
2017-07-08 20:49:30 +02:00
Umherirrender a063e33ee8 Use short array syntax
Done by phpcbf over composer fix

Change-Id: I53fd1fc8d056b9b60194d2d630852cfca37aadea
2017-06-15 17:02:57 +02:00
Victor Vasiliev 46faa02c49 Fix the associativity of boolean logic operators
Change-Id: Icaf0fde0d74064532af4b110faef4014f8303f80
2016-11-06 20:30:07 -05:00
Victor Vasiliev aa399da279 Implement a tree-caching abuse filter parser
This filter is fully functional.  The old filter is still enabled by
default for a transitional period in case the new one suddenly has
issues.

Change-Id: I4aea5f00c62420108030e60e79d5bf34e913e95d
2016-09-24 02:53:26 +00:00
Victor Vasiliev 5da98b67bf Add test coverage for more bizarre features of the filter parser
I am pretty sure all of the behavior documented in these tests is a bad
idea.  It is possible that we can fix it since some of those features
are probably unused, but for now those tests will serve as
documentation of the current behavior.

Change-Id: Ia2a2f57a538d7aef2ac73fb2e47fe82dd5d5e09a
2016-08-21 18:45:22 -04:00
Kaldari acd28cb00f Update tests for AntiSpoof fixes
Bug: T29987
Depends-On: Iccb3e50073bbbc2b979cb62dd0e129afd1c2e55f
Change-Id: I8bef839b9b9ca5fced94ce6428e769133ede868f
2016-08-13 20:37:43 +00:00
Bartosz Dziewoński 5fc30112c7 Optimize 'count()' function
substr_count() is just as fast as looped strpos() when there are no
matches, and gets faster as the number of matches increases.

Note that this introduces a small change in behavior when the needle
is composed of repeated substrings, e.g. 'asdasdasd' or 'aa', and
haystack is such that the needle can be matched in overlapping
positions, e.g. 'asdasdasdasd' or 'aaaaa'. The old implementation
counted overlapping matches, the new one doesn't. I don't think this
behavior was intentional and I don't think this change will cause any
real problems.

Change-Id: Icc905ca34bf08d63e969787a5e3c119d498bf878
2016-04-17 08:32:27 +02:00
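
The behavior change noted in the commit above, overlapping versus non-overlapping match counting, can be reproduced with a short sketch. This is Python rather than the extension's PHP: the loop over str.find stands in for the looped strpos() approach, and str.count stands in for substr_count(), which likewise counts non-overlapping matches. The function names are illustrative only.

```python
def count_overlapping(needle, haystack):
    """Count matches by restarting the search one character after each hit."""
    count = 0
    pos = haystack.find(needle)
    while pos != -1:
        count += 1
        pos = haystack.find(needle, pos + 1)
    return count


def count_non_overlapping(needle, haystack):
    """Count matches, skipping past each whole match before searching again."""
    return haystack.count(needle)


for needle, haystack in [("aa", "aaaaa"), ("asdasdasd", "asdasdasdasd")]:
    print(needle, count_overlapping(needle, haystack),
          count_non_overlapping(needle, haystack))
```

For the two examples from the commit message, the overlapping count is 4 and 2 respectively, while the non-overlapping count is 2 and 1.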
Bartosz Dziewoński 7d83540527 Add some tests for behavior of 'count()' function
Change-Id: I29a6c91d0780dc9a1eaee6d29d3b1f9c9c708df7
2016-04-17 08:18:29 +02:00
Bartosz Dziewoński e79b45b71f Improve ignoring short-circuited operations
Previously, 'false & a == b' would actually execute the comparison and
count it against the condition limit, while 'false & (a == b)' wouldn't.
They behave the same now.

mShortCircuit was only checked for the most potentially expensive
operations (computing functions and getting variables), all the other
operations on bogus values generated by this would be executed and the
results ignored later.

This probably doesn't noticeably improve performance, but it corrects
how the condition limit is counted.

Bug: T43693
Change-Id: Id1d5f577b14b6ae6d987ded12689788eb7922474
2016-04-09 16:25:52 +02:00
Bartosz Dziewoński 3b32cf00e9 Improve how the number of conditions is counted
With the new behavior, the number of conditions is incremented when:
* Evaluating a function
* Evaluating a comparison operator (== === != !== < > <= >= =)
* Evaluating a keyword (in like matches contains rlike irlike regex)

Previously, the number of conditions was incremented when:
* Evaluating a function
* Entering the comparison operator evaluation mode

This resulted in a number of surprising behaviors. In particular:
* '(((a == b)))' counted as 4 conditions, not 1
* 'contains_any(a, b, c)' counted as 5 conditions, not 1
* 'a == b == c' counted as 1 condition, not 2
* 'a in b + c in d + e in f' counted as 1 condition, not 3
* 'true' counted as 1 condition, not 0

It is still possible to easily cheat the count by rewriting comparisons
as arithmetic operations. I believe this is meant to advise users of
the complexity of their rules and not really enforce strict limits.

Bug: T132190
Change-Id: I897769db4c2ceac802e3ae5d6fa8e9c9926ef246
2016-04-09 16:16:27 +02:00
Ori Livneh bab9832415 Move rule tokenization to new AbuseFilterTokenizer class
* Move AbuseFilterParser::nextToken() and the various AbuseFilterParser
  properties that accompanied it to a new class, AbuseFilterTokenizer.
* Tokenize rules eagerly and cache the result in APC.

Change-Id: I15f5b5b65e8c4ec4fba3000d7c9fd78b98967d1d
2015-08-25 14:00:10 -07:00
Ori Livneh b388dfab1b Clean-up of AbuseFilterParser::nextToken()
No functional changes.

* Don't include $code as part of the return value; it is ignored anyway.
* Removed AbuseFilterParser::lastHandledToken and AFPParserState::lastInput,
  because AbuseFilterParser::nextToken() no longer calls itself recursively.
* The regular expression that matches operators is no longer constructed
  dynamically, but hard-coded into the class. To make sure it does not drift
  apart from the more legible AbuseFilterParser::$mOps, add a unit test that
  constructs the regex dynamically as before and compares it to
  AbuseFilterParser::OPERATOR_RE.
* AbuseFilterParser::RADIX_RE ditto.

Change-Id: I9c23b60759ed2f4c73a9b480243b16bbce5a208f
2015-08-25 10:50:31 -07:00
Ori Livneh 0e36b728e3 Fix double escaping in AFPData::keywordLike()
If we don't map '\-' and '\+' to themselves, the leading slash gets escaped,
and the resultant pattern only matches a literal slash.

Bug: 67670
Change-Id: Ifa1e3edd6f41985a3bb97bfb1497985f8fa64af5
2014-07-11 14:56:42 -07:00
Marius Hoch 35747761fb Allow running the AbuseFilter parser tests via phpunit
I've also added myself to the credits file as I'm the only
maintainer of this extension for a while now.

Change-Id: Id998172ea2abd70b8243de9db1a96cc2cfa47a64
2013-07-08 19:22:43 +02:00
jenkins-bot 3c83358506 Merge "Add parser tests for bug 25373" 2013-05-01 21:25:11 +00:00
Kunal Mehta 4bec58cd54 Add a "ucase" function to convert the provided string to uppercase.
I basically took the lcase code and tweaked it to work for uppercase.

Bug: 47321
Change-Id: I230dbd99c27bf3a4a042befd6d334b4c0439bde0
2013-04-17 11:48:15 -05:00
Marius Hoch 3010d78950 Add parser tests for bug 25373
Change-Id: I2f2524731098f323e61bbc0442e7b56b11cdea37
2013-03-23 21:49:57 +01:00
Marius Hoch 03da29b9da Fix the abusefilter array parser test
The abusefilter array test failed because length( ['a', 'b', 'c'] )
returned 12 instead of 6. That was because it first converted the array
to a string of newline-separated values and then measured the string
length. Changed that behaviour to act like the PHP count() function or
the Python len() function, which seems far more useful to me.
The old behaviour can be obtained using length( string( array ) ).

Change-Id: I16646891837c9743ca5af2dd328077a7225bb5f1
2012-12-20 02:19:55 +01:00
Alexandre Emsenhuber 56e6f0a262 svn:eol-style native 2009-04-09 20:45:31 +00:00
Victor Vasiliev 27fb1303a8 * Use lists instead of implode()d strings in built-in variables wherever it's possible
ATTENTION! This may break filters that rely on "added_lines contains 'bla-bla'" syntax. They'll need to be replaced with "string(added_lines) contains 'bla-bla'"
2009-04-05 19:07:47 +00:00
Victor Vasiliev 128ae5983b Introduce list (non-associative array) support into abuse filter parser. 2009-04-05 17:11:17 +00:00
Victor Vasiliev 258d340fb5 Abuse filter:
* Introduce := operator for setting variables
* Throw an exception when user tries to override built-in variable
* Fix UTF-8 handling in fnmatch() fallback
* Copy three main abuse filters from enwiki to test suite
* Fix update.php integration
2009-04-05 11:47:42 +00:00
Andrew Garrett 86e4081206 Abuse Filter Parser:
* Efficiency -- use /A instead of PREG_OFFSET_CAPTURE and comparing offsets.
* Expand error messages to enhance debugging.
* General code quality
2009-03-25 11:36:38 +00:00
Andrew Garrett 0880f444b1 Abuse Filter Parser updates:
* Use strcspn to scan ahead for long regions of uninteresting text in string handling (performance).
* Remove cruft specific to my system in phpTest.php.
* Remove a test that was in incorrect syntax, and useless without adding variable support.
2009-02-11 18:23:21 +00:00
Andrew Garrett bfe57be65d Rewrite of Abuse Filter parser tokeniser.
I've made it more performant and fixed a few bugs by using regexes
instead of PHP loops, where possible, under the assumption that the
PCRE parser is more efficient than the same thing implemented in pure PHP.
Also, I'm now passing the same string around and calculating offsets, which
Tim tells me is far more performant than continually truncating the same string.

All tests still pass, with the exception of string.t, which I've modified
to remove the offending code, which never worked.
2009-02-11 01:41:51 +00:00
Andrew Garrett 53179c675f Apply changes from change-tagging branch. I will remove all of the stuff actually related to change tagging in a moment, to avoid trunk changes on Wikimedia sites. 2009-01-23 19:23:19 +00:00