Commit graph

23 commits

Author SHA1 Message Date
Daimona Eaytoy 775c736512 Improve coverage for AbuseFilterTokenizer
This will make tokenizer almost fully covered. The only uncovered parts
are the one with cache and an else condition which I think won't ever be
executed, and thus added a comment for that. Also, remove an obsolete
xxx comment from ComputedVariable (fixed in
I8e420f0259ef6c9e579f7a00beb58f28af9da37d)

Bug: T201193
Change-Id: I6e9a73aa9e437f096f6a1e20d53a7cb50e5ed85d
2018-08-25 10:25:16 +02:00
Daimona Eaytoy 447d434e2a Improve code coverage
Add some parser tests, improve existing ones, and add missing @covers.

Bug: T201193
Change-Id: I9c0d2d83560baa4a3e1d4465b7919a48c4e26ac1
2018-08-22 19:07:14 +02:00
Daimona Eaytoy 4f3b020f5d Improve code coverage for AbuseFilterParser
Add some tests and improve others to raise coverage percentage. This
should lead to almost 100% for the AbuseFilterParser class. Aside from
this, a couple of changes:
* Remove an unused function
* Let equals_to_any return a genuine result with empty strings
* Remove an if which will never be true in skipOverBraces, since the
function is called after checking the same conditions.

Bug: T201193
Change-Id: I7020b2ed996236c38c5784d161ad98ec44163406
2018-08-20 14:38:40 +02:00
Daimona Eaytoy c75bc35f7d Rename lists to arrays
Arrays were introduced with the name "lists". While it **may** look
user-friendlier and so on, it actually uses a wrong name: lists are
different from arrays. I ran a grep and I should've replaced
every occurrence, plus everything seems to work, however a double check
wouldn't be bad.

Change-Id: I6a858f02f5dd9250ba7e1abf9c6422fd98758c9e
2018-06-26 14:42:23 +02:00
Huji Lee 2792fce41e Introduce sanitize() function
Normalizes HTML entities into unicode characters

Bug: T169122
Change-Id: Ic916a6f8976e486d62d65156fa2dab56a55cf22a
2018-06-03 16:37:23 -04:00
Daimona Eaytoy 9eea111d9f Sync parser tests with examples on mediawiki
I added on MW an example of comparison with empty array, which we should
keep inside the dedicated test as well.

Change-Id: Ifa4bca85c8978ef24ed5bb26787730bb4521261f
2018-04-26 18:47:51 +02:00
jenkins-bot 6aa6b8fc13 Merge "Add the remaining equality checks" 2018-04-26 13:25:56 +00:00
Daimona Eaytoy 71f375f19a Add equals_to_any function
Introduce a new function which can be used to group multiple comparisons
in a single condition. In particular, equals_to_any(S, A, B) is the
equivalent of S === A || S === B. This is especially useful in checking
for multiple namespaces, as proposed in the Community health initiative.

Change-Id: I9dcfe303eb5e51e1882fe4a65fa876aa93db7686
2018-04-25 23:12:19 +00:00
Daimona Eaytoy 24c8d7d54e Add the remaining equality checks
I left as ToDo the checks between an array and something else. With this
patch, it'll work like PHP: the result will be true iff the comparison
is loose, the array is empty and the other operand is either false or
null.

Change-Id: Idc5cadb697ed4fc7f4856967274169f77495ed9f
2018-04-25 10:16:50 +02:00
Daimona Eaytoy 8cfd527f31 Reinforce parser tests
Some of them are actually too simple, and may be unuseful in tricky
situations. This patch adds a lot of test cases to provide an (almost)
bombproof safety with future patches.

Depends-On: I0bb1ed0109af66997e238b532d342d82d4c4ae19
Change-Id: I274ef306775c36be20acb662353f6537ff3f1a33
2018-04-09 16:25:54 +02:00
Daimona Eaytoy 2dda2e381c Convert division/multiplication/modulo results after calculation
So that type and value will be identical to PHP's ones.

Bug: T191688
Depends-On: I1140900cdda63eed292d9f20aefd721ef9247fcd
Change-Id: I398c9a972b7e9fcb27d055d23939be2b8bb68244
2018-04-09 16:16:04 +02:00
Daimona Eaytoy 284ab234fd Allow comparing two lists
This feature was never implemented. I'm not sure whether we need a way to compare array and other types of variables (left as ToDo), since e.g. in PHP it's always false.

Bug: T179238
Change-Id: I5d2c33fd117e69cbc84c0b04b6cb82edbdcadf16
2018-04-06 11:44:28 +00:00
Daimona Eaytoy a0de056299 Add contains_all and ccnorm_contains_all functions
Added the contains_all function, with basically the same role as
contains_any but using logic AND instead of OR. Also added
ccnorm_contains_all, that is the same of ccnorm_contains_any but with
AND mode. Finally, fixed three wrong task IDs.
Co-authored with Valerio Bozzolan.

Bug: T21176
Change-Id: Ib0a8b783db6ce0d5db64771c8e0c70f0f8d13d36
2018-02-09 17:33:24 +01:00
Dayllan Maza 2bc8873c30 Add ccnorm_contains_any function
Normalize and search a string for multiple substrings

Bug: T65242
Change-Id: I4034c0054a6849babbf2d96ea13dc97d3660d5b4
2017-10-06 11:32:45 -04:00
Victor Vasiliev 46faa02c49 Fix the associativity of boolean logic operators
Change-Id: Icaf0fde0d74064532af4b110faef4014f8303f80
2016-11-06 20:30:07 -05:00
Victor Vasiliev aa399da279 Implement a tree-caching abuse filter parser
This filter is fully functional.  The old filter is still enabled by
default for a transitional period in case the new one suddenly has
issues.

Change-Id: I4aea5f00c62420108030e60e79d5bf34e913e95d
2016-09-24 02:53:26 +00:00
Victor Vasiliev 5da98b67bf Add test coverage for more bizzare features of the filter parser
I am pretty sure all of the behavior documented in these tests is a bad
idea.  It is possible that we can fix it since some of those features
are probably unused, but for now those tests will serve as a
documentation of the current behavior.

Change-Id: Ia2a2f57a538d7aef2ac73fb2e47fe82dd5d5e09a
2016-08-21 18:45:22 -04:00
Kaldari acd28cb00f Update tests for AntiSpoof fixes
Bug: T29987
Depends-On: Iccb3e50073bbbc2b979cb62dd0e129afd1c2e55f
Change-Id: I8bef839b9b9ca5fced94ce6428e769133ede868f
2016-08-13 20:37:43 +00:00
Bartosz Dziewoński 5fc30112c7 Optimize 'count()' function
substr_count() is just as fast as looped strpos() when there are no
matches, and gets faster as the number of matches increases.

Note that this introduces a small change in behavior when the needle
is composed of repeated substrings, e.g. 'asdasdasd' or 'aa', and
haystack is such that the needle can be matched in overlapping
positions, e.g. 'asdasdasdasd' or 'aaaaa'. The old implementation
counted overlapping matches, the new one doesn't. I don't think this
behavior was intentional and I don't think this change will cause any
real problems.

Change-Id: Icc905ca34bf08d63e969787a5e3c119d498bf878
2016-04-17 08:32:27 +02:00
Bartosz Dziewoński 7d83540527 Add some tests for behavior of 'count()' function
Change-Id: I29a6c91d0780dc9a1eaee6d29d3b1f9c9c708df7
2016-04-17 08:18:29 +02:00
Bartosz Dziewoński e79b45b71f Improve ignoring short-circuited operations
Previously, 'false & a == b' would actually execute the comparison and
count it against the condition limit, while 'false & (a == b)' wouldn't.
They behave the same now.

mShortCircuit was only checked for the most potentially expensive
operations (computing functions and getting variables), all the other
operations on bogus values generated by this would be executed and the
results ignored later.

This probably doesn't noticeably improve performance, but it corrects
how the condition limit is counted.

Bug: T43693
Change-Id: Id1d5f577b14b6ae6d987ded12689788eb7922474
2016-04-09 16:25:52 +02:00
Ori Livneh 0e36b728e3 Fix double escaping in AFPData::keywordLike()
If we don't map '\-' and '\+' to themselves, the leading slash gets escaped,
and the resultant pattern only matches a literal slash.

Bug: 67670
Change-Id: Ifa1e3edd6f41985a3bb97bfb1497985f8fa64af5
2014-07-11 14:56:42 -07:00
Marius Hoch 35747761fb Allow running the AbuseFilter parser tests via phpunit
I've also added myself to the credits file as I'm the only
maintainer of this extension for a while now.

Change-Id: Id998172ea2abd70b8243de9db1a96cc2cfa47a64
2013-07-08 19:22:43 +02:00