Commit graph

64 commits

Author SHA1 Message Date
thiemowmde 24888bea15 Mark protected stuff in classes with no subclasses as private
Protected effectively means "public to subclasses" and should be
avoided for the same reasons as marking everything as public should
be avoided.

Change-Id: Iba674b486ce53fd1f94f70163d47824e969abb77
2023-06-23 12:28:06 +02:00
Timo Tijhof 203d54be11 BlockedExternalDomains: Optimize host extraction by using parse_url
Unlike what the 20-year old source comments in UrlUtils.php would
have you believe, parse_url() works fine nowadays, including for
protocol-relative URLs and indeed lots of prod code uses it directly.

The class still has some convenience value for case where you need to
expand or manipulate URLs, but for the common case of extracting a part
of it, you really don't need it.

Test plan:
$ php phpunit.php ../../extensions/AbuseFilter/tests/phpunit/integration/FilteredActionsHandlerTest.php

Bug: T337431
Change-Id: I1e76d2f5aef65365743214530faba656325b965a
2023-06-19 13:36:27 +00:00
Amir Sarabadani 8b67de5bc1 blocked domains: Make sure users can't bypass the list by using uppercase
Added tests too

Bug: T337431
Change-Id: Ie3406d0b3c7d82ba44c11865e493375453555664
2023-06-16 01:22:48 +02:00
jenkins-bot 596a36866b Merge "Add missing AbuseFilterServices::getHookRunner()" 2023-06-15 18:06:28 +00:00
thiemowmde 7e6132d4d7 Remove bits of unused code across the codebase
Mostly found with the code inspection tools in PHPStorm.

Change-Id: I7f59dddca0aaab0ddd1093d52c07ec12efd20d6d
2023-06-14 19:41:00 +00:00
Lucas Werkmeister 9bb4b1e5db Add missing AbuseFilterServices::getHookRunner()
And register AbuseFilterRunnerFactory as a service name that’s allowed
to not have a getRunnerFactory() method without the test complaining
(the service was renamed, getFilterRunnerFactory() exists).

Change-Id: Idedb87e64a6df02b0edae8d9e7dbf441752dc480
Needed-By: If5af88e7f70b83d53f66b9617a5ef37daf81830f
2023-06-14 17:35:43 +02:00
Matěj Suchánek 8fb53edfbb Retrieve external links from PreparedUpdate
When forFilter is true and PreparedUpdate is available
(most save operations), retrieve all_links from
PreparedUpdate::getParserOutputForMetaData. Otherwise
do what was done before.

Note that this change probably leaves some dead code. It will be dealt
with later.

NOTE: this changes code potentially executed on every save operation.

Bug: T65632
Bug: T264104
Change-Id: I3628a56e5277846c1b90444fb55983870eb54c1e
2023-06-13 14:30:06 +02:00
Amir Sarabadani 60cbc3b464 BlockedDomains: Use cleaner array building and add tests
Regarding array building: Instead of adding to array with
$array[] = 'foo' and then doing array_flip(), simply do
$array['foo'] = true;

Regarding tests: I originally wanted to create a unit test but I ended
up mocking so many things that it wasn't worth it and the config variable
is globaly which first we need to clean up after deployment is done.

Bug: T337431
Change-Id: Iac8dca7078668ee3441d19b6aafe499c1aa0d732
2023-06-12 17:46:55 +00:00
Amir Sarabadani 0acfe05251 Add abusefilter-bypass-blocked-external-domains right
This is similar to sboverride right in SpamBlacklist. Defaults are also
the same

Bug: T337431
Change-Id: Iaff91c1f9f7aece0787348dd071701ef99e0291d
2023-06-08 22:06:19 +02:00
Amir Sarabadani 53eb27f086 Introduce Special:BlockedExternalDomains
It is behind a feature flag. Improvements on it can happen in follow
ups. The patch is already quite massive.

Bug: T337431
Bug: T279275
Change-Id: I3df949c4d41ce65bb4afa013da9c691ac05fc760
2023-05-30 20:48:42 +02:00
Umherirrender faaa5126eb tests: Make some PHPUnit data providers static
Initally used a new sniff with autofix (T333745)

Bug: T332865
Change-Id: I892127a7cf794c52b1106d0239d273476a6113c3
2023-05-20 21:44:55 +02:00
Bartosz Dziewoński 0364194d72 API tests: Assert error codes, not error messages
Depends-On: I752f82f29bf5f9405ea117ebf9e5cf70335464ad
Needed-By: Ie17987991d1e9a0d77da97e3a81fe0a21c6d7866
Change-Id: I06c89534be605557ee9b0d90d2748f806fa2ae9e
2023-04-26 13:21:53 +02:00
Matěj Suchánek 0628dbdab6 Add tests for extension.json and services
Change-Id: Ie83e4a85a408e1ba1d2cc827c4bf353bdd5500df
2023-03-28 09:35:02 +02:00
Matěj Suchánek bb78cb0a56 Use actor table in AbuseFilter
This patch migrates abuse_filter and abuse_filter_history tables
to new actor schema.

MigrateActorsAF was copy-pasted from core's
maintenance/includes/MigrateActors.php before removal (ba3155214).

Bug: T188180
Change-Id: Ic755526d5f989c4a66b1d37527cda235f61cb437
2023-03-22 14:01:29 +01:00
Matěj Suchánek 86ac5bdb40 Clean up database access in non-deployed code
Change-Id: Ibcc41c2dd7f60a806199eaa2c47628a28dadd143
2023-03-03 18:55:08 +01:00
Matěj Suchánek 702d77e3ce Create real integration test for variables
For fixing bugs like T65632, T105325, or T264104, we will need
to update code in more than one place at once. To prevent
regressions, create an integration test which tests the whole
pipeline, from the request submission to variable evaluation.
Edits are simulated using action=edit API call because the hook
AbuseFilter uses is run from EditPage.

To increase confidence in test coverage, remove some annotations
from AbuseFilterConsequencesTest or make them less greedy.
Ideally, it would only test consequences.

This patch includes refactoring of AbuseFilterCreateAccountTestTrait
which now only inserts the user into the database if it really
should be created.
It also restores test coverage of some other classes.

Change-Id: I661f4e0e2bcac4770e499708fca4e4e153f31fed
2022-11-26 18:51:38 +01:00
Reedy 4f4f01f96d EchoNotifierTest: Use namespaced Event class
Re-enables test

Depends-On: Ib57ea2db947285946f31fa9912b37181044df9d3
Change-Id: I082868f4759a5da14235803ebd8a80e794cfe41c
2022-11-12 06:28:33 +00:00
Reedy 97e0f30155 EchoNotifierTest: Temporarily skip testNotifyForFilter
Depends-On: Iddb4a5d4057f9c6ed00f754d2e3cd79cd873f212
Change-Id: Id28792658de950b99a8786f881563476def59eba
2022-11-03 00:28:15 +00:00
Umherirrender da7683bcbc tests: Improve tests for postgres
Change-Id: I9720b6c7d096ae8415c00eb0ac1ddc461ea0a8dc
2022-07-09 21:40:27 +00:00
Matěj Suchánek e7492a230f Replace unnecessary use of User
In action=abusefilterunblockautopromote, leave UserIdentity
instantiation to the parent. Note that this changes the "code"
in the response from "baduser_user" to "baduser".

Change-Id: I97d2bf3fa3c5486e461823f840cad2763e1bcfea
2022-07-02 23:58:08 +00:00
DannyS712 139ca18efe Migrate AbuseFilterPermissionManager to authority
Almost all callers already provide an Authority in the form
of a User object, so mostly just need to change the typehints

Depends-On: I58661943c7e1acb6ff09798ee1a30be0fde3f459
Change-Id: I2ad86859c8194c14d7331f58db62b7cff4698085
2022-07-01 06:58:17 +00:00
Umherirrender 32a97e8d15 tests: MWTimestamp::setFakeTime is reset by core
It is in MediaWikiTestCaseTrait since 438b392

Change-Id: Ib89406fdbad0c9fecada50c8f1ee45e27d17c522
2022-06-28 20:48:31 +00:00
Matěj Suchánek 7ae2060b27 Avoid array to object cast in filterToDatabaseRow
Both callers immediately call get_object_vars
to cast it back to array. Avoid this roundtrip.

Change-Id: I6525d76f8a03a4d28c2b50b580c539affe98064f
2022-06-28 18:46:28 +00:00
Thiemo Kreuz bbded6231c Inline/simplify smaller pieces of duplicate/complex PHP code
Change-Id: I59d0f17b77c8c3d47bc532bdefd9d8c0883f180b
2022-06-03 21:04:38 +02:00
Daimona Eaytoy 8ee9a21750 Clean up test files
Convert a few integration tests to unit tests now that it's possible,
split the AbuseFilterSaveTest file into three different classes.

Change-Id: Ia2c0d7ab878b20a89324336a532abdc44f1e6b74
2022-03-20 17:40:49 +00:00
Daimona Eaytoy b5c22f2b77 Improve wording for throttled filter warnings
List which actions were disabled, or explicitly say that no actions were
disabled if that's the case. Also avoid the word "throttle" in messages
as it may be hard to translate. Also don't suggest optimizations to the
filter conditions -- unoptimized rules have nothing to do with a filter
being throttled.

Bug: T200036
Change-Id: Id989fb185453d068b7685241ee49189a2df67b5f
2022-02-22 11:10:19 +00:00
jenkins-bot a332b3ff0f Merge "Remove afl_filter entirely" 2021-09-25 01:39:08 +00:00
Daimona Eaytoy e8471a717c Add method to properly check visibility of AbuseLog entries
This replaces the previous pattern of callers having to use
RevisionLookup if the result was 'implicit'. Also, in some cases where
we were just hiding things if the visibility was !== true, properly
handle the implicit case by using the new method. Make the new method
return string constants rather than bool|string.

The new method also fixes some potential info leaks which happened when
the row was hidden, the user could view suppressed AbuseLog entries, but
the associated revision was also deleted and the user couldn't see it
(this shouldn't be relevant for WMF wikis since AF deletion is
oversight-level).

Also add a bunch of tests for the various cases to ensure we don't
regress again.

Bug: T261532
Change-Id: I929f865acf5d207b739cb3af043f70cb59243ee0
2021-09-25 00:08:33 +00:00
Daimona Eaytoy dae374aec2 Remove afl_filter entirely
As per T220791, the old schema and the flag can be removed in 1.38.

Bug: T220791
Change-Id: Ic6b1c8a22d17a301faf32d2e23778d90c41c39de
2021-09-18 11:06:10 +00:00
Daimona Eaytoy b2dc2c4dd8 Refactor ParserStatus
ParserStatus is now more lightweight, and doesn't know about "result"
and "from cache". Instead, it has an isValid() method which is merely a
shorthand for checking whether getException() is null.

Introduce a child class, RuleCheckerStatus, which knows about result and
cache and can be (un)serialized.

This removes the ambiguity of the $result field, and helps the
transition to a new RuleChecker class.

Change-Id: I0dac7ab4febbfdabe72596631db630411d967ab5
2021-09-17 11:25:54 +00:00
Daimona Eaytoy 7c26c4b8d5 More cleanup for parser-related classes
Change-Id: I6a2bbf519e1d5c6fe2778f69624bd80b9ea1ef86
2021-09-10 12:50:20 +00:00
Daimona Eaytoy a722dfe1a4 Rename ParserFactory -> RuleCheckerFactory
The old parser now has the correct name "Evaluator", so the
ParserFactory name was outdated. Additionally, the plan is to create a
new RuleChecker class, acting as a facade for the different
parsing-related stages (lexer, parser, evaluator, etc.), which is what
most if not all callers should use. The RuleCheckerFactory still returns
a FilterEvaluator for now.
Also, "Parser" is a specific term defining *how* things happen
internally, whereas "RuleChecker" describes *what* callers should expect
from the new class.

Change-Id: I25b47a162d933c1e385175aae715ca38872b1442
2021-09-08 21:59:34 +02:00
Daimona Eaytoy 357ddd498c Clean up / simplify parser-related classes
Remove unnecessary setters, injecting everything in the constructor.
These were leftovers from before the introduction of ParserFactory.
Remove public access to the conds used, include the information inside
the returned ParserStatus instead, and consequently simplify callers.

Change-Id: I0a30e044877c6c858af3ff73f819d5ec7c4cc769
2021-09-08 13:41:52 +02:00
Daimona Eaytoy f8e9ac7e2a Rename AbuseFilterCachingParser -> FilterEvaluator
It's an evaluator, not a parser.

Change-Id: Ib6d33e8423ea72709cf5a33f4397ba33e352ea80
2021-09-08 13:40:47 +02:00
Daimona Eaytoy 6684ea6450 Remove AFPTransitionBase
Also cleanup the mPos hack in the CachingParser.

Change-Id: Ib5693802a3ceb80cb736880ed65e27340abef689
2021-09-06 19:33:48 +00:00
Sorawee Porncharoenwase 320e3d696f Add a static analyzer for the filter language
This commit adds a class AFPSyntaxChecker which can statically analyze
a filter code to detect the following errors:

- unbound variables (which comes in two modes: conservative and liberal,
  default to conservative)
- unused variables (disabled by default for compatibilty)
- assignment on built-in identifiers
- function application's arity mismatch
- function application's invalid function name
- non-string literal in the first argument of set / set_var

The existing parser and evaluator are modified as follows:

- The new (caching) evaluator no longer needs to perform variable
  hoisting at runtime.
  - Note that for array assignment, this changes the semantics.
- The new parser is more lenient, reducing parsing errors.
  The static analyzer will catch these errors instead, allowing us
  to give a much better error message and reduces the complexity of
  the parser.
  * The parser now allows function name to be any identifier.
  * The parser now allows arity mismatch to occur.
  * The parser now allows the first argument of set to be any expression.

Concretely, obvious changes that users will see are:

1. a := [1]; false & (a[] := 2); a[0] === 1

   would evaluate to true, while it used to evaluate to the undefined value
   due to hoisting

2. f(1)

   will now error with 'f is not a valid function' as opposed to
   'Unexpected "T_BRACE"'

3. length

   will now error with 'Illegal use of built-in identifier "length"'
   as opposed to 'Expected a ('

Appendix: conservative and liberal mode

The conservative mode is completely compatible with the current evaluator.
That is,

false & (a := 1); a

will not deem `a` as unbound, though this is actually undesirable because
`a` would then be bound to the troublesome undefined value.

The liberal mode rejects the above pattern by deeming `a` as unbound.
However, it also rejects

true & (a := 1); a

even though (a := 1) is always executed. Since there are several filters
in Wikimedia projects that rely on this behavior, we default the mode
to conservative for now.

Note that even the liberal mode doesn't really respect lexical scope
appeared in some other programming languages (see also T234690).
For instance:

(if true then (a := 1) else (a := 2) end); a

would be accepted by the liberal checker, even though under lexical scope,
`a` would be unbound. However, it is unlikely that lexical scope
will be suitable for the filter language, as most filters in
Wikimedia projects that have user-defined variable do violate lexical scope.

Bug: T260903
Bug: T238709
Bug: T237610
Bug: T234690
Bug: T231536
Change-Id: Ic6d030503e554933f8d220c6f87b680505918ae2
2021-08-31 03:28:24 +02:00
Daimona Eaytoy 704364a5e7 Move parser exceptions to specific namespace and rename them
Create a dedicated "Exception" sub-namespace and remove the "AFP"
prefix, a leftover from the pre-namespace era.

Change-Id: I7e5fded9316d8b7d1628bc1a6ba8b1879ac901e1
2021-08-29 23:38:31 +00:00
libraryupgrader 5377ebe819 build: Updating dependencies
composer:
* mediawiki/mediawiki-codesniffer: 36.0.0 → 37.0.0

npm:
* postcss: 7.0.35 → 7.0.36
  * https://npmjs.com/advisories/1693 (CVE-2021-23368)

Change-Id: I2b382f3bb236fb44eb24c6a257b13b8fd886541c
2021-07-21 18:51:18 +00:00
jenkins-bot 0dc93136d6 Merge "Improve test coverage of API modules" 2021-04-18 16:03:25 +00:00
Matěj Suchánek a2ee8c41e2 Improve test coverage of API modules
Also solve one a TODO.

Change-Id: I61a38f3c741274f00ad0ad4789106a943daef222
2021-04-18 10:37:38 +02:00
Daimona Eaytoy f8438a4647 Remove the old parser
All methods were moved to the new parser. Tests and other pieces were
adjusted to expect just a single parser. There are still some TODOs
(remove AFPTransitionBase, remove $this->mCur), but these are left for
another commit.

Note that the new parser was not renamed: this is because the names are
wrong anyway (CachingParser is more of an Evaluator than a Parser, and
AFPTreeParser is the real parser, and should be renamed as well).

NOTE to reviewers: this patch looks quite big, but if you diff the old
parser with the new version of the CachingParser, you'll notice that the
diff is actually small, since everything was basically copied verbatim.

Bug: T239990
Change-Id: Ie914ef64c70503a201b4d2dec698ca2fa8e69b10
2021-04-09 13:23:07 +00:00
Umherirrender 32f7ae140e Use ::class for class name
This works also for non-existing classes,
because it is resolved on compile time

Change-Id: Ia3a1484c9c4f46a128c367ddd057c41dd560111d
2021-04-08 20:54:48 +02:00
Daimona Eaytoy 4dbde4dcf0 Use a different message prefix for parser warnings
The abusefilter-warning prefix is reserved for filter warnings. Pointed
out by Matěj.

Change-Id: I169e4c3d29b08c7f5af2136a683fc4427f8e93f5
2021-02-06 15:42:33 +00:00
jenkins-bot a94b2247f6 Merge "Cover some API modules by tests" 2021-02-04 23:43:02 +00:00
Matěj Suchánek a0fcfbcc32 Cover some API modules by tests
Change-Id: Icc57e260b3b06a58fc05f304d6e63dc40f970fe9
2021-02-04 15:17:00 +01:00
Daimona Eaytoy b0058c0f1b Use Authority in TextExtractor
And make its test a pure unit test, as per TODO comment.

Change-Id: Ia3ca38702ea61c5e551a581248d2b9471ef881fb
2021-02-02 00:43:01 +00:00
Petr Pchelko 6aa8f6f67b Do not mock User in TextExtractorTest.
In I63d9807264d7e2295afef51fc9d982447f92fcbd we are
changing how the permission checks are applied for revision,
so it uses passed User instance as Authority. However, when
user is mocked, the tests are breaking since the new user methods
are not mocked. Pass a real user for now to fix the test. Once
Authority reaches maturity and is ok to use in extensions, the
test should be rewritten to use authority directly.

Bug: T271458
Change-Id: Iacab813b253cc6e1439007e573e8ace06645860f
2021-01-20 09:32:18 -06:00
Daimona Eaytoy 005cc83642 Increase coverage for more classes
Change-Id: Iae6a24291f821fda77a45d8c1584de010af6a834
2021-01-17 17:38:58 +00:00
Daimona Eaytoy 8639e0c368 Introduce subclasses of Filter with specific use cases
In particular, this brings stronger typing for getID(), and we can get
rid of many phan suppressions.

Change-Id: Icbf3a6f7db8105082646ec227f62c09449fb165d
2021-01-17 00:47:29 +00:00
Daimona Eaytoy 8368b5d9b7 Use overrideUserPermissions in TextExtractorTest
This allows merging I1acd55c07d07b4a0d43fd838e11374b6d9be98d9.

Change-Id: I99ab3a69c41b3ec6721f9504ad6c77d3122df591
2021-01-06 12:46:11 +01:00