Commit graph

1675 commits

Author SHA1 Message Date
jenkins-bot 9aa87e9234 Merge "Clean up AbuseFilterViewTestBatch" 2023-07-12 13:12:47 +00:00
Tim Starling fe592746b7 Use the new Wikimedia\Diff namespace
Bug: T339184
Change-Id: I381686678524868c85466bdafde3856a73a8cb1c
2023-06-29 11:56:13 +10:00
Abijeet b1e404fc79 ConsequencesFactory: Avoid creating Session object during service wiring
Service wiring should only depend on config, not on request state.

Creating a session object during service wiring causes issues with entry
points such as opensearch_desc.php that disable the session.

Bug: T340113
Change-Id: I2450b0b6821ff0b097e283ff660a0b8aeea9590a
2023-06-27 20:11:38 +05:30
Matěj Suchánek c2a40fb0ff Clean up AbuseFilterViewTestBatch
Inject dependencies, use implicit form validation.

Change-Id: I74afeeceb39ada93cf3c20d5d3fc417ab4e3bf4b
2023-06-27 10:53:45 +02:00
jenkins-bot c897335bd7 Merge "Various code style clean-ups" 2023-06-23 18:43:58 +00:00
jenkins-bot 0c33716f5b Merge "Mark protected stuff in classes with no subclasses as private" 2023-06-23 18:35:48 +00:00
thiemowmde 9316a7d65f Mark some unused public class features as private
These are not used anywhere outside of these classes.

Change-Id: I0a0a5cf1e84133bae69b95da771c285ee27f926c
2023-06-23 12:32:38 +02:00
thiemowmde d9bca83ec6 Various code style clean-ups
For example:
* Use the more meaningful str_contains().
* Add missing type hints.
* Make use of early returns/guard clauses.

Change-Id: Id150d1b17a80ea637a0639a8f2fd7fd017ad23b1
2023-06-23 12:32:12 +02:00
thiemowmde 24888bea15 Mark protected stuff in classes with no subclasses as private
Protected effectively means "public to subclasses" and should be
avoided for the same reasons as marking everything as public should
be avoided.

Change-Id: Iba674b486ce53fd1f94f70163d47824e969abb77
2023-06-23 12:28:06 +02:00
thiemowmde 0bb3aa38ed Fix removing a domain when the page doesn't exist
This was an unfortunate mistake in the refactoring in I2ccb587,
caused by incomplete documentation and a confusing mixture of
possible return types.

I9166c2b fixed one of the two places already. The situation in this
patch here cannot really happen in reality (there is nothing to
remove when the page is empty). Still I think the code is easier to
read when the two places behave the same.

Change-Id: Iea51c3a7a8185cbc3771143353f4795dde712ec4
2023-06-22 11:54:53 +02:00
Amir Sarabadani 8f216a6030 Fix adding a domain when the page doesn't exist
It should fail on null but it should create the page if it doesn't
exist or doesn't have any content yet.
This is breaking the special page, see:
[[de:234828092#New_special_page_to_fight_spam_//_Neue_Spezialseite_zur_Spam-Bekämpfung]]

Change-Id: I9166c2bdcfacb4b19706d246fbf99b2f24ca4cc6
2023-06-22 08:36:49 +00:00
Timo Tijhof 110484b6a0 BlockedExternalDomains: De-duplicate validateDomain logic
Bug: T337431
Change-Id: Icbedf750f6ecaa9caf7bb900e8ad0bc2124e8743
2023-06-19 13:36:32 +00:00
Timo Tijhof 203d54be11 BlockedExternalDomains: Optimize host extraction by using parse_url
Unlike what the 20-year-old source comments in UrlUtils.php would
have you believe, parse_url() works fine nowadays, including for
protocol-relative URLs, and indeed lots of prod code uses it directly.

The class still has some convenience value for cases where you need to
expand or manipulate URLs, but for the common case of extracting a part
of it, you really don't need it.

Test plan:
$ php phpunit.php ../../extensions/AbuseFilter/tests/phpunit/integration/FilteredActionsHandlerTest.php

Bug: T337431
Change-Id: I1e76d2f5aef65365743214530faba656325b965a
2023-06-19 13:36:27 +00:00
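A minimal sketch of the host extraction described in this commit, written in Python rather than the extension's PHP. It uses Python's `urllib.parse.urlsplit` as a stand-in for PHP's `parse_url()`; the helper name `extract_host` is hypothetical and does not appear in the repository.

```python
from typing import Optional
from urllib.parse import urlsplit


def extract_host(url: str) -> Optional[str]:
    """Return the lowercased host of a URL, or None if the URL has no host.

    Hypothetical helper: it only illustrates that a standard URL parser
    handles protocol-relative URLs such as "//example.org/path".
    """
    return urlsplit(url).hostname


# Protocol-relative and absolute URLs both yield a host.
print(extract_host("//example.org/wiki/Main_Page"))
print(extract_host("https://Example.org/path"))
```

A URL without an authority part, such as a bare path, yields `None`, so callers of a helper like this would need to handle that case.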
Timo Tijhof ee238e79b9 BlockedExternalDomains: Minor code clean up and docs improvement
* Remove stray `@ingroup` from file blocks, move to class block.

* Fix mention of "WAN" cache where actually APCU is used.

* Document that the storage class takes a local-server cache.
  This is an important requirement since the class has no
  coordination for purging or other invalidation. It expects
  an uncoordinated cache.

* Rename "load" to "loadConfig" as it's ambiguous what it means among
  the half dozen other "load*" methods in this class. Also inline
  loadFromConfig and loadComputedUncached while at it to further
  reduce this.

* Rename "loadConfigContent" to "fetchLatestConfig" to match
  the existing fetchConfig, which does the same thing except it queries
  the primary db using READ_LATEST.

* Use Html.php when building HTML, instead of legacy Xml.php.
  While at it, also switch a few to Html::element instead of
  Html::rawElement (aka Xml::tags) by using Message->text() for
  messages that are not expected to contain rich wikitext.

Change-Id: Ic74d1597aa9201b371894e7a4bf9361752d9db21
2023-06-19 13:36:23 +00:00
Amir Sarabadani 9dc1a601ac Blocked domains: Fix removing a domain via the special page
Calling unset() on an array element turns the final array into an associative
array, which is then rejected by the validator.

You can check that it's broken in Persian Wikipedia, beta cluster or
localhost. Tested locally, fixes the issue.

Bug: T337431
Change-Id: Ib1be294bae1ae057dfb9a4445a8e13ac72b333b9
2023-06-18 00:35:21 +02:00
Amir Sarabadani 8b67de5bc1 blocked domains: Make sure users can't bypass the list by using uppercase
Added tests too

Bug: T337431
Change-Id: Ie3406d0b3c7d82ba44c11865e493375453555664
2023-06-16 01:22:48 +02:00
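The bypass fixed here can be sketched as a case-normalization step before the lookup. This is a Python illustration under stated assumptions, not the extension's PHP code: the function name `is_blocked_host` is hypothetical, and the blocked list is assumed to be stored in lowercase.

```python
def is_blocked_host(host: str, blocked_domains: set) -> bool:
    """Check a host against the blocked list, ignoring letter case.

    Assumes blocked_domains already holds lowercase entries.
    """
    return host.lower() in blocked_domains


blocked = {"spam.example"}
# Without the lower() call, "SPAM.example" would slip past an exact match.
print(is_blocked_host("SPAM.example", blocked))
```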
jenkins-bot 596a36866b Merge "Add missing AbuseFilterServices::getHookRunner()" 2023-06-15 18:06:28 +00:00
jenkins-bot 12d6d204ce Merge "BlockedDomains: Add logging in case of hit" 2023-06-15 16:33:37 +00:00
Amir Sarabadani da53cfe9dd BlockedDomains: Add logging in case of hit
This is basically copy paste of SpamBlacklist logging with the added
extra bit of what triggered the hit.

Bug: T337431
Change-Id: Ieb9e3ca615af88ab56735b56e24c80c42a68d478
2023-06-14 22:23:50 +02:00
thiemowmde b63d5c138e Use much more narrow IReadableDatabase and related where possible
Much more narrow interfaces. This code doesn't need more.

Change-Id: Iab0f1da27968246333a4a555b02bfb750cf9eedb
2023-06-14 19:42:01 +00:00
thiemowmde 7e6132d4d7 Remove bits of unused code across the codebase
Mostly found with the code inspection tools in PHPStorm.

Change-Id: I7f59dddca0aaab0ddd1093d52c07ec12efd20d6d
2023-06-14 19:41:00 +00:00
Lucas Werkmeister 9bb4b1e5db Add missing AbuseFilterServices::getHookRunner()
And register AbuseFilterRunnerFactory as a service name that’s allowed
to not have a getRunnerFactory() method without the test complaining
(the service was renamed, getFilterRunnerFactory() exists).

Change-Id: Idedb87e64a6df02b0edae8d9e7dbf441752dc480
Needed-By: If5af88e7f70b83d53f66b9617a5ef37daf81830f
2023-06-14 17:35:43 +02:00
Amir Sarabadani 191e719a79 Fix cases of LogicException in $update->getParserOutputForMetaData()
Abuse filter needs to check both if the update is available and if the
page is rendered. This is the exact issue FlaggedRevs has:
050b9593fb/backend/FlaggedRevs.php (L718)

Bug: T339094
Change-Id: I943c8dbb525dc4c988e97e180474ea71b4cf731d
2023-06-14 13:35:16 +02:00
Matěj Suchánek 8fb53edfbb Retrieve external links from PreparedUpdate
When forFilter is true and PreparedUpdate is available
(most save operations), retrieve all_links from
PreparedUpdate::getParserOutputForMetaData. Otherwise
do what was done before.

Note that this change probably leaves some dead code. It will be dealt
with later.

NOTE: this changes code potentially executed on every save operation.

Bug: T65632
Bug: T264104
Change-Id: I3628a56e5277846c1b90444fb55983870eb54c1e
2023-06-13 14:30:06 +02:00
Matěj Suchánek d82a716ad0 Make old_links retrieval cleaner
The method for old_links retrieval depends on the "forFilter"
value, which we know in advance. If it's true, old_links should
be retrieved from the database. Make a case in the switch
that does nothing but retrieves links from the database,
and direct the evaluation to it.

This change was split from I3628a56e5 to make its review easier.

NOTE: this changes code potentially executed on every save operation.

Change-Id: I33b688f6be3c58beec403f7bf26407a42e7c18ab
2023-06-13 14:03:21 +02:00
jenkins-bot fad3a6e888 Merge "Fix error reporting in BlockedDomainStorage for real" 2023-06-12 21:28:38 +00:00
jenkins-bot 54b9cbd6da Merge "BlockedDomains: Use cleaner array building and add tests" 2023-06-12 18:06:38 +00:00
Amir Sarabadani 60cbc3b464 BlockedDomains: Use cleaner array building and add tests
Regarding array building: Instead of adding to array with
$array[] = 'foo' and then doing array_flip(), simply do
$array['foo'] = true;

Regarding tests: I originally wanted to create a unit test but I ended
up mocking so many things that it wasn't worth it, and the config variable
is a global, which we first need to clean up once deployment is done.

Bug: T337431
Change-Id: Iac8dca7078668ee3441d19b6aafe499c1aa0d732
2023-06-12 17:46:55 +00:00
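The array-building change can be sketched as follows. This is a Python analogue, not the PHP from the patch: a dict comprehension stands in for `array_flip()`, and both function names are hypothetical.

```python
def build_lookup_by_flip(domains):
    """Older style: collect values in a list, then flip values into keys."""
    collected = []
    for domain in domains:
        collected.append(domain)
    # Rough analogue of PHP's array_flip(): values become keys.
    return {domain: index for index, domain in enumerate(collected)}


def build_lookup_directly(domains):
    """Cleaner style: write each domain straight in as a key."""
    lookup = {}
    for domain in domains:
        lookup[domain] = True
    return lookup


domains = ["a.example", "b.example"]
# Both variants support the same membership checks on their keys.
print(set(build_lookup_by_flip(domains)) == set(build_lookup_directly(domains)))
```

The direct form avoids the intermediate list and the extra pass over it.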
thiemowmde 518955f9c3 Fix error reporting in BlockedDomainStorage for real
This is a direct follow up for I6373fa6 where we apparently fixed
half of the cases while breaking the other half. There was actually
a code path that can return null, and another one that can return a
status object.

Since there is never anything done with the status object we can as
well get rid of it and always return null in case of an error.

Bug: T337431
Bug: T279275
Change-Id: I2ccb58756182897bcd6649c9f589e2f7a0321b20
2023-06-12 17:11:49 +02:00
jenkins-bot afaf9d34f8 Merge "Fix broken error reporting in BlockedExternalDomains" 2023-06-12 14:20:20 +00:00
thiemowmde 1eb985c619 Fix broken error reporting in BlockedExternalDomains
Apparently a mistake from I3df949c.

Bug: T337431
Bug: T279275
Change-Id: I6373fa6de561b3018e85f61f5e45ed8c886ce311
2023-06-12 10:52:35 +02:00
thiemowmde 84058c3d96 Make use of the ??= operator and such where it makes sense
We can avoid a bit of code duplication and move code closer together
when it belongs together.

Change-Id: Iffca7e4abfbf03d4663ee909220057bcbd54da75
2023-06-12 10:27:03 +02:00
Amir Sarabadani 9ca20e7749 Make edit summary of blocked domain changes use i18n
It shouldn't be all in English.

Bug: T337431
Change-Id: I57c6b08b652e83baaef41ab0b74af7a4668698a2
2023-06-08 22:06:19 +02:00
Amir Sarabadani 0acfe05251 Add abusefilter-bypass-blocked-external-domains right
This is similar to sboverride right in SpamBlacklist. Defaults are also
the same

Bug: T337431
Change-Id: Iaff91c1f9f7aece0787348dd071701ef99e0291d
2023-06-08 22:06:19 +02:00
Amir Sarabadani 7658885d75 BlockedDomains: Make lookup for domains added in blocked domains faster
We will have a pretty large list of blocked domains that we need to
sift through on each edit for any added domain. In order to catch
subdomains being added, we have to do all sorts of complicated
operations and string searches in a large set of strings, which is quite
slow. To fix that, let's simply pretend a user who has added
foo.bar.com also added bar.com and com, and do an exact match in an array of
strings, making it much faster.

h/t Krinkle for the idea

Bug: T337431
Change-Id: I96795ed7d1a25f051db0b591dde21b032b138ded
2023-06-08 21:50:43 +02:00
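The lookup idea in this commit can be sketched as below. This is a Python illustration rather than the extension's PHP, and the names `expand_suffixes` and `hits_blocked_domain` are hypothetical. The added domain is expanded into itself plus every parent suffix, and each candidate is checked with an exact, hash-based match against the blocked set.

```python
def expand_suffixes(domain: str) -> list:
    """Return the domain followed by each of its parent suffixes."""
    labels = domain.lower().split(".")
    return [".".join(labels[i:]) for i in range(len(labels))]


def hits_blocked_domain(added_domain: str, blocked: set) -> bool:
    """True if the added domain or any of its parent suffixes is blocked.

    Each check is an exact set lookup, so no substring search over the
    whole blocked list is needed.
    """
    return any(suffix in blocked for suffix in expand_suffixes(added_domain))


blocked = {"bar.com"}
print(expand_suffixes("foo.bar.com"))  # ['foo.bar.com', 'bar.com', 'com']
print(hits_blocked_domain("foo.bar.com", blocked))
print(hits_blocked_domain("notbar.com", blocked))
```

Because matching is done on whole labels, `notbar.com` does not match a blocked `bar.com`, which a naive "ends with" string search would get wrong.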
jenkins-bot d6d8608161 Merge "Replace deprecated MWException" 2023-06-07 23:25:54 +00:00
jenkins-bot 90414626fb Merge "Degroup: Return early if user is a temporary user" 2023-06-07 17:18:46 +00:00
Daimona Eaytoy caee78c24d Replace deprecated MWException
These are all unchecked.

Bug: T328220
Change-Id: I8d2f098a8b634d4a226b40ddaef31f0303a0789f
2023-06-07 17:41:20 +02:00
Amir Sarabadani 462096f523 Allow interface-admins to edit blocked domains json directly
This is for now; we will revisit it in the future, especially if the
communities think otherwise.

Bug: T337431
Change-Id: I2847264eba9a3cc4fc47a22eacb523199015f9e7
2023-06-06 23:36:12 +02:00
Siddharth VP 8a22007034 BlockedExternalDomains: validate JSON structure before save
This makes raw page editing safer, and potentially enables opening up
access to less restricted user groups.

Bug: T337431
Change-Id: I14f21003a551f34b6e524e9b229613e79b0e5a70
2023-06-06 23:31:28 +02:00
Thalia 573838efc5 Degroup: Return early if user is a temporary user
Treat temporary users the same as IP users. Neither has user groups,
so return early for both.

Bug: T335062
Change-Id: I20b48608cf6ba5f8e8e36a378d66c603d84b032f
2023-06-06 14:10:21 +01:00
jenkins-bot 3feb7d5af0 Merge "BlockedDomains: Put a cache behind parsing of notes of blocked domains" 2023-06-04 15:33:00 +00:00
Amir Sarabadani be928818a4 BlockedDomains: Put a cache behind parsing of notes of blocked domains
It'll be 6K rows in enwiki, parsing 6000 wikitext notes is going to be
expensive.

Bug: T337431
Change-Id: I010d773a7b096c783f5da0d6997d946b3bfd6b6e
2023-06-02 20:13:33 +02:00
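A rough sketch of putting a cache in front of note parsing, in Python rather than the extension's PHP. Everything here is an assumption made for illustration: `render_note` is a placeholder for the expensive wikitext parse, and `functools.lru_cache` stands in for whatever cache the patch actually uses.

```python
from functools import lru_cache


def render_note(wikitext: str) -> str:
    """Placeholder for an expensive wikitext-to-HTML parse (hypothetical)."""
    return "<p>" + wikitext + "</p>"


@lru_cache(maxsize=None)
def render_note_cached(wikitext: str) -> str:
    """Parse each distinct note text once and reuse the result afterwards."""
    return render_note(wikitext)


# Repeated notes are served from the cache instead of being parsed again.
for note in ["spam site", "spam site", "link farm"]:
    render_note_cached(note)
print(render_note_cached.cache_info())
```

Keying the cache on the note text means identical notes share one entry, and an edited note is parsed again automatically because its text has changed.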
jenkins-bot 64ed21cff7 Merge "Use new DeferredUpdatesManager service" 2023-06-01 19:00:42 +00:00
James D. Forrester fb50c1f019 BlockedExternalDomains: Make this a special right, prohibit direct editing
Bug: T337431
Bug: T279275
Change-Id: I96d1e2c8d8728c26e38515032ef773770e26dda4
2023-06-01 09:20:44 -04:00
Amir Sarabadani adae5b95b5 Minor improvements to blocked domain filtering
See I3df949c4d41ce

Follows-Up: I3df949c4d41ce65bb4afa013da9c691ac05fc760
Change-Id: I81974a8d935838e00b4155454f2fb619f8a6bad9
2023-05-31 21:59:45 +02:00
Amir Sarabadani 53eb27f086 Introduce Special:BlockedExternalDomains
It is behind a feature flag. Improvements on it can happen in follow
ups. The patch is already quite massive.

Bug: T337431
Bug: T279275
Change-Id: I3df949c4d41ce65bb4afa013da9c691ac05fc760
2023-05-30 20:48:42 +02:00
Daimona Eaytoy 1c0e558c78 Use new DeferredUpdatesManager service
And remove some hacks for unit tests.

Change-Id: I4e9932a003ac7420f307f01b8d12062fd05a3bb8
2023-05-30 12:50:08 +00:00
Amir Sarabadani e9bec9ffa2 Improve support for read-new wikis with externallinks
Bug: T337149
Change-Id: I68e72243346725fa78281c78dbd6b4cab0b7cbca
2023-05-26 15:47:06 +02:00
jenkins-bot 17cb8ac514 Merge "Update user type checks to handle temporary users" 2023-05-26 11:56:35 +00:00