Why:
* When CheckUser asks the AbuseFilter extension for modifications
to rows inserted into the CheckUser tables, the AbuseFilter
extension attempts to get the Filter user via User::newSystemUser
* User::newSystemUser can deadlock if multiple requests to create
the system user are being made at once.
* The CheckUserHander does not need to create the abuse filter system
and instead only needs to know if a given $user is the equal to
the FilterUser.
* As such the FilterUser service needs to provide a way to check if
a given $user is equal without creating the FilterUser.
What:
* Add FilterUser::isUserSameAs which returns a boolean value
indicating whether the Abuse Filter system user is the equal
to a given UserIdentity in the same way that UserIdentity::equals
is implemented.
* Refactor ::getUser to get the username for the filter user in
a separate method, so that the ::isUserSameAs method can also
use this method. Name this new method ::getFilterUserName.
* Add a test for the FilterUser service to ensure consistent test
coverage
* Convert the @covers and @coversDefaultClass annotations to be
a @covers for the class. This is because PHPUnit recommends this in
https://docs.phpunit.de/en/9.6/annotations.html#appendixes-annotations-covers-tables-annotations
Bug: T356275
Bug: T346967
Change-Id: I8a101781bb47612deabb0f2a06a398ac13e860e6
We are getting rid of the schema of implementing this interface and
calling self::READ_* constants, it's confusing, inconsistent, prone to
clashes and isn't really useful for non-ORM systems (which we are not)
Bug: T354194
Change-Id: I5d7a2c91a49311a6bdf6e56053c08610d4d6d110
Why:
* The AbuseFilter Special:AbuseLog/hide page has a form that allows
those with sufficent rights to hide abuse filter log entries.
* This form defines a custom action, which uses a URL including the
wgServer by calling ::getFullUrl.
* When on WMF wikis and using mobile view, the domain name includes
'm' and as such the wgServer is not the correct URL for the form
action in this case.
* HTMLForm by default uses ::getLocalUrl for the action and as such
Special:AbuseLog/hide should also use ::getLocalUrl to prevent
these issues.
What:
* Change the call to ::getFullUrl in HideAbuseLog::show for the
action text for the HTMLForm instance to instead be a call to
::getLocalUrl.
Bug: T355012
Change-Id: I6c909d5e6724dd620cf656c9a55439ed5d5c2fb4
This field gets added automatically when using the special page form but
is only shown to admins and other people who have access. It's not private
information (users can find it in history) but this is to avoid making
these admins an easy target for harassment (Talking to PM of moderation
team he agreed this is a good compromise).
Bug: T341626
Change-Id: I8410f39db54b96981b05de8e064fed65df30ef2f
- Mentions filter number and name in the title
- Distinguishes between viewing and editing
Bug: T353106
Change-Id: Idda9854a78937033b168603810154b48288c3f4c
This requires 1.42 for some new names
Changes to the use statements done automatically via script
Addition of missing use statements and changes to docs done manually
Change-Id: Ic1e2c9a0c891382744e4792bba1effece48e53f3
The BadMethodCallException is documented as "thrown if a callback
refers to an undefined method or if some arguments are missing".
This is not what happens in these places.
Change-Id: Ic95b67acc2e17eea1dd0fa1d72f9ac94a86bcf17
This leverages the new BlockedExternalDomains system that is now part of
AbuseFilter. It notifies editors in realtime if a link they add is
blocked. See https://w.wiki/7ZsF for more information.
BlockedExternalDomains is slated to have its own API tantamount to the
action=spamblacklist endpoint, after which case this code will need to be
updated. In the meantime, it's meant to serve as a minimal viable product
for the CWS 2023 wish <https://w.wiki/7ZsE> for wikitext users.
The new $wgAbuseFilterBlockedExternalDomainsNotification configuration
setting controls the availability of this feature.
A similar feature for VisaulEditor is tracked at T276857
Bug: T347435
Change-Id: I7eae55f12da9ee58be5786bfc153e549b09598e7
The new method formats the message with Message::escaped() which
better protects from bad HTML in the message.
The ::setPageTitleMsg() method was added in 1.41 and this extension
already requires MW >= 1.41.
Bug: T343994
Change-Id: Ic07cde3bafeaa0325024fe89b4948680d04c4820
Reverts part of Id9056528a433faf0, to switch to DeferredUpdates in
CirrusSearch back from DeferredUpdatesManager.
Bug: T265749
Change-Id: I8126cc76440724753c356c48ba4e0fcc9be5b41a
Most notably, make it not use additional DB tables to test global
filters. Instead, just pretend that the local database is not local (via
config) and "hide" local filters with a simple test-only flag in
FilterLookup.
Change-Id: Ib431dbf6c9d84978ee84e7f0671cfcbf8a54d7a2
TestUser requires a DB connection, so avoid using it in database-less
tests. Add to the Database group tests that are making DB writes (e.g.,
for log entries).
Change-Id: I211cb60296e5c2446128fcdf2caaadc728a8c272
I'm planning to add support for bypass and regex-based blocking which
means it'll grow a bit. So let's give it a dedicated class.
Bug: T337431
Change-Id: I5a6fe2fd2f1efdebd8cada0ba6c481341f830e27
This method actually consists of two: add derived vars, and initialize
content vars. The former part depends on no parameters of this method.
On the other hand, the latter part combines multiple implementations
for some of the content variables using branching.
The branching is a dirty workaround and inferior to the GRASP principle:
"When related alternatives or behaviors vary by type, assign
responsibility for the behavior to the types for which the behavior
varies."
In other words, the callers (extensions) should be responsible for
choosing the initialization strategy themselves, instead of letting
VariableGenerator figure it out.
As the first step, split the former part to a separate method.
For now, it will be implicitly called by ::addEditVars.
Change-Id: I5ff00dbdbf29ec54eabfd95c44a4fd7f713969f5
Service wiring should only depend on config, not on request state.
Creating a session object during service wiring causes issues with entry
points such as opensearch_desc.php that disable the session.
Bug: T340113
Change-Id: I2450b0b6821ff0b097e283ff660a0b8aeea9590a
For example:
* Use the more meaningful str_contains().
* Add missing type hints.
* Make use of early returns/guard clauses.
Change-Id: Id150d1b17a80ea637a0639a8f2fd7fd017ad23b1
Protected effectively means "public to subclasses" and should be
avoided for the same reasons as marking everything as public should
be avoided.
Change-Id: Iba674b486ce53fd1f94f70163d47824e969abb77
This was an unfortunate mistake in the refactoring in I2ccb587,
caused by incomplete documentation and a confusing mixture of
possible return types.
I9166c2b fixed one of the two places already. The situation in this
patch here cannot really happen in reality (there is nothing to
remove when the page is empty). Still I think the code is easier to
read when the two places behave the same.
Change-Id: Iea51c3a7a8185cbc3771143353f4795dde712ec4
It should fail on null but it should create the page if it doesn't
exist or doesn't have any content yet.
This is breaking the special page, see:
[[de:234828092#New_special_page_to_fight_spam_//_Neue_Spezialseite_zur_Spam-Bekämpfung]]
Change-Id: I9166c2bdcfacb4b19706d246fbf99b2f24ca4cc6
Unlike what the 20-year old source comments in UrlUtils.php would
have you believe, parse_url() works fine nowadays, including for
protocol-relative URLs and indeed lots of prod code uses it directly.
The class still has some convenience value for case where you need to
expand or manipulate URLs, but for the common case of extracting a part
of it, you really don't need it.
Test plan:
$ php phpunit.php ../../extensions/AbuseFilter/tests/phpunit/integration/FilteredActionsHandlerTest.php
Bug: T337431
Change-Id: I1e76d2f5aef65365743214530faba656325b965a
* Remove stray `@ingroup` from file blocks, move to class block.
* Fix mention of "WAN" cache where actually APCU is used.
* Document that the storage class takes a local-server cache.
This is an important requirement since the class has no
coordination for purging or other invalidation. It expects
an uncoordinated cache.
* Rename "load" to "loadConfig" as it's ambigious what it means among
the half dozen other "load*" methods in this class. Also inline
loadFromConfig and loadComputedUncached while at it to further
reduce this.
* Rename "loadConfigContent" to "fetchLatestConfig" to match
the existing fetchConfig, which does the same thing except it queries
the primary db using READ_LATEST.
* Use Html.php when building HTML, instead of legacy Xml.php.
While at it, also switch a few to Html::element instead of
Html::rawElement (aka Xml::tags) by using Message->text() for
messages that are not expected to contain rich wikitext.
Change-Id: Ic74d1597aa9201b371894e7a4bf9361752d9db21
Doing unset on array leads to the final array turning into associative array
and gets blocked by the validator.
You can check that it's broken in Persian Wikipedia, beta cluster or
localhost. Tested locally, fixes the issue.
Bug: T337431
Change-Id: Ib1be294bae1ae057dfb9a4445a8e13ac72b333b9
This is basically copy paste of SpamBlacklist logging with the added
extra bit of what triggered the hit.
Bug: T337431
Change-Id: Ieb9e3ca615af88ab56735b56e24c80c42a68d478
And register AbuseFilterRunnerFactory as a service name that’s allowed
to not have a getRunnerFactory() method without the test complaining
(the service was renamed, getFilterRunnerFactory() exists).
Change-Id: Idedb87e64a6df02b0edae8d9e7dbf441752dc480
Needed-By: If5af88e7f70b83d53f66b9617a5ef37daf81830f
Abuse filter needs to check both if the update is available and if the
page is rendered. This is the exact issue FlaggedRevs have:
050b9593fb/backend/FlaggedRevs.php (L718)
Bug: T339094
Change-Id: I943c8dbb525dc4c988e97e180474ea71b4cf731d
When forFilter is true and PreparedUpdate is available
(most save operations), retrieve all_links from
PreparedUpdate::getParserOutputForMetaData. Otherwise
do what was done before.
Note that this change probably leaves some dead code. It will be dealt
with later.
NOTE: this changes code potentially executed on every save operation.
Bug: T65632
Bug: T264104
Change-Id: I3628a56e5277846c1b90444fb55983870eb54c1e
The method for old_links retrieval depends on the "forFilter"
value, which we know in advance. If it's true, old_links should
be retrieved from the database. Make a case in the switch
that does nothing but retrieves links from the database,
and direct the evaluation to it.
This change was split from I3628a56e5 to make its review easier.
NOTE: this changes code potentially executed on every save operation.
Change-Id: I33b688f6be3c58beec403f7bf26407a42e7c18ab
Regarding array building: Instead of adding to array with
$array[] = 'foo' and then doing array_flip(), simply do
$array['foo'] = true;
Regarding tests: I originally wanted to create a unit test but I ended
up mocking so many things that it wasn't worth it and the config variable
is globaly which first we need to clean up after deployment is done.
Bug: T337431
Change-Id: Iac8dca7078668ee3441d19b6aafe499c1aa0d732
This is a direct follow up for I6373fa6 where we apparently fixed
half of the cases while breaking the other half. There was actualy
a code path that can return null, and anther one that can return a
status object.
Since there is never anything done with the status object we can as
well get rid of it and always return null in case of an error.
Bug: T337431
Bug: T279275
Change-Id: I2ccb58756182897bcd6649c9f589e2f7a0321b20
We will have a pretty large list of blocked domains that we need to
swift through in each edit for any added domain. In order to cacth
subdomains being added, we have to do all sorts of complicated
operations and string search in large set of strings which is quite
slow. To fix that, let's simply pretend a user who has added
foo.bar.com, also added bar.com and com and do exact match in array of
strings making it much faster.
h/t Krinkle for the idea
Bug: T337431
Change-Id: I96795ed7d1a25f051db0b591dde21b032b138ded
For now, we will revisit this in the future. Specially if the
communities think otherwise.
Bug: T337431
Change-Id: I2847264eba9a3cc4fc47a22eacb523199015f9e7
This makes raw page editing safer, and potentially enables opening up
access to less restricted user groups.
Bug: T337431
Change-Id: I14f21003a551f34b6e524e9b229613e79b0e5a70
Treat temporary users the same as IP users. Neither has user groups,
so return early for both.
Bug: T335062
Change-Id: I20b48608cf6ba5f8e8e36a378d66c603d84b032f
It is behind a feature flag. Improvements on it can happen in follow
ups. The patch is already quite massive.
Bug: T337431
Bug: T279275
Change-Id: I3df949c4d41ce65bb4afa013da9c691ac05fc760
* Set the same block expiry for temp and anon users
* Don't block autopromote for temp users; they can't be autopromoted
* Bail early from CheckUserHandler if the user is temporary
Bug: T335062
Change-Id: I6b72537f568c4c70a0b86f1825ea30b767f5634a
This patch migrates abuse_filter and abuse_filter_history tables
to new actor schema.
MigrateActorsAF was copy-pasted from core's
maintenance/includes/MigrateActors.php before removal (ba3155214).
Bug: T188180
Change-Id: Ic755526d5f989c4a66b1d37527cda235f61cb437
Looks like it's needed for the UpdateVarDumps script, so add a note
about that. Also add a type check to the script so that it produces a
clearer error message if it finds an entity with unexpected type.
Bug: T331861
Change-Id: I68f8f954ed754c4282e13599ce06118e2336ecbb
Use the very new getPrimaryDatabase and getReplicaDatabase.
We skip FilterLookup and CentralDBManager in this patch.
Change-Id: I22c6f8fa60be90599ee177a4ac4a97e1547f79be
Increase default widths from `65%` to `90%` for the editor, notes,
description, group inputs.
Add `mw-abusefilter-edit-description-input` id to
`abusefilter-edit-description` TextInputWidget.
Bug: T294856
Change-Id: Ia9472298170740a39fd24864003b766078fcdfaf
Hook on to CheckUserInsertPrivateEventRow and CheckUserInsertLogEventRow
to override the IP, XFF and User-Agent string when the user is the
abuse filter user for log events.
These two hooks are being added as log entries are being removed from
cu_changes and added into two new tables. Because the columns and their
names are different for these tables, reusing the same hook won't work
for callers that rely on setting values for a specific column name.
Edits and log entries performed by the abuse filter user need to be
marked as being by the software (and not using the IP, XFF and
User-Agent provided in the main request).
These hooks will not be run until the appropriate config is set to
write to the two new tables. Until that point using the one currently
defined hook will work for all actions.
Bug: T324907
Bug: T44345
Depends-On: I7c7754323ade9a8d96273c1742f30b1b5fbe5828
Follow-Up: Idd77545af94f9f9930d9ff38ab6423a72e680df9
Change-Id: Id78417e9d95220946f110afbe1430df5b3bb4f4f
Add RecentChange as a optional parameter to the code that hooks
on CheckUserInsertChangesRow as this hook will soon provide a
RecentChange object if this row was triggered by a RecentChange.
If this row was not triggered by a RecentChange, then this
parameter will be null. This needs to be added before the parameter
is added to the definition of the hook in the CheckUser extension
as the tests will fail if all usages do not already have the new
parameter.
Bug: T324907
Change-Id: I44e54a3fca5558a1cb8d8f06a3990ded863454bc
We might consider adding an in-process cache because there
will be a duplicate database lookup for content model and
wikitext of the same revision.
Bug: T230295
Change-Id: I9723f21069e03a49fa7131bd8f79c6e7e442104b
The motivation is to have a single immutable object providing
information about the action. It can represent the current
action being filtered, but also a past action stored in the
abuse log. It will hopefully help us get rid of passing
User(Identity) and Title/LinkTarget objects around together.
Change-Id: I52fa3a7ea14c98d33607d4260acfed3d3ba60f65
Each generator knows in which situation it is executed, and it
can pass this information to the computer. VariableHolder should
just hold the variables.
Change-Id: I0fb2e01e3e9457cd63948afe2a20439a1c800790
For fixing bugs like T65632, T105325, or T264104, we will need
to update code in more than one place at once. To prevent
regressions, create an integration test which tests the whole
pipeline, from the request submission to variable evaluation.
Edits are simulated using action=edit API call because the hook
AbuseFilter uses is run from EditPage.
To increase confidence in test coverage, remove some annotations
from AbuseFilterConsequencesTest or make them less greedy.
Ideally, it would only test consequences.
This patch includes refactoring of AbuseFilterCreateAccountTestTrait
which now only inserts the user into the database if it really
should be created.
It also restores test coverage of some other classes.
Change-Id: I661f4e0e2bcac4770e499708fca4e4e153f31fed
Core now supports special pages registering sub menus
natively in skins. The menu is rendered when the skin
supports it, so at current time of writing this will
only work in Vector 2022 and MinervaNeue.
The existing menu that appears under the abuse filter
page title is converted into the new format. For other skins
no difference.
Bug: T315553
Change-Id: Ief51a9c60125c11e3b735fabee2a4544b7955f64
Change the IP to 127.0.0.1 (to indicate an internal IP), and blank
the XFF and UA when the performer of an action being logged by
CheckUser is the abuse filter user. Actions performed by the abuse
filter user can only be initated by the software, and as such should
not use the request's IP, XFF and UA. Also test the newly added
code.
Bug: T44345
Depends-On: I28acaaebd2d0067b700da0930e7b7ba924fa5c1c
Change-Id: Idd77545af94f9f9930d9ff38ab6423a72e680df9
$wgAbuseFilterAnonBlockDuration is documented to be deprecated and
fall back to $wgAbuseFilterBlockDuration. This was just missing here.
This makes code fail in PHP 8.x where null is not allowed any more in
functions that expect a string.
Change-Id: I0edb0f14630aed88635aa564a11d6f42e470c29f
PHP does this automatically, however in PHP8 this causes an
E_DEPRECATED warning.
This fixes a phpunit test
Change-Id: Ie2b2dbf4a1c0ff500ba251ee43a37823432e3047
Follows-up Iaa1b4683c5c856.
* Match $IP pattern verbatim from most other WMF extensions.
* Improve descriptions a bit, and move/merge any meaningful
information from file docblock into class docblock. The file blocks
are visually ignored and identical in each file, and often out of
date or duplicated when given text separately from the class block.
See also similar changes in core:
https://gerrit.wikimedia.org/r/q/message:ingroup+owner:Krinkle
* Use `@internal` instead of `@private` as per Stable interface
policy.
Change-Id: I8bed9a625af003446c7e25f6b794931164767b5a
A protocol-relative URL has two entries for el_to in externallinks table,
the different is on the el_index colum
Bug: T314373
Change-Id: I3d6229aaa10a089baf15d5ba3407f6a8870429e3
In theory, it's possible that some consequences could use "0"
as one of their parameters. At least change tags, see T296642.
But PHP treats "0" as false.
Also make the code on all places consistent.
Change-Id: I5255dfb26878ceb4f78c4d8277521edbb4821d7d
We mask the IP address on purpose, so that it is not
leaked to the abuse log. This breaks CheckUser because
it attempts to assign an actor id to the "fake"
(uncreated) user account. So unmask the IP address
when we send the data to CheckUser.
Bug: T233004
Change-Id: Ib58193927bc8254d36a8de0fd1b5f9fba68a0cb0
A unique column is needed in the order to ensure the next offset is
correct and does not skip items
As mention in the doc for IndexPager::getExtraSortFields the extra
columns are not for pagination, only to help the optimizer building a
better query plan by mention denormalized columns.
Bug: T191694
Change-Id: I9fb9f848a0b165dbaa0a2b31d9504324f43578de
Combine the check for red/blue user/talk links into one database query
This can improve the performance of the page view when many filters
from many different users are linked
Change-Id: I0b87ee15ecee4cecd5d5d6164e8c18e1b788ecd1
They are unused in Wikimedia code (finally).
Change-Id: I74c81d950d992552d3edf184b5eecc46e5e2c567
Depends-On: I62533e21d2bc1a22c3fcba4c7c650ca9d95700ef
Depends-On: I95ce9897d89213e358c436135278b729f0adc3a2
Add a space between the checkbox (shown for users who can hide abuse
filter entries) and the timestamp so that it looks nicer.
Change-Id: I6e495f8cb56ad8f0b53f06d2aecb8ac34b16ff25
A new core facility written for this use case.
Bug: T310662
Depends-On: I26b1cdba0a06ad16ad8bb71b455e1b6180924d17
Change-Id: I2b902d034a8c3308c0ba9878b69e873ca8fbda52
Add the ListToggle provided in core's ListToggle.php to
Special:AbuseLog when a list of abuse filter entries are being
shown and the user can hide abuse filter entries. This will allow
them to select multiple checkboxes to hide at once (without having
to shift and selecting the first and last).
Bug: T311954
Change-Id: I1aa4fa3fa7016a5d9ae4a904c151011743d2c8ed
Move most stuff from the pager to the view class to untangle
circular dependency. Declare class properties as private.
Leave input validation to the form.
Change-Id: Ia8b1a9d08af9c0cac23b34f6bbbe2c44d01f6c8c
In action=abusefilterunblockautopromote, leave UserIdentity
instantiation to the parent. Note that this changes the "code"
in the response from "baduser_user" to "baduser".
Change-Id: I97d2bf3fa3c5486e461823f840cad2763e1bcfea
Almost all callers already provide an Authority in the form
of a User object, so mostly just need to change the typehints
Depends-On: I58661943c7e1acb6ff09798ee1a30be0fde3f459
Change-Id: I2ad86859c8194c14d7331f58db62b7cff4698085
Prevent invalid assignments to properties. On
Special:AbuseFilter/test/123, handle when id of
a non-existing filter was provided. Allow '0'
as user and title on Special:AbuseLog and
Special:AbuseFilter/test.
Change-Id: I196ae62b165d1a60babaf4fe6bd733aa52be1726
These are apparently the only two variables for which we can
quickly determine their value in such simple way.
Later, we can also try it for recent contributions.
Bug: T102944
Change-Id: Iecfa9e5c5ba8c078691334b676cc6f289790cb74
This was most definitely my intention when I introduced the concept of
"generic vars", so it's a bit surprising to discover, 3.5 years later,
that the timestamp isn't computed there.
Also make the timestamp always be a string for consistency, since that's
the type documented on mw.org. I've manually checked all filters on
Wikimedia wikis using the timestamp variable, and added explicit int
casts where needed (although I think they'd still work due to implicit
casts).
Change-Id: Ib6e15225dd95c2eead7e48c200d203d6918e0c18