Commit graph

66 commits

Author SHA1 Message Date
jenkins-bot 58c5edce98 Merge "Add user_unnamed_ip variable" 2024-05-23 18:10:52 +00:00
STran fe0b1cb9e9 Add user_unnamed_ip variable
After temporary accounts are enabled, filters that rely on an ip
in the `user_name` will fail (eg. `ip_in_range` and `ip_in_ranges`).
To keep these filters working:

- Expose the IP through another variable, `user_unnamed_ip`, that can be
  used instead of `user_name`.
- The variable is scoped to only reveal the IPs of temporary accounts
  and un-logged in users.
- Wikis that don't have temporary accounts enabled will be able to see
  this variable but it won't provide information that `user_name`
  wasn't already providing
- Introduce the concept of transforming variable values before writing
  to the blob store and after retrieval, as IPs need to be deleted from
  the logs eventually and can't be stored as-is in the amend-only blob
  store

Bug: T357772
Change-Id: I8c11e06ccb9e78b9a991e033fe43f5dded8f7bb2
2024-05-23 07:19:48 -07:00
xtex bc6240fbda
Replace some deprecated functions
Change-Id: I4070a3655f2fac1d7afe1c3a244a64cb55019b9a
2024-05-04 21:32:04 +08:00
Matěj Suchánek 6526a5494b Simplify computation of derived links variables
Instead of having separate methods for each variable,
have one method which can work not only with "_links",
but with any array of strings.

Change-Id: I05f1b1cbd15f283b314c72259f183f7788e4e214
2024-04-27 10:56:58 +02:00
Matěj Suchánek 68ff668543 Add new variable for last edit time
Bug: T269769
Change-Id: Ia41ecc2f8e6921ef3d5a16fec58202d584ad0727
2024-04-10 23:12:45 +00:00
libraryupgrader a8c9fab2cc build: Updating mediawiki/mediawiki-codesniffer to 43.0.0
The following sniffs are failing and were disabled:
* MediaWiki.Commenting.FunctionComment.MissingDocumentationPublic

Change-Id: I6075c76d53a899aac56af027f9a956a6b9e6a667
2024-03-16 18:53:05 +00:00
Dreamy Jazz 85022190e5 Add user_type variable
Why:
* An AbuseFilter variable is needed that allows filters to determine
  what type the current user is. That is, whether the user is an
  IP address, temporary account, named user or external user.
* Currently filters implement this by inspecting the value in
  the 'user_name' variable, but this is likely to break when
  temporary accounts are enabled as IPs would be hidden.
* Giving a dedicated variable that indicates the type of the user
  allows filters to work out this information without having to
  know the specific username of the user before performing the
  check.

What:
* Add the 'user_type' variable which is lazily computed. It can have
  the value 'named', 'temp', 'ip' or 'external' depending on the
  type of the user. If the user does not match any of these, then
  the value is 'unknown'.
* Replace call to deprecated User::newFromIdentity with a use of the
  UserFactory service that is dependency injected.
* Add and update tests to ensure consistent test coverage.

Bug: T357615
Change-Id: Ifffa891879e7e49d2430a0330116b34c5a03049d
2024-03-07 10:38:25 +00:00
Umherirrender bd84a6514c Use namespaced classes
This requires 1.42 for some new names

Changes to the use statements done automatically via script
Addition of missing use statements and changes to docs done manually

Change-Id: Ic1e2c9a0c891382744e4792bba1effece48e53f3
2023-12-10 23:03:12 +01:00
jenkins-bot 1f1c5e477b Merge "When testing against a page creation in RC, set page_id to 0 as in the real filtering" 2023-09-11 09:28:33 +00:00
Matěj Suchánek 9beeca3752 Fix various typos and documentation issues
Change-Id: I1e9d297f665282d251343598e102e1d342488965
2023-09-04 12:55:17 +02:00
Umherirrender cd7e9d31a7 Use namespaced Title
Bug: T321681
Change-Id: I66fd9b70a5de06ac3c81bdf6a2a5bca64ed094c2
2023-08-19 19:49:36 +02:00
jenkins-bot ad37cd8725 Merge "Get parsed content from PreparedUpdate" 2023-07-18 14:15:40 +00:00
Matěj Suchánek 1e93f8b674 Get parsed content from PreparedUpdate
This finally makes new_html work for non-wikitext contents.

Bug: T264104
Change-Id: I1174b63a8e3a96e83ee7472dd086bfc91636316c
2023-07-16 14:48:30 +02:00
Matěj Suchánek 49edc86a78 Split VariableGenerator::addEditVars
This method actually consists of two: add derived vars, and initialize
content vars. The former part depends on no parameters of this method.
On the other hand, the latter part combines multiple implementations
for some of the content variables using branching.

The branching is a dirty workaround and inferior to the GRASP principle:
"When related alternatives or behaviors vary by type, assign
responsibility for the behavior to the types for which the behavior
varies."
In other words, the callers (extensions) should be responsible for
choosing the initialization strategy themselves, instead of letting
VariableGenerator figure it out.

As the first step, split the former part to a separate method.
For now, it will be implicitly called by ::addEditVars.

Change-Id: I5ff00dbdbf29ec54eabfd95c44a4fd7f713969f5
2023-07-05 14:58:32 +00:00
thiemowmde 24888bea15 Mark protected stuff in classes with no subclasses as private
Protected effectively means "public to subclasses" and should be
avoided for the same reasons as marking everything as public should
be avoided.

Change-Id: Iba674b486ce53fd1f94f70163d47824e969abb77
2023-06-23 12:28:06 +02:00
Amir Sarabadani 191e719a79 Fix cases of LogicException in $update->getParserOutputForMetaData()
Abuse filter needs to check both if the update is available and if the
page is rendered. This is the exact issue FlaggedRevs have:
050b9593fb/backend/FlaggedRevs.php (L718)

Bug: T339094
Change-Id: I943c8dbb525dc4c988e97e180474ea71b4cf731d
2023-06-14 13:35:16 +02:00
Matěj Suchánek 8fb53edfbb Retrieve external links from PreparedUpdate
When forFilter is true and PreparedUpdate is available
(most save operations), retrieve all_links from
PreparedUpdate::getParserOutputForMetaData. Otherwise
do what was done before.

Note that this change probably leaves some dead code. It will be dealt
with later.

NOTE: this changes code potentially executed on every save operation.

Bug: T65632
Bug: T264104
Change-Id: I3628a56e5277846c1b90444fb55983870eb54c1e
2023-06-13 14:30:06 +02:00
Matěj Suchánek d82a716ad0 Make old_links retrieval cleaner
The method for old_links retrieval depends on the "forFilter"
value, which we know in advance. If it's true, old_links should
be retrieved from the database. Make a case in the switch
that does nothing but retrieves links from the database,
and direct the evaluation to it.

This change was split from I3628a56e5 to make its review easier.

NOTE: this changes code potentially executed on every save operation.

Change-Id: I33b688f6be3c58beec403f7bf26407a42e7c18ab
2023-06-13 14:03:21 +02:00
Daimona Eaytoy caee78c24d Replace deprecated MWException
These are all unchecked.

Bug: T328220
Change-Id: I8d2f098a8b634d4a226b40ddaef31f0303a0789f
2023-06-07 17:41:20 +02:00
Jean-Luc Hassec 9369d08773 When testing against a page creation in RC, set page_id to 0 as in the real filtering
Bug: T334617
Change-Id: Id4465cb36131b745d386168e7e158b6bb4d6418c
2023-04-13 08:55:35 +00:00
jenkins-bot 1ff0e96e38 Merge "Replace VariableHolder::$forFilter" 2023-01-05 21:23:24 +00:00
Matěj Suchánek 3e0d1b0d38 Set old_content_model & new_content_model for past changes
We might consider adding an in-process cache because there
will be a duplicate database lookup for content model and
wikitext of the same revision.

Bug: T230295
Change-Id: I9723f21069e03a49fa7131bd8f79c6e7e442104b
2022-12-18 16:01:45 +00:00
Matěj Suchánek dc59cad0a5 Replace VariableHolder::$forFilter
Each generator knows in which situation it is executed, and it
can pass this information to the computer. VariableHolder should
just hold the variables.

Change-Id: I0fb2e01e3e9457cd63948afe2a20439a1c800790
2022-12-02 08:10:15 +01:00
Umherirrender da4bc8643a Use UserIdentity in VariableGenerator::addEditVars
Change-Id: If0a65d7a86de776e6499d43949bfb217f20d9b07
2022-07-29 12:55:52 +02:00
Matěj Suchánek 4beca85154 Compute user and page age relative to recent change timestamp
These are apparently the only two variables for which we can
quickly determine their value in such simple way.

Later, we can also try it for recent contributions.

Bug: T102944
Change-Id: Iecfa9e5c5ba8c078691334b676cc6f289790cb74
2022-06-28 20:53:33 +00:00
Daimona Eaytoy f33bc5868c Set the 'timestamp' var in addGenericVars
This was most definitely my intention when I introduced the concept of
"generic vars", so it's a bit surprising to discover, 3.5 years later,
that the timestamp isn't computed there.

Also make the timestamp always be a string for consistency, since that's
the type documented on mw.org. I've manually checked all filters on
Wikimedia wikis using the timestamp variable, and added explicit int
casts where needed (although I think they'd still work due to implicit
casts).

Change-Id: Ib6e15225dd95c2eead7e48c200d203d6918e0c18
2022-06-26 14:49:40 +02:00
Matěj Suchánek 40c5712311 Pass RecentChange to addGenericVars
Change-Id: Ia843c23b46ecf8e9fe987a538a09e69f845f4488
2022-06-25 13:03:03 +02:00
Matěj Suchánek 58dfab4aeb Fix check for null Content in getEditTextForFiltering
The check was not consistent and the code could still crash
when $oldContent was null. RevisionRecord:getContent only
returns null when audience check fails, but we don't ask
for that.

Change-Id: Id64646a6762167f552e104f623130bedc6b2dd18
2022-04-03 13:06:24 +02:00
Matěj Suchánek 0af21948fc Replace WikiPage::factory in non-test code
Change-Id: I1442ca6603ce5151b98fc88cd84c25af0f34e4f6
2021-09-01 04:55:25 +00:00
Daimona Eaytoy 86257d825c tests: Use DBConnRef, not IDatabase, as retval of getConnectionRef
So that the method can be typehinted in core.

Also add phan-var to fix broken master build due to typehint additions
in core.

Change-Id: I4a072e00ffeeb437753fc3d3c1f15de9929df510
2021-08-31 21:45:10 +02:00
Daimona Eaytoy e9795468c4 Switch filterable actions hooks to the new system
Bug: T261067
Bug: T211680
Change-Id: I0e7e4a48b56c3e5fde56f50693fd0cdc19c30dd0
2021-08-16 14:18:56 +00:00
Lucas Werkmeister a2e42d5050 Don’t generate current content text twice
Previously, for non-newly-created pages, AbuseFilter would get the text
for filtering twice: once in AbuseFilterHooks::filterEdit(), and then
again in RunVariableGenerator::getEditTextForFiltering(). (Plus another
call for the text of the previous revision.) The first copy of the text
is only passed into RunVariableGenerator::getEditVars(), and there only
used if the title doesn’t exist, otherwise it’s overwritten with the
second copy. Instead, let’s make AbuseFilterHooks not get the text at
all, and only get the text from the content when we actually need it
(the content is new).

Change-Id: Id12430fa6ba4643113b945e0d0c01b9c0ee1742f
2021-07-22 13:45:32 +02:00
libraryupgrader 5377ebe819 build: Updating dependencies
composer:
* mediawiki/mediawiki-codesniffer: 36.0.0 → 37.0.0

npm:
* postcss: 7.0.35 → 7.0.36
  * https://npmjs.com/advisories/1693 (CVE-2021-23368)

Change-Id: I2b382f3bb236fb44eb24c6a257b13b8fd886541c
2021-07-21 18:51:18 +00:00
Vadim Kovalenko 85be3c57bc Replace RecentChange::getPerformer with RecentChange::getPerformerIdentity
Bug: T276412
Change-Id: I8a55bd5cb17bbc259ec36c40261058e0b46ee4a6
2021-03-15 16:57:40 +02:00
Daimona Eaytoy c5d19577a4 Fix method names of hook interfaces
The hook names contain a dash, which is mapped to an underscore by the
hook runner (see Ie8c8fb603b33ff95c8f8d52f392227f147c528d8), and the
previous method names weren't matching this.

Follow-up: Ic5c82a367e34135bbc0f00ece5aeef4f2d92881b

Change-Id: Ie80b62c49b2f4aaea49d5a1883f513348689d16a
2021-03-09 17:03:14 +00:00
Daimona Eaytoy db09ad81e0 Add a hook to allow computing variables from different types of RC rows
Bug: T115128
Change-Id: Ia6de35b70f491591ea6eb699106ba97c94510091
2021-02-01 14:57:10 +00:00
Daimona Eaytoy ed49f86b74 Make User::get* calls explicit in LazyVariableComputer
With explicit calls it's easier to see what method is being used,
whether it's deprecated, etc. Some methods here are in fact deprecated
or already have a proper replacement, but this is left for a follow-up.

Change-Id: Iee3154855f86c76aab98e7c14250c14e8b9ee939
2021-01-17 00:35:40 +00:00
Daimona Eaytoy 66928eda89 Remove deprecated param
Depends-On: Ie6abd2df5cf1b09c35f0a9e53b0d559e887de09b
Depends-On: Id99b94806095a974a65dd892b2200e59c475802f
Change-Id: I9a33cadb9903461038aa1095be18b68a60dd726d
2021-01-14 23:43:51 +01:00
Daimona Eaytoy 9afc968523 Refactor VariableGenerator and LazyVariableComputer tests
Additionally, avoid building Title objects in LazyVariableComputer, it
just adds a dependency on TitleFactory and creating mocks is more
complicated, but it's pointless because the caller already has a Title
object.

And also stop using Title::getEarliestRevTime(), since the replacement
is easy (we already have a RevisionLookup).

Note for reviewers about renames:
- Code VariableGeneratorDBTest was moved to LazyVariableComputerDBTest,
  RCVariableGeneratorTest, and AbuseFilterVariableGeneratorTest
- AbuseFilterVariableGenerator test was moved into a dedicated
  directory, methods were changed not to test the var values

Change-Id: I3dff8739a9b79f33321d836449b082c3ce63f277
2021-01-09 11:26:24 +00:00
Daimona Eaytoy 4cc608e320 Make HookRunner parameter mandatory in VariableGenerator
Depends-On: Icae3c7cd00bd9be62a46f9e85c311e46157ccabf
Depends-On: Ie5031528593ea28f3cdc3169336aa0e4337306f7
Depends-On: I7ff2f90d890a74fa14f40535c4a567fb3124920e
Change-Id: I4d629e26e31517aa06e6215499cc3422f0fe6c72
2021-01-04 12:47:26 +01:00
Daimona Eaytoy 6081bf90c4 Introduce a VariableGeneratorFactory service
So we can use DI in all generators. Some improvements were deliberately
omitted, e.g. injecting more services and relaxing User/Title to
UserIdentity/LinkTarget, and they'll be included in a subsequent commit.

Depends-On: I1f351071ef2b0b7c80e91407a9c3bb17be293044
Depends-On: Ie71740fac35a86f8fe03023080ae8ca08671243d
Depends-On: I589a0e1c2c5891070ab82cd5adfd9cedec19e67d
Change-Id: I92ef0abd5e45b672e6f297a71b3c2c345d56f136
2021-01-03 14:17:39 +01:00
Daimona Eaytoy 1beb405bc7 Make VariableHolder param optional in VariableGenerator
So we can remove it from callers and then use DI in this class.

Change-Id: I9055f54397279870740a7ff9567635ee4f17e4d2
2021-01-03 13:25:59 +01:00
Daimona Eaytoy 6e27a9ddb3 Cleanup variables-related classes
Change-Id: I20a7fe1a40255043ed0d125dee61ea6052dda69c
2021-01-02 18:19:38 +01:00
Daimona Eaytoy 762d71c51d Create a dedicated namespace for variables-related classes
Some cleanup is left for later to keep the diff easier to read.

Change-Id: Ife445b5e47e707ab77ec867ac3b005866aa74ef2
2021-01-02 18:16:48 +01:00
Daimona Eaytoy fad9a11f7a Add a TextExtractor service
This is an important step towards removing the AbuseFilter class. Note:
proposals for the name of the new service are welcome.

Change-Id: Ib4632173f728b1bdafadef96e01645a833bfceaa
2021-01-01 18:25:32 +01:00
Daimona Eaytoy c7f06750d6 Add a LazyVariableComputer service
See task for a description of the plan. Also note that
AFComputedVariable should be renamed and its properties made private.

This commit includes some adjustments for taint-check in
AbuseFilter::buildVarDumpTable and ::revisionToString.

There's some space for improvement in the new LazyVariableComputer, but
that's left for another commit.

Bug: T261069
Change-Id: Ia44f6e079d39f44cf0122dec5ddb5513ab54f0c6
2020-12-31 14:05:52 +01:00
James D. Forrester 7109c954a2 Use User->isRegistered(), not deprecated isLoggedIn()
Bug: T270450
Change-Id: I6ebf2f8040b6ac53025b5ccf503e5e221341eb09
2020-12-17 18:37:14 -08:00
Daimona Eaytoy f41fe76df7 Kill $wgUser
Depends-On: Iadbce7501e42971901f6d9efcb2810ae42be51d8
Depends-On: I624610cb2372db200995c8d01d62b1d74efca19e
Change-Id: I51d99c30fbc0e87c038013bf5b8c27b1c735e977
2020-12-08 23:23:13 +00:00
Daimona Eaytoy ca3f652cd7 Almost kill the last use of wgUser
This is the last use, and it was a bit harder to remove because it was
buried inside AFComputedVariable. Starting with
I4444cada720ab62d187f2dd0c4760697e465f2ff, we can freely change the
parameters to AFComputedVariable without breaking old log entries.

Note, we still need a fallback for other extensions calling this
method...

Bug: T246733
Depends-On: I4444cada720ab62d187f2dd0c4760697e465f2ff
Change-Id: I5d786a518ef88fad9c8d9c25ef4553a0bf30b2b2
2020-12-08 23:28:24 +01:00
Daimona Eaytoy 1c625eeae4 Drop back-compat code
This should be merged once T246539 is done.

Bug: T213006
Change-Id: I4444cada720ab62d187f2dd0c4760697e465f2ff
2020-12-08 17:15:47 +00:00