Commit graph

56 commits

Author SHA1 Message Date
Isabelle Hurbain-Palatin 3ad70ec182 Replace uses of deprecated ParserOutput::getText()
Bug: T293512
Change-Id: Idf94eb65e0cf915792442458abba618c5feef86b
2024-11-21 14:15:38 +01:00
Umherirrender 57ecef75c5 Use namespaced classes
Changes to the use statements done automatically via script
Addition of missing use statement done manually

Change-Id: If80031678a474157e4cc78a3d3621dab53aded67
2024-10-19 21:55:40 +02:00
Umherirrender c3af3157b4 Use namespaced classes
Changes to the use statements done automatically via script
Addition of missing use statement done manually

Change-Id: I48fcc02c61d423c9c5111ae545634fdc5c5cc710
2024-06-12 20:01:35 +02:00
STran fe0b1cb9e9 Add user_unnamed_ip variable
After temporary accounts are enabled, filters that rely on an ip
in the `user_name` will fail (eg. `ip_in_range` and `ip_in_ranges`).
To keep these filters working:

- Expose the IP through another variable, `user_unnamed_ip`, that can be
  used instead of `user_name`.
- The variable is scoped to only reveal the IPs of temporary accounts
  and un-logged in users.
- Wikis that don't have temporary accounts enabled will be able to see
  this variable but it won't provide information that `user_name`
  wasn't already providing
- Introduce the concept of transforming variable values before writing
  to the blob store and after retrieval, as IPs need to be deleted from
  the logs eventually and can't be stored as-is in the amend-only blob
  store

Bug: T357772
Change-Id: I8c11e06ccb9e78b9a991e033fe43f5dded8f7bb2
2024-05-23 07:19:48 -07:00
jenkins-bot a464cb7453 Merge "Simplify computation of derived links variables" 2024-05-03 07:22:40 +00:00
Umherirrender 6dccb17255 Migrate to IReadableDatabase::newSelectQueryBuilder
Also use expression builder to avoid raw sql

Bug: T312420
Change-Id: I83eb39f1c65a698108ae5bb72f633afda37a9f23
2024-04-30 20:45:51 +02:00
Matěj Suchánek 6526a5494b Simplify computation of derived links variables
Instead of having separate methods for each variable,
have one method which can work not only with "_links",
but with any array of strings.

Change-Id: I05f1b1cbd15f283b314c72259f183f7788e4e214
2024-04-27 10:56:58 +02:00
jenkins-bot 098ff0b880 Merge "Use modern str_starts_with() and [ ... ] syntax" 2024-04-12 09:32:53 +00:00
thiemowmde c9f8343173 Use modern str_starts_with() and [ ... ] syntax
Change-Id: I2f2182e1e0850d7ebf832b7b8e0630ce56aad88b
2024-04-11 14:07:43 +02:00
Matěj Suchánek 68ff668543 Add new variable for last edit time
Bug: T269769
Change-Id: Ia41ecc2f8e6921ef3d5a16fec58202d584ad0727
2024-04-10 23:12:45 +00:00
Dreamy Jazz 85022190e5 Add user_type variable
Why:
* An AbuseFilter variable is needed that allows filters to determine
  what type the current user is. That is, whether the user is an
  IP address, temporary account, named user or external user.
* Currently filters implement this by inspecting the value in
  the 'user_name' variable, but this is likely to break when
  temporary accounts are enabled as IPs would be hidden.
* Giving a dedicated variable that indicates the type of the user
  allows filters to work out this information without having to
  know the specific username of the user before performing the
  check.

What:
* Add the 'user_type' variable which is lazily computed. It can have
  the value 'named', 'temp', 'ip' or 'external' depending on the
  type of the user. If the user does not match any of these, then
  the value is 'unknown'.
* Replace call to deprecated User::newFromIdentity with a use of the
  UserFactory service that is dependency injected.
* Add and update tests to ensure consistent test coverage.

Bug: T357615
Change-Id: Ifffa891879e7e49d2430a0330116b34c5a03049d
2024-03-07 10:38:25 +00:00
Umherirrender bd84a6514c Use namespaced classes
This requires 1.42 for some new names

Changes to the use statements done automatically via script
Addition of missing use statements and changes to docs done manually

Change-Id: Ic1e2c9a0c891382744e4792bba1effece48e53f3
2023-12-10 23:03:12 +01:00
Umherirrender cd7e9d31a7 Use namespaced Title
Bug: T321681
Change-Id: I66fd9b70a5de06ac3c81bdf6a2a5bca64ed094c2
2023-08-19 19:49:36 +02:00
Umherirrender c72b6a20f0 Pass ParserFactory to LazyVariableComputer
Make the init of Parser lazy

Bug: T343070
Change-Id: If0f0ca3c4aa2136c85903289f7f80b95dc5132c8
2023-07-29 14:20:07 +02:00
Matěj Suchánek 1e93f8b674 Get parsed content from PreparedUpdate
This finally makes new_html work for non-wikitext contents.

Bug: T264104
Change-Id: I1174b63a8e3a96e83ee7472dd086bfc91636316c
2023-07-16 14:48:30 +02:00
Tim Starling fe592746b7 Use the new Wikimedia\Diff namespace
Bug: T339184
Change-Id: I381686678524868c85466bdafde3856a73a8cb1c
2023-06-29 11:56:13 +10:00
Matěj Suchánek 8fb53edfbb Retrieve external links from PreparedUpdate
When forFilter is true and PreparedUpdate is available
(most save operations), retrieve all_links from
PreparedUpdate::getParserOutputForMetaData. Otherwise
do what was done before.

Note that this change probably leaves some dead code. It will be dealt
with later.

NOTE: this changes code potentially executed on every save operation.

Bug: T65632
Bug: T264104
Change-Id: I3628a56e5277846c1b90444fb55983870eb54c1e
2023-06-13 14:30:06 +02:00
Matěj Suchánek d82a716ad0 Make old_links retrieval cleaner
The method for old_links retrieval depends on the "forFilter"
value, which we know in advance. If it's true, old_links should
be retrieved from the database. Make a case in the switch
that does nothing but retrieves links from the database,
and direct the evaluation to it.

This change was split from I3628a56e5 to make its review easier.

NOTE: this changes code potentially executed on every save operation.

Change-Id: I33b688f6be3c58beec403f7bf26407a42e7c18ab
2023-06-13 14:03:21 +02:00
thiemowmde 84058c3d96 Make use of the ??= operator and such where it makes sense
We can avoid a bit of code duplication and move code closer together
when it belongs together.

Change-Id: Iffca7e4abfbf03d4663ee909220057bcbd54da75
2023-06-12 10:27:03 +02:00
Daimona Eaytoy caee78c24d Replace deprecated MWException
These are all unchecked.

Bug: T328220
Change-Id: I8d2f098a8b634d4a226b40ddaef31f0303a0789f
2023-06-07 17:41:20 +02:00
Amir Sarabadani e9bec9ffa2 Improve support for read-new wikis with externallinks
Bug: T337149
Change-Id: I68e72243346725fa78281c78dbd6b4cab0b7cbca
2023-05-26 15:47:06 +02:00
Amir Sarabadani 66f79695d4 Use core's externallinks lookup
Depends-On: I8ae9ef388957b0c04efa281f3bc3b5796bec17fe
Bug: T326251
Change-Id: I34b4a151f23f834b695b0abba2982681b79f68e7
2023-04-24 15:12:41 +02:00
Matěj Suchánek 8f6a428f02 Replace deprecated database object access methods
Use the very new getPrimaryDatabase and getReplicaDatabase.
We skip FilterLookup and CentralDBManager in this patch.

Change-Id: I22c6f8fa60be90599ee177a4ac4a97e1547f79be
2023-03-08 16:50:56 +01:00
jenkins-bot 1ff0e96e38 Merge "Replace VariableHolder::$forFilter" 2023-01-05 21:23:24 +00:00
Matěj Suchánek 3e0d1b0d38 Set old_content_model & new_content_model for past changes
We might consider adding an in-process cache because there
will be a duplicate database lookup for content model and
wikitext of the same revision.

Bug: T230295
Change-Id: I9723f21069e03a49fa7131bd8f79c6e7e442104b
2022-12-18 16:01:45 +00:00
Matěj Suchánek dc59cad0a5 Replace VariableHolder::$forFilter
Each generator knows in which situation it is executed, and it
can pass this information to the computer. VariableHolder should
just hold the variables.

Change-Id: I0fb2e01e3e9457cd63948afe2a20439a1c800790
2022-12-02 08:10:15 +01:00
Umherirrender 9c3fc24f85 Use DISTINCT on LazyVariableComputer::getLinksFromDB
A protocol-relative URL has two entries for el_to in externallinks table,
the different is on the el_index colum

Bug: T314373
Change-Id: I3d6229aaa10a089baf15d5ba3407f6a8870429e3
2022-08-02 11:27:31 +00:00
Umherirrender da4bc8643a Use UserIdentity in VariableGenerator::addEditVars
Change-Id: If0a65d7a86de776e6499d43949bfb217f20d9b07
2022-07-29 12:55:52 +02:00
jenkins-bot 1a6985469b Merge "Inline/simplify smaller pieces of duplicate/complex PHP code" 2022-06-03 20:38:22 +00:00
Thiemo Kreuz bbded6231c Inline/simplify smaller pieces of duplicate/complex PHP code
Change-Id: I59d0f17b77c8c3d47bc532bdefd9d8c0883f180b
2022-06-03 21:04:38 +02:00
Matěj Suchánek d4b15cb7ee Optimize loop in 'diff-split' case
The "substr( $line, 0, 1 )" expression has already assumed
the prefix has length 1. Therefore, it's pointless
to compute its length later. The assumption does hold,
the only two prefixes the code works with are '+' and '-'.

Not changing the check to use str_starts_with now, because
it was suggested in I113a8d052b6845852c15969a2f0e6fbbe3e9f8d9
that this shouldn't be done for performance-sensitive code
at least until we are on PHP 8.

Change-Id: I00cb2fc50ed534bb2bbef3ee1e5f6f466afeeb27
2022-05-21 18:07:21 +00:00
Umherirrender 89df7dfddb Remove index detection 'rev_page_timestamp'
Rename in 1.37

Change-Id: Ia9a7682f9f9751de3071b0a644d945dbbd3ed824
2022-04-22 19:26:36 +02:00
Matěj Suchánek 686d7ea88c Use RestrictionStore instead of deprecated method
Also restructure the unit test a bit.

Change-Id: If5ce26f1bc4efdb29653aed3fc47335dddc1e44c
2022-03-29 16:11:55 +02:00
Umherirrender 533e3dc5da Use new namespace for MediaWiki\Revision\RevisionLookup
MediaWiki\Storage is alias since 1.35

Change-Id: I1688cb27847b9154c5133b157ac9c18bd4859a47
2022-02-26 20:39:01 +01:00
jenkins-bot ac0ed20e4f Merge "Improve debug messages of loading ext. links" 2022-02-26 14:26:13 +00:00
Matěj Suchánek 2694751355 Don't implode and explode links
old_links and all_links are an array. Casting
them to string and then splitting by newlines
is a no-op.

Change-Id: I05c69f14e981ac2842032e7db888f4841d6b48b7
2022-01-24 12:58:56 +00:00
Matěj Suchánek 1d31c86ee4 Improve debug messages of loading ext. links
These are not necessarily old links, the new links
can also be retrieved using this code path.
Also print debug messages before the code execution.

Change-Id: I1a85bb7b5a2af4fe514625d2236cf92f15daf304
2021-12-19 14:19:16 +01:00
Daimona Eaytoy 4344d4e438 Update docs after PP limit report core change
The report is now generated in ParserOutput, not Parser, meaning we can
simply avoid passing the `enableLimitReport` option (off by default) if
we don't want the report to be there.

Depends-On: I154c0a77a5b0287b5572614d56339fb57ac56c33
Change-Id: I8cdab35c475f10433234ddb55b5e6a0cc8109498
2021-11-09 13:33:42 +00:00
Matěj Suchánek 83794d7cb4 Clean up Throttle::throttleIdentifier
In 1.37, UserEditTracker was changed to allow anonymous users
as well.

Change-Id: I70d9e6db13416b7c017319ecac3e7e604aacd586
2021-07-22 16:56:12 +02:00
DannyS712 47f861b6f6 Pass a user to WikiPage::prepareContentForEdit()
Bug: T285447
Change-Id: I4d277419106c3af5222377a863c80dd866ba188b
2021-06-24 04:01:33 +00:00
Daimona Eaytoy 57f11631ba Pass a valid regexp to preg_match in checkRegexMatchesEmpty
Bug: T283966
Change-Id: I99688aa8f3e62e410392a9142df56b1a3c708987
2021-05-29 11:38:07 +00:00
Ammarpad 6a799ec9c5 Check forcing of page_timestamp revision index
Bug: T270033
Change-Id: I16fc273b14e7f4b00e8c31ec1ed7712149aafe37
2021-04-30 13:06:43 +01:00
DannyS712 db8d373a87 LazyVariableComputer: update parseNonEditWikitext documentation
Article::prepareContentForEdit is deprecated and being removed,
refer to WikiPage::prepareContentForEdit instead

Plus remove an extra line

Change-Id: Ie4438c710639a16557816b53510ce230d15d641c
2021-03-24 17:32:31 +00:00
Daimona Eaytoy 8b81df4d16 Fix fatal when computing user_editcount for anons
UserEditTracker checks that the user is not anonymous, whereas
User::getEditCount() would just return null. This was not spotted by
tests because UserEditTracker is mocked.

Bug: T277859
Follow-up: I8a55bd5cb17bbc259ec36c40261058e0b46ee4a6
Change-Id: I05fb6cc780c80b72b3278e6dc670ed2025628ffb
2021-03-19 13:09:03 +01:00
Vadim Kovalenko 85be3c57bc Replace RecentChange::getPerformer with RecentChange::getPerformerIdentity
Bug: T276412
Change-Id: I8a55bd5cb17bbc259ec36c40261058e0b46ee4a6
2021-03-15 16:57:40 +02:00
jenkins-bot 577aa83309 Merge "SECURITY: Avoid deleted usernames leak in page_recent_contributors" 2021-03-09 22:50:20 +00:00
Daimona Eaytoy f25c96f472 SECURITY: Avoid deleted usernames leak in page_recent_contributors
Bug: T71367
Change-Id: I8d5ed9ca84282ee50832035af86123633fc88293
2021-03-09 15:56:09 -06:00
Daimona Eaytoy c5d19577a4 Fix method names of hook interfaces
The hook names contain a dash, which is mapped to an underscore by the
hook runner (see Ie8c8fb603b33ff95c8f8d52f392227f147c528d8), and the
previous method names weren't matching this.

Follow-up: Ic5c82a367e34135bbc0f00ece5aeef4f2d92881b

Change-Id: Ie80b62c49b2f4aaea49d5a1883f513348689d16a
2021-03-09 17:03:14 +00:00
Daimona Eaytoy e64049c30b Create dedicated types of parser exceptions
Introduce a clear distinction between internal exceptions and
user-visible exceptions, leaving AFPException as base abstract class.

Later, it should be possible to narrow some types around, e.g. in
ParserStatus (that might work with user-visible exceptions only).

Also a future TODO is putting all the exceptions in their own namespace
(probably ...\Parser\Exception).

Change-Id: I4e33a45117f0a3e73af03cc1e3f2734beaf2b5e1
2021-02-12 13:56:02 +00:00
Daimona Eaytoy 28bd23f38d Skip regexp validation if the regex is (partly) unknown
Bug: T273809
Change-Id: Ib8ab29ad69088baf5b826d9cdada0ded29a58871
2021-02-04 15:16:22 +00:00