Commit graph

282 commits

Author SHA1 Message Date
jenkins-bot 08c0a3f482 Merge "ViewEdit: add af_id to the row" 2020-02-26 16:53:35 +00:00
jenkins-bot b28e9e8c1f Merge "Start using new format for var dumps" 2020-02-26 16:51:02 +00:00
jenkins-bot 22adab7159 Merge "Stop using the Revision class" 2020-02-26 16:47:37 +00:00
Daimona Eaytoy f454e60e83 Start using new format for var dumps
Migrating old log entries is I22cf698c5be77506727cbd227c67e037a5d89b5c.

Bug: T213006
Change-Id: I3242acd5c5163a941f584d6119e3ad3b3cad8c29
2020-02-26 16:03:38 +00:00
Daimona Eaytoy 518c176754 Stop using the Revision class
Change-Id: Ie257c9b1ea94dcadce59f4541d5947465262bd75
2020-02-26 15:39:12 +00:00
Daimona Eaytoy fe28aff82a ViewEdit: add af_id to the row
A PHP notice is sporadically emitted in production, e.g. reqId XlVEMgpAMNAAA6zMVhQAAACV

Change-Id: Ie42d00c6520aa31daf127c5df9515a3ab01d986f
2020-02-26 15:27:54 +00:00
Daimona Eaytoy b9a1e86245 Remove old number syntax
Bug: T212730
Change-Id: I7573da1683efc83b5002b8948c97dd7f6658a488
2020-02-25 23:38:19 +00:00
Daimona Eaytoy 1bac110205 Remove dependency on $wgRestrictionTypes
This was used to dynamically generate *_restriction_* variables.
However, it had two big problems:
 - We only have i18n for 'create', 'move', 'edit', and 'upload' (the
 default value of the global); other restrictions would show missing
 messages in various pages.
 - We had to access the global state in various points.

This change also makes some code in AbuseFilterVariableHolder simpler,
and also allows us to make AbuseFilterTest a unit test.

Change-Id: I321ad6e07f8243200af67a581b6e485970efd3ce
2020-02-25 23:17:54 +00:00
Daimona Eaytoy bef72c7dc3 tests: Use ChangeTags::getTags instead of hardcoded queries
This is the "clean alternative" mentioned in the method comment!

Change-Id: Ieb87a1f512c930c2e33e721ba792986bc198e414
2020-02-22 13:54:23 +00:00
jenkins-bot 76a1be97a4 Merge "Add site name and language variables" 2020-02-10 19:06:01 +00:00
Daimona Eaytoy 0d2cab0deb Validate imported data
At the moment there's no validation for import data, so it's totally
possible to insert rubbish in the field, and the code will produce other
rubbish. For instance, it's not so uncommon to see lots of PHP notices
on logstash for ViewEdit code trying to access members of the imported
data as if it were an object.

Change-Id: If9d783f0f9242d3d1bc297572471e62f51ee0e40
2020-02-10 18:41:36 +00:00
Daimona Eaytoy d9ae71f578 Add site name and language variables
In T43172 it was told that adding the site name could increase the risk of
attracting more spam, but I don't see how this variable could cause that.

Bug: T240948
Bug: T97933
Change-Id: I1d2aeabaf008ac06798b8d7e4af7d61ae1702776
2020-02-09 14:32:02 +01:00
jenkins-bot 391bbee53c Merge "Fix some edge cases in ViewEdit" 2020-02-08 16:36:19 +00:00
jenkins-bot c0b58d7699 Merge "Factor out variables-related methods" 2020-02-08 14:42:13 +00:00
Daimona Eaytoy 0834f37e42 Fix some edge cases in ViewEdit
Follow-up Iabd0ae5b18571f8cad44ef2d86bcf2519e7f95ba.

This patch:
 - Moves some save-related code to a separate method
 - Reduces conditionals nesting
 - Fixes an edge case where the content of the form would be
 wiped in case the token didn't match.
 - Adds another (basic) selenium test
 - Standardizes return types
 - Moves data load outside of buildFilterEditor

Change-Id: I89444b59f04c495c9ab59244151c8ed5d38cf0fe
2020-02-08 15:35:46 +01:00
jenkins-bot 430058f2c0 Merge "Avoid keeping superfluous row properties" 2020-02-07 21:26:34 +00:00
jenkins-bot 02cd866f53 Merge "Refactor data load in ViewEdit" 2020-02-07 21:26:32 +00:00
Daimona Eaytoy 3f83e57ad7 Factor out variables-related methods
This is another step needed to reduce the size of the gigantic
AbuseFilter and AbuseFilterHooks classes. It also makes many methods
non-static, for more testability.

Note, this layout is still not final. We should somehow merge the
functionality of VariableGenerator and AFComputedVariable, for which
I already have plans.

Change-Id: I366d598b69ad866496b7cb0059e0835c02e54041
2020-02-07 20:27:26 +00:00
Daimona Eaytoy 1686042a91 Move variable generators to new classes
RunVariableGenerator is for generating variables based on the current
action;
RowVariableGenerator is for RC entries;
VariableGenerator is the generic one.

This patch only moves the methods to the new classes, to keep the diff
easier to read, and facilitate conflict resolution. These classes will
then be revamped in I366d598b69ad866496b7cb0059e0835c02e54041.

Note that these classes are now namespaced.

One method, AbuseFilter::getEditVars, was renamed to
AbuseFilterVariableGenerator::generateEditVars, because it would
otherwise conflict with an incompatible method in RunVariableGenerator.

Change-Id: Iff412e5492873d4fae55402939a51609e64d55a8
2020-02-07 19:44:31 +00:00
Daimona Eaytoy 472d1221bd tests: Increase and rebalance code coverage
Also fix a couple of broken tests in Consequences:
 - For createaccount, $user->addToDatabase must be called before
   testForAccountCreation, or it will throw a CannotCreateActorException.
 - In testThrottleLimit, also set wgAbuseFilterEmergencyDisableThreshold
   to avoid relying on the local config.

Bug: T201193
Change-Id: If1a50b0a729e4d554485f2e2225d5877510966b6
2020-02-07 18:32:17 +00:00
Daimona Eaytoy 102789f62a Avoid keeping superfluous row properties
Most of them are overwritten either in ViewEdit::loadRequest or
AbuseFilter::saveFilter. af_hit_count and af_throttled are actually
relevant for the old version, so list them explicitly. And also add
default af_group and af_global, which are later read, for import action.

Depends-On: Iabd0ae5b18571f8cad44ef2d86bcf2519e7f95ba
Change-Id: Ie9aae938cca06e38a7a834a3f74f3e8735ab01ee
2020-01-23 12:50:03 +00:00
Daimona Eaytoy 53b9f38888 Refactor data load in ViewEdit
Instead of having a single loadRequest method (which could end up
loading from the DB...), split it in a DB-only method and a request-only
one. Simplify the logic used to show the filter editor. Show the page
without changes or warnings if the user lost editing rights in the
meanwhile. Avoid two static properties, and pass them in when relevant
instead. Bonus: optimize a query to sort by afh_id instead of afh_timestamp to avoid filesort.

This will allow a subsequent patch to clean the $row object in
loadRequest.

Change-Id: Iabd0ae5b18571f8cad44ef2d86bcf2519e7f95ba
2020-01-21 14:15:41 +01:00
Daimona Eaytoy e9fe252def Fix remaining PHPCS issues
Mainly, add visibility modifiers on constants.

Change-Id: I41e8e2d691b2bad6ea6f244d54517d37d7783181
2020-01-21 12:36:37 +00:00
Max Semenik 8e7230076e Fix PHPUnit 8 warning
Bug: T192167
Change-Id: Ifbebbc3467eb0bf3f12cffc9e5601a1c94327bd9
2020-01-20 15:47:45 +00:00
Daimona Eaytoy 44ea3aa7f4 Fix generation of HTML vars, simplify tests
-new_html: also strip the "Transclusion limit" comment if present, and
anyway take it into account (as well as a "</div>"), which right now
prevent the PP limit report from being stripped as well.
-new_text: trim extra whitespace on the right, which is created when
stripping the aforementioned comments.

Also simplify the test for getEditVars, make it not blindly copy what
AFComputedVariable does.

Extra: kill a temporary variable.

These changes are partly taken from
I96785c6c5fdf381c21d5f8930ee12e706abb7f3f.

Change-Id: I2b4c84a3d9d0d17ce229088197b75781d5181b4f
2020-01-12 17:44:02 +00:00
jenkins-bot 8fea62529b Merge "Fix AbuseFilterCachingParser violating return type constraint" 2019-12-27 10:04:57 +00:00
jenkins-bot 5c9fe8bd9b Merge "Always evaluate the offset when retrieving array elements" 2019-12-27 09:58:50 +00:00
Daimona Eaytoy 8ad4ecd31d Always evaluate the offset when retrieving array elements
Even if the array is DUNDEFINED, we need to check the offset to ensure
that it's valid.

Bug: T237351
Change-Id: Ibfa360c4ae1d80abe14d9fdf66991b76cb5954df
2019-12-23 16:04:45 +00:00
jenkins-bot ce85c215f4 Merge "Ensure that a min/max arg count is available for all built-in functions" 2019-12-23 12:11:57 +00:00
Daimona Eaytoy b3e0529d55 Log deprecated vars in the cached phase in the new parser
For the new parser, xhgui shows that AbuseFilterParser::getVarValue is
taking up a lot of time; in turn, most of the time spent inside
getVarValue is used to log the use of deprecated variables. Hence, given
that:
 - We should keep the new parser performant
 - There are tons of deprecated variables out there and they likely
 won't be replaced
 - Having gazillions of debugLog entries doesn't help

log them only in the cached phase.

Bug: T234427
Change-Id: I2bfc692c829c3cbe889e5076f5205e2c99097087
2019-12-16 13:54:58 +01:00
Daimona Eaytoy f382304aae Add a base class for parser transition
Change-Id: I31282b8632c332b6d46a6bb4a42f57ac0d005b5f
2019-12-15 13:29:56 +00:00
Daimona Eaytoy d5ab147dcf Fix AbuseFilterCachingParser violating return type constraint
This is identical to I8a3c31e7385283d95b4712d457784016239a0b3b, except
for the array append case.

Bug: T236870
Change-Id: Iac033ba467232f6ff110d575920e968759ce0e15
2019-12-04 18:27:46 +00:00
Daimona Eaytoy 07572da2fe Really throw for too many params
Bug: T230803
Change-Id: I4e68bb7220f1151bb32b2be859f6cffc55888a30
2019-11-30 10:57:16 +00:00
Daimona Eaytoy 2ddd79fd98 Forbid assignments where the LHS is a built-in identifier
And not just a built-in variable.

Bug: T237130
Bug: T237216
Change-Id: Ie1d86dc324993efcb863be23697732e6aa1dac10
2019-11-28 14:40:38 +00:00
Daimona Eaytoy 0672a8eb8a Ensure that a min/max arg count is available for all built-in functions
This is especially useful for old patches, created before the
introduction of FUNC_ARG_COUNT, where a rebase may break the parser.

Change-Id: Ib142438626a7305f102dc3e4cc9cb07ad33902b8
2019-11-22 17:59:00 +01:00
jenkins-bot a8c50150d6 Merge "Convert static arrays to constants" 2019-11-22 13:39:39 +00:00
jenkins-bot 2d2e524dca Merge "Tokenizer: don't strip backslashes from \x" 2019-11-22 13:36:49 +00:00
Daimona Eaytoy b3e58067ac Set the utf-8 flag for var dumps in the text table
This is not retroactive; that will be handled as part of T213006.

Bug: T34478
Change-Id: I2c532da71719a9ace1279bbf67d6e6e30e9a986c
2019-11-16 16:00:45 +00:00
Daimona Eaytoy c03f0a3b08 Convert static arrays to constants
Beloved PHP7!

Change-Id: Id5170662f7c5ceacfc0ac8d90787f2c92fd93464
2019-11-16 16:32:36 +01:00
Daimona Eaytoy c73381b6db Tokenizer: don't strip backslashes from \x
Bug: T238475
Change-Id: I8c2ea6ad369946df93440eece60d456dc1a3fd7a
2019-11-16 16:21:39 +01:00
Daimona Eaytoy 98bcad25c3 Also parse numbers with the new syntax and hard-deprecate the old one
This will allow people to switch their filters to the new syntax. The
deprecation warning is now more exhaustive, and the info() warning is
kept to ensure that everything proceeds smoothly.
The regex v2 has also been fixed to:
 - Consume all the digits/letters on the right (*)
 - Have named groups
 - Be created dynamically with other constants

(*) The previous version of v2 could complete the match and leave
digits/letters on the right when encountering numbers with the old
syntax, hence dropping support too early. We also cannot use a word
boundary (\b) because that would prevent matching numbers with trailing
dots (e.g. "5.").

Bug: T212730
Change-Id: Ibf6ac571f6b5c09149d69a19c38240ce6b024dff
2019-11-12 11:52:38 +00:00
Daimona Eaytoy a77a59b962 Hard-deprecate empty operands
This bumps the level to WARN, and makes it very clear that people should
fix the affected filters. It also removes the calling method, which was
mostly meant for debugging purposes, and changes the type to 'op_type'
to avoid conflicting with type:mediawiki in logstash.

Bug: T156096
Change-Id: Ie73f1604e8ed82bc2e1be9fc90fa065be37889a3
2019-11-12 11:39:25 +00:00
jenkins-bot 91bc961712 Merge "Check for 0-like floats passed to the modulo operator" 2019-11-10 11:51:28 +00:00
Daimona Eaytoy c0f8374624 Check for 0-like floats passed to the modulo operator
That throws an error in PHP.

Bug: T237459
Change-Id: Ia0b29d6a8b9f4aac6b5b72ce8f2f45afb03f4c99
2019-11-10 11:22:04 +00:00
jenkins-bot 7ff4b95aec Merge "Expand the list of types that can be cast to int" 2019-11-10 11:00:36 +00:00
Daimona Eaytoy 585d6cdb24 Make to sure to report division by zero when the LHS is undefined
Bug: T234339
Change-Id: I1575ec013c1e7e321a8f13f40804ebc5ab076268
2019-11-08 14:08:52 +00:00
Daimona Eaytoy 1abaff1aac Better handling of keywords and functions
Always run the keyword/function handler, even if there are DUNDEFINED
arguments, so that the handler can perform further validation on the
input and report any error to the user. However, replace DUNDEFINED with
DNULL before running the handler, to avoid special-casing DUNDEFINED in
every handler. If any argument was a DUNDEFINED, we will return
DUNDEFINED anyway.

Also centralize the keyword handling logic to a new method, like it
happens for functions.

Bug: T234339
Change-Id: I875cb77418a39790e91fe5867c49917bfe406ed4
2019-11-08 15:07:20 +01:00
Daimona Eaytoy b7c7ae168d Explicitly forbid negative indexes in arrays
This emits its own error because:
1- It's clearer to understand
2- It's easier to find where we're dealing with negative offsets, if
we'll ever want to allow that.

Note that trying to use a negative index already results in a hard PHP
error being thrown.

Bug: T237219
Change-Id: Ib11eaaca5e21f740269141c75e62bac48093e8d0
2019-11-08 05:55:56 +00:00
Daimona Eaytoy a7b28369ea Expand the list of types that can be cast to int
Bug: T237624
Change-Id: I2220cb8a8ec998a433a4469d7e0591ec0b4f2b12
2019-11-07 15:14:17 +01:00
Petr Pchelko 915b9a1538 Remove usages of deprecated User methods
Bug: T220191
Change-Id: I54e20870a32ff98b41a98495694ff563c4c4c5ca
2019-10-30 12:51:01 +00:00