wikimedia/mediawiki-extensions-AbuseFilter

mirror of https://gerrit.wikimedia.org/r/mediawiki/extensions/AbuseFilter.git synced 2024-11-24 22:15:26 +00:00

Author	SHA1	Message	Date
jenkins-bot	25d63aa639	Merge "Add new number syntax as experimental"	2019-08-31 17:12:10 +00:00
Daimona Eaytoy	d51ca862c6	Move parser tests to /unit IMHO these can be considered unit tests; they were already fast, but now they're executed in an instant. This requires several changes: 1 - delay retrieving messages in AFPUserVisibleException, to avoid having to deal with i18n whenever we want to test exceptions; 2 - Use some DI for Parser and Tokenizer. Equivset-dependent tests are also moved to a new class, thus helping to fix the AF part of T189560. Change-Id: If4585bf9bb696857005cf40a0d6985c36ac7e7a8	2019-08-28 16:36:37 +00:00
Daimona Eaytoy	71730f7d44	Warn if a function has been given too many parameters While this is not as important as throwing for too few parameters, IMHO it's still important to fail in this case. Mostly because if a function receives too many parameters, chances are that whoever wrote the filter didn't do that intentionally, and thus there may be a hidden bug. Bonus: fix a few docblocks. Bug: T230803 Change-Id: Iac2931f17b50ace8c8f4c2faa44b3f54ca134c54	2019-08-26 20:29:49 +02:00
Daimona Eaytoy	4d86758a49	Add new number syntax as experimental For now it will only report successful parse. Next step is formally deprecating the old one (escalated to warning), then removing it in favour of the new one (in another MW version). Bug: T212730 Change-Id: I5dd11fd67d8e57d1d0c52ddfa026920ebfc5ee13	2019-08-26 08:15:55 +00:00
jenkins-bot	ff2f6ee26f	Merge "Add a new class for the CachingParser's AST"	2019-08-25 18:00:24 +00:00
Daimona Eaytoy	d515af0ae6	Add a new class for the CachingParser's AST This allows a little more abstraction: we can store other data in the tree, without having to store it in a specific node (e.g. the variables map, which is still unused). It also adds a few typehints, and specializes the return value of eval'ing the AST: previously, it was the one of evalNode, which wasn't guaranteed to be an AFPData. Now we have this guarantee. Last but not least, we can now measure runtime metrics for evalTree, which doesn't recurse. Bonus: fix a check in the old parser, which used the wrong variable when reporting outofbounds errors. Change-Id: Iff806793b1d968e9bb6220f1459f3d0ac587c7da	2019-08-25 17:29:16 +00:00
jenkins-bot	6196801178	Merge "Log more empty operands"	2019-08-24 20:53:01 +00:00
Daimona Eaytoy	2d031d0bee	Log more empty operands And fix a couple of minor bugs. Bug: T156096 Depends-On: I3b85087677607573f4fa68681735dc35348dcd87 Change-Id: Ia4c713a1d45827f6a8bc5566a8d8835c49f8108a	2019-08-24 19:59:53 +00:00
jenkins-bot	47838715fa	Merge "Allow if without else"	2019-08-20 20:12:19 +00:00
jenkins-bot	5e605aaa62	Merge "Even better handling of DUNDEFINED"	2019-08-20 20:00:52 +00:00
jenkins-bot	bf8ccccade	Merge "Fix a bug in the return value of the CachingParser"	2019-08-20 19:58:38 +00:00
Daimona Eaytoy	af7744781f	Allow if without else Bug: T230727 Depends-On: I8e7f7710b8cb37ada8531b631456a3ce7b27ee45 Change-Id: I3b85087677607573f4fa68681735dc35348dcd87	2019-08-20 19:36:14 +00:00
Daimona Eaytoy	963221ad6d	Even better handling of DUNDEFINED Ensure that the variable isn't set before marking it as DUNDEFINED: that's only for when we cannot use a default, but if the variable is set we already have one. Most notably, this fixes conditionals handling: right now, if you have a conditional with an assignment in both branches, the variable will be undefined. That's obviously wrong, so it's fixed in this patch. Plus: catch only AFPExceptions in a test to avoid unintentionally catching the assert exception; simplify some assignments using wfSetVar. Depends-On: I446a307e5395ea8cc8ec5ca5d5390b074bea2f24 Change-Id: I8e7f7710b8cb37ada8531b631456a3ce7b27ee45	2019-08-20 19:17:30 +00:00
Daimona Eaytoy	fa76405ea7	Fix a bug in the return value of the CachingParser This has always been wrong, and remained unnoticed. Also added a typehint for added safety. Change-Id: I8a3c31e7385283d95b4712d457784016239a0b3b	2019-08-20 20:54:19 +02:00
Daimona Eaytoy	aa867bd370	Better handling of function params in CachingParser This patch includes various fixes to how func arguments are handled in CachingParser: - Add a comment about a future improvement of checkSyntax, which we could limit to try building the AST. - Having enough args for each function is now also checked when building the AST. This allows implementing the previous point without stopping to report notenoughargs at syntaxcheck-time (otherwise it'd be a runtime error). And it also ensures that we check for the params count inside skipped branches, e.g. inside if/else: these were already only discovered at runtime in CachingParser. The old parser is not affected by this change, because when checking syntax it will always execute all branches, and at runtime it will skip braces altogether. - Fix arg count for CachingParser, which previously added a bogus param in case of a function called without parameters. This was fixed for the other parser in I484fe2994292970276150d2e417801453339e540, and I just ported the updated fix. Also note that the CachingParser was already failing for e.g. `count()`, but instead of complaining about missing arguments, it failed hard when trying to pass NULL to evalNode. - Fixed some tests not to use setExpectedException, which caused the previous point to remain unnoticed: calling that method prevents the loop from continuing, and thus only the AbuseFilterParser part was being executed. The new implementation checks the exception ID and is thus more future-proof if the i18n message changes. - Fixed some function names in error reporting for the old parser. - The arg count is now checked outside of the function handlers, thus it's no longer necessary to call checkEnoughArguments at the beginning of each handler. This also produces clearer error messages in case of aliases (e.g. set/set_var). - Check the args count even if some of the args are DUNDEFINED. This is much easier now that the check is outside of the handler. This will make syntax check fail for e.g. `contains_any(added_lines)`. Bug: T156095 Change-Id: I446a307e5395ea8cc8ec5ca5d5390b074bea2f24	2019-08-20 15:32:02 +00:00
jenkins-bot	7addec7b4a	Merge "Make some other AFPData methods non-static"	2019-08-20 14:16:16 +00:00
jenkins-bot	1f45336157	Merge "Move keywords handlers to the Parser"	2019-08-20 14:16:10 +00:00
jenkins-bot	f18d0814e2	Merge "Make several AFPData functions non-static"	2019-08-20 14:06:02 +00:00
jenkins-bot	f1ab591d27	Merge "Avoid implicit casts from DUNDEFINED to something else"	2019-08-20 13:04:48 +00:00
jenkins-bot	ea01809f5e	Merge "Add the filter ID to empty operand logging"	2019-08-20 13:01:14 +00:00
jenkins-bot	d32b03ca10	Merge "Increase cache hits for CachingParser"	2019-08-20 12:50:31 +00:00
jenkins-bot	d0b30c2534	Merge "Make parser aware of the filter it is parsing"	2019-08-20 12:50:26 +00:00
Daimona Eaytoy	d715f6d2c0	Increase cache hits for CachingParser If $parser->parse returns a falsey value (=null), that's because the filter doesn't have any statement. But that's not a valid reason not to cache the filter. Hence, return whatever parse() is returning inside the callback, so that the result is always cached. Change-Id: Ib6b0e72d882dc484456a3be6bbc74da36ef48bf7	2019-08-13 18:03:13 +02:00
Daimona Eaytoy	d58b5930f8	Add the filter ID to empty operand logging To make debugging a lot easier. Bug: T156096 Bug: T153251 Change-Id: I1f905c6e1a524a745240b05709ef9d1dfc3c23a1	2019-08-13 15:22:55 +00:00
Daimona Eaytoy	1197eb6b41	Make parser aware of the filter it is parsing This information will mostly be used for debugging purposes. Change-Id: Ia1bcc2acc22aba97d855382b5b173ac3d5f2c54b	2019-08-13 15:22:38 +00:00
Daimona Eaytoy	430ba818d0	Add test for multiple conditions inside conditionals The regression itself was fixed in I980aec3481a52ecc35f1811a366014a5581a7cdb, so this patch only adds a test for it. Also remove a comment about CachingParser failures: we don't want to encourage people to remove it from tests anymore. Bug: T152281 Change-Id: I3ad49050ea49bf45d3226878e091da3c8dbefdb1	2019-08-12 18:18:05 +02:00
Daimona Eaytoy	4b0911ee01	Make some other AFPData methods non-static Change-Id: I22ea337a36f911c57d3dadb9a3c45fc2c8b7c628	2019-08-12 14:40:51 +02:00
Daimona Eaytoy	3f171dc0a5	Move keywords handlers to the Parser Just like we do for functions, it doesn't really make sense to have keywords separately, in AFPData. Change-Id: I208a9b1ce2bd12038e9fbcc515c48d604ec80eb8	2019-08-12 14:29:56 +02:00
Daimona Eaytoy	2fdf091eb9	Make several AFPData functions non-static The keywords-related ones will be handled in a subsequent patch. Change-Id: Ifcfad438023ef136dc6f2cd5529e867df9b23789	2019-08-12 14:12:16 +02:00
Daimona Eaytoy	1fe3647268	Avoid implicit casts from DUNDEFINED to something else This patch keeps the current behaviour for everything (since DUNDEFINED was always cast to boolean false), but handles the cast at a higher level instead of relying on what AFPData::castTypes will do. This way it's easier to spot places where we may get DUNDEFINED, and decide how to handle them one by one. Change-Id: I1070e15ea03c7dd4a4231b87afbc42240a558581	2019-08-12 11:18:15 +02:00
Daimona Eaytoy	69ad23da98	Ban variable variables As explained on phab, it's not worth the effort of keeping this feature. Bug: T229947 Change-Id: Ic6067cab8e1ede98545e704888c99e2ed9a004e4	2019-08-11 01:47:35 +00:00
Daimona Eaytoy	2ed6272bb2	Partly handle set and set_var in shortcircuit This is more complicated than the := operator, because the var name could be a complicated expression, and we have to handle a function call. This patch only covers the case where the variable name is a literal, which is enough for WMF production. Bug: T214674 Change-Id: I6c0f8e95663919a0235b5ccf0c88ad0a539315a7	2019-08-06 16:14:34 +02:00
Daimona Eaytoy	afeff7c222	Really avoid DEMPTY leak Follow-up I7831f3ed9f7c0656e0e8f77ded049c20eca682ba, really avoid the leak. My addition was pointless because we need DUNDEFINED, not DEMPTY, and I spent way too much time trying to understand what was still wrong. Still have to get used to these new names... Change-Id: I332967f6fb00b67fd355547b19638c95ffa5bba7	2019-08-04 22:02:13 +00:00
Daimona Eaytoy	f977e858ab	Avoid DEMPTY leak As shown in the coverage reports [0], some empty operand logging lines are covered, but no test should have empty operands. One of the causes is skipOverBraces keeping $result as is, even if DEMPTY, so turn it into a DUNDEFINED. [0] - https://doc.wikimedia.org/cover-extensions/AbuseFilter/includes/parser/AbuseFilterParser.php.html Change-Id: I7831f3ed9f7c0656e0e8f77ded049c20eca682ba	2019-08-04 18:25:04 +00:00
jenkins-bot	e733872e13	Merge "Allow accessing offsets of built-in variables"	2019-08-04 17:48:46 +00:00
jenkins-bot	790cd38fb0	Merge "Further deprecation for empty conditions"	2019-08-04 17:26:40 +00:00
Daimona Eaytoy	517919fca8	Allow accessing offsets of built-in variables I5ec4ab44c4e88aaf18c0d7b73355d27050beeda7 almost fixed this bug, but we also have to make it possible to access builtin variables as arrays. This will only make sense for a few variables (e.g. added_lines and removed_lines), but I don't think we should validate it when checking syntax. Bug: T198531 Change-Id: I417e1b8d4802bbfccd091ce5c7617659cfd1e4ea	2019-08-04 17:14:44 +00:00
jenkins-bot	d940ef63cd	Merge "Specialize empty AFPData types"	2019-08-04 15:52:34 +00:00
Daimona Eaytoy	5f4491f9aa	Further deprecation for empty conditions Start deprecating "empty" logic operators, and now that we have DEMPTY, simplify handling of empty function arguments introduced in Ica3e49f5b00595a95513d9683732e490aa7aae17. Bug: T156096 Change-Id: Ied6b385e8690b6cc6e69afcf614389f737ab95bd	2019-08-04 15:33:49 +00:00
Daimona Eaytoy	9049be3609	Specialize empty AFPData types As described in T156096#5389655. Change-Id: Ifbf95a6b72a280cd77db6affbd8d642499bbfedc	2019-08-04 15:26:57 +00:00
Daimona Eaytoy	33bfe97d8c	Move non-decimal numbers deprecation logging Bug: T212730 Change-Id: Idb833c60541873bfe9c2b225009bd32e4a48cd60	2019-08-03 16:57:24 +00:00
Daimona Eaytoy	a85e1ccc59	Make AbuseFilterParser::$funcCache non-static Change-Id: I312efe3ce4d1f06e697aa4564aeec1bacbaf97d3	2019-08-03 09:19:49 +00:00
Daimona Eaytoy	09d0254172	Better handling of DNONE This patch includes: * Making it possible to access offsets of a DNONE (returning a DNONE) * Initializing user-defined variables as DNONE inside short-circuited branches * Make DNONE propagate with other operators * Make DNONE count as false for logic operators * Remove a now-outdated bit in doLevelAtom. In case of shortcircuit, $result is now DNONE instead of DNULL, and thus it's possible to access offsets of it. Performance++! * Don't allow modifying or adding an element of a DNONE as if it were an array (to avoid inconsistencies) This re-applies Id85c673337fa90a3782fd22eb9690cd996967111 with several fixes. NOTE: Haven't tested locally, although I'm pretty confident thanks to the amount of tests added. Bug: T214674 Bug: T228677 Change-Id: I5ec4ab44c4e88aaf18c0d7b73355d27050beeda7	2019-08-02 21:05:08 +00:00
jenkins-bot	e3e157361d	Merge "Revert "Initialize user-defined variables during shortcircuit""	2019-07-29 23:30:50 +00:00
Daimona Eaytoy	13cdb86dd2	Revert "Initialize user-defined variables during shortcircuit" Reason for revert: T214674#5374806 This reverts commit `56e6117afd`. Bug: T214674 Change-Id: Iccce248d2693cd9877a740b74e72a577e730435e	2019-07-29 23:06:23 +00:00
Daimona Eaytoy	4720c97530	Add a new class for methods related to running filters Currently we strongly abuse (pardon the pun) the AbuseFilter class: its purpose should be to hold static functions intended as generic utility functions (e.g. to format messages, determine whether a filter is global etc.), but we actually use it for all methods related to running filters. This patch creates a new class, AbuseFilterRunner, containing all such methods, which have been made non-static. This leads to several improvements (also for related methods and the parser), and opens the way to further improve the code. Aside from making the code prettier, less global and easier to test, this patch could also produce a performance improvement, although I don't have tools to measure that. Also note that many public methods have been removed, and almost all of the remaining ones have been made protected; a couple of them (the ones used from outside) are left for back-compat, and will be removed in the future. Change-Id: I2eab2e50356eeb5224446ee2d0df9c787ae95b80	2019-07-23 19:06:27 +00:00
Daimona Eaytoy	56e6117afd	Initialize user-defined variables during shortcircuit Bug: T214674 Depends-On: I5a14d4b2bc3ffd9caaaa095f16f36b9b6009db05 Change-Id: Id85c673337fa90a3782fd22eb9690cd996967111	2019-07-23 12:20:53 +00:00
Daimona Eaytoy	18d7d2ed62	Start using AFPData::DNONE This should allow more flexibility when checking syntax, and a saner behaviour overall. Aside from not throwing exception in certain cases, the results should be almost equal to the ones you would get without this patch. However, there are still a few things to improve (which for convenience I wrote inside the parser test) and many to test. Bug: T204654 Depends-On: I69bfec45c76509fb1112641393f78e8d8834adcd Change-Id: I5a14d4b2bc3ffd9caaaa095f16f36b9b6009db05	2019-07-14 08:48:47 +00:00
Daimona Eaytoy	7bc566e635	Fix the regex for numbers, start deprecation of non-decimal numbers Aside from the 14 thingy reported in the task, this syntax is awful! The fix to the regex should only be intended as a temporary stopgap. A proper fix would be to introduce a new syntax, like for instance the one used in PHP. Bug: T212726 Change-Id: Idc37a17ce539e6c63d67fc07d47d812569debe0e	2019-07-10 13:26:36 +00:00
jenkins-bot	c3dcd95733	Merge "Start making APFData members private"	2019-07-09 09:23:17 +00:00

164 commits