Andrew Garrett
95f53efdfe
Follow-up to r56296, replace htmlspecialchars_decode with html_entity_decode.
2009-09-15 10:25:15 +00:00
Brion Vibber
9bbd4f8bc9
Merge remaining unmerged line of live hacks from r53208 on AbuseFilter
2009-09-14 21:17:09 +00:00
Andrew Garrett
55c83ea218
Add HTML entity decoding to AbuseFilter ccnorm() function
2009-09-14 11:33:44 +00:00
Andrew Garrett
47d513310d
Use multibyte-safe string operations in AbuseFilter (bug 19333).
2009-07-31 11:26:30 +00:00
Andrew Garrett
2eafa9bd66
Bug 19604, backwards-compatibility issues with AbuseFilter count() function.
2009-07-17 16:55:31 +00:00
Andrew Garrett
5cf4cf2d5f
Fix Abuse Filter fatals. Whenever a regex error was encountered, the error handler was not reset, so it then fired for any PHP notice, E_STRICT message, etc., causing fatals on Wikimedia.
2009-06-18 20:13:52 +00:00
Andrew Garrett
db3c0bbe05
Fix regex error handling by returning immediately if error reporting is disabled.
2009-06-17 11:38:31 +00:00
Andrew Garrett
6678b42d8e
Remove special-case list handling for contains_any, len, like/in -- breaks backwards-compatibility with old filters.
2009-06-16 14:28:00 +00:00
Andrew Garrett
48bfcc35ee
Various code quality fixes for AbuseFilter suggested by Tim Starling in a private email, including bugfixes, memory safeguards, performance improvements, removal of redundant code, consolidation of similar functionality.
2009-05-26 13:08:15 +00:00
Tim Starling
da372fdec0
Reverted r49855, r49656, r49401, r49399, r49397. The language converter cannot be used outside the parser at present without generating a large number of bugs, due to global lifetime state variables, inappropriate $wgParser references, etc. Some refactoring needs to be done before it can be used in this way.
2009-05-26 07:46:29 +00:00
Tim Starling
268d72f43b
Code formatting and comments.
2009-05-22 06:42:10 +00:00
Andrew Garrett
7e70a0d197
Merge in r49312 from preferences-work -- non-preference-related performance improvement to the AbuseFilter parser
2009-04-23 03:37:51 +00:00
Philip Tzou
28202160b8
Add a new function, 'convert()', which lets users convert a string to a specified variant in Abuse Filter. Relies on the LanguageConverter support updated in r49397.
2009-04-11 10:59:38 +00:00
Victor Vasiliev
128ae5983b
Introduce list (non-associative array) support into abuse filter parser.
2009-04-05 17:11:17 +00:00
Victor Vasiliev
258d340fb5
Abuse filter:
...
* Introduce := operator for setting variables
* Throw an exception when a user tries to override a built-in variable
* Fix UTF-8 handling in fnmatch() fallback
* Copy three main abuse filters from enwiki to test suite
* Fix update.php integration
2009-04-05 11:47:42 +00:00
Andrew Garrett
7c2a7a2fe0
Support for variable setting with the set_var function, and multiple expressions separated by semicolons (;). In evaluation, the result of the LAST expression will be the return value.
2009-04-01 06:53:18 +00:00
Andrew Garrett
ba0b30a054
Add syntax error messages for invalid regexes
2009-04-01 05:56:24 +00:00
Andrew Garrett
3f62707206
String manipulation functions substr, str_replace and strpos for AbuseFilter
2009-04-01 05:05:23 +00:00
Andrew Garrett
c597c1915f
Add contains_any function, for searching a single haystack for multiple needles. Implemented with FSS with a fallback to a for loop, so it should be really fast.
2009-03-26 02:03:32 +00:00
Andrew Garrett
d4d2f4913d
Patch by Robert Rohde to prevent empty-string matches of a regex intended to match numbers
2009-03-26 01:30:05 +00:00
Andrew Garrett
20f8b1d16b
Properly fix regex munging
2009-03-25 12:43:53 +00:00
Andrew Garrett
1bb05bb402
Fix regex munging so that it no longer breaks on regexes with already-escaped /s
2009-03-25 12:15:28 +00:00
Andrew Garrett
5e70316a3a
Faster brace short-circuit in Abuse Filter Parser. Patch by Robert Rohde.
2009-03-25 11:48:33 +00:00
Andrew Garrett
86e4081206
Abuse Filter Parser:
...
* Efficiency -- use /A instead of PREG_OFFSET_CAPTURE and comparing offsets.
* Expand error messages to enhance debugging.
* General code quality
2009-03-25 11:36:38 +00:00
Andrew Garrett
fa2ef6a6ca
Revert half-done patch from r48802
2009-03-25 10:57:46 +00:00
Andrew Garrett
91d501a4e0
Remove OBSOLETE file for PasswordReset
2009-03-25 10:55:43 +00:00
Andrew Garrett
cf6f2899f6
Follow-up to r48674.
2009-03-22 10:34:54 +00:00
Andrew Garrett
de32554f33
Fix remote execution vulnerability (exploitable only by admins)
2009-03-22 10:31:26 +00:00
Andrew Garrett
2495c5fcf7
Optimise rmdoubles by replacing its entire code with a single regex. Benchmarking shows it's up to 20 times faster.
2009-03-22 02:39:34 +00:00
Andrew Garrett
12f62fdea4
Fix another annoying bug
2009-03-19 00:18:03 +00:00
Andrew Garrett
33a83c67a2
Some fixes for r48545
2009-03-19 00:07:29 +00:00
Andrew Garrett
e2ad3830a0
New short-circuiting of expensive operations when a boolean op means that the result won't matter
2009-03-18 23:28:35 +00:00
Andrew Garrett
1f4f45f8f2
Again revert accidentally-committed half-done code
2009-03-16 08:24:20 +00:00
Andrew Garrett
334582b645
Fix weird bug occurring in corrupted databases.
2009-03-16 08:21:24 +00:00
Andrew Garrett
a8a4d7fc5a
Revert half-done code introduced in r48372
2009-03-13 08:11:43 +00:00
Andrew Garrett
0e070fac7f
Fix problems with prevention of double warnings
2009-03-13 08:02:05 +00:00
Andrew Garrett
864a73e907
New ip_in_range function
2009-03-09 12:39:52 +00:00
Andrew Garrett
5983a65415
Change escaping handling -- make \d => \d instead of d. It helps with writing regexes.
2009-03-07 01:31:35 +00:00
Andrew Garrett
55b417f517
Add rcount function, same as count except it takes a regex as the needle
2009-03-07 01:26:42 +00:00
Andrew Garrett
e60dee6cac
Add an interface for extensions to add variables into the variable list (only for ones generated for filtering, for now). Includes an implementation in the TorBlock extension
2009-03-05 02:43:05 +00:00
Andrew Garrett
92698e95ba
Improve AbuseFilter performance by implementing lazy initialisation of computed variables.
...
This has been done by replacing simple associative arrays with an AbuseFilterVariableHolder, which recognises helper classes called AFComputedVariables.
Computation may occur during the abuse filter analysis, or later when testing and reviewing filters.
2009-02-26 12:15:14 +00:00
Andrew Garrett
05ea5b783d
Add rmwhitespace function
2009-02-18 19:42:01 +00:00
Andrew Garrett
32d676942d
Remove remnants of ctype_, and replace them with appropriate regexes (which, while slower, are locale-safe).
2009-02-11 20:01:00 +00:00
Andrew Garrett
35e61feeb6
Abuse Filter Parser updates
...
* Deprecate parseTokens in favour of a parse-as-you-go approach, faster and uses less memory.
* Display variables in lower_case so they aren't SHOUTING_AT_PEOPLE.
* Tell people if they try to use variables that don't exist, rather than silently returning NULL.
2009-02-11 20:00:33 +00:00
Andrew Garrett
0880f444b1
Abuse Filter Parser updates:
...
* Use strcspn to scan ahead for long regions of uninteresting text in string handling (performance).
* Remove cruft specific to my system in phpTest.php.
* Remove a test that used incorrect syntax and was useless without adding variable support.
2009-02-11 18:23:21 +00:00
Andrew Garrett
bfe57be65d
Rewrite of Abuse Filter parser tokeniser.
...
I've made it more performant and fixed a few bugs by using regexes
instead of PHP loops, where possible, under the assumption that the
PCRE parser is more efficient than the same thing implemented in pure PHP.
Also, I'm now passing the same string around and calculating offsets, which
Tim tells me is far more performant than continually truncating the same string.
All tests still pass, with the exception of string.t, which I've modified
to remove the offending code, which never worked.
2009-02-11 01:41:51 +00:00
Andrew Garrett
430c95a60d
Make variable names and keywords case-insensitive.
2009-01-30 23:46:25 +00:00
Andrew Garrett
48748d8fa7
Fix use of instance methods in nextToken, which is a static method.
2009-01-27 04:09:53 +00:00
Andrew Garrett
11ab345814
Localise Abuse Filter exceptions.
2009-01-26 23:32:46 +00:00
Andrew Garrett
d50a26f04d
Explicit detection for division by zero.
2009-01-25 05:54:49 +00:00