Commit graph

102 commits

Author SHA1 Message Date
Sam Reed 5df9b1cc11 Remove some more unused globals
Kill a couple of other unused variables
2010-07-25 17:12:50 +00:00
Chad Horohoe e3978dc584 Get rid of the last (I think) php4-style calls to wfGetDB() 2010-02-13 23:03:40 +00:00
Siebrand Mazeland e26cb735b1 (bug 21387) Make $ in regexes match the end of URLs. Patch contributed by Platonides.
Bug comment: Set PCRE_MULTILINE on spam blacklist regexes. $ in a spam blacklist regex should match the end of the URL (not the end of the whole text) so it can be used to match only the main page. Since the candidate URLs are already joined with a newline separator, this is just a matter of setting PCRE_MULTILINE on the regex.
2010-01-09 18:43:34 +00:00
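As a side note on the commit above, a minimal sketch (not the extension's code; the URLs are made up) of how PCRE_MULTILINE changes what $ anchors to once candidate URLs are joined with newlines:

<?php
// Sketch only: candidate URLs joined with a newline separator, as in the commit above.
$links = "http://example.com/wiki/Main_Page\nhttp://example.com/wiki/Other_Page";

// Without /m, $ only matches at the very end of the whole string, so this fails:
var_dump( preg_match( '!example\.com/wiki/Main_Page$!', $links ) );  // int(0)

// With /m (PCRE_MULTILINE), $ also matches just before each newline,
// so a blacklist entry can be written to match only the main page URL:
var_dump( preg_match( '!example\.com/wiki/Main_Page$!m', $links ) ); // int(1)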
Chad Horohoe 2e1c0ed6d9 Remove getHttp() method and just call Http::get() directly. 2009-05-18 00:48:07 +00:00
Chad Horohoe b5f59b66c1 Kill a few unused $wgArticles. 2009-04-28 00:51:53 +00:00
Nicolas Dumazet ebfcc6bb37 Add an "ignoreEditSummary" configuration option, per http://thread.gmane.org/gmane.science.linguistics.wikipedia.technical/41993 2009-02-24 04:42:58 +00:00
Siebrand Mazeland 0ee9969f91 Replace "\n" in 'spam-invalid-lines' with a hard-coded "<br />". 2009-01-25 20:28:24 +00:00
Siebrand Mazeland fbbdf814d4 (bug 16120) Prevent a fatal error when the Spam Blacklist is triggered via the API. Patch by Brad Jorsch.
An API edit attempt that triggers the Spam Blacklist will now output something instead of crashing:

<?xml version="1.0"?><api><edit spamblacklist="http://blacklistme.example.com"
result="Failure" /></api>
2008-11-02 22:40:02 +00:00
Aaron Schulz 51ebe6aab0 Fix mixed up params 2008-08-16 21:40:30 +00:00
Brion Vibber 2ca85c4af2 * (bug 15099) Bad regexes made at least some of the blacklist get ignored
Lines with "\" at the end would silently break both that line and the following one in the batch, without triggering the overall parse errors.
Added a specific check for this case to skip the bad lines when building the batch, and to detect them and report a warning during editing.
2008-08-11 01:32:38 +00:00
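A rough sketch of the kind of check the commit above describes; the line values and the regex wrapper are illustrative only, not the extension's exact code:

<?php
// Sketch: a line ending in a bare backslash would escape the "|" used to join
// lines into one batched regex, silently corrupting that line and its neighbour.
$lines = [ 'spamdomain\.example', 'broken\.example\\', 'another\.example' ];

$good = [];
foreach ( $lines as $line ) {
    if ( substr( $line, -1 ) === '\\' ) {
        // Skip (and report) the bad line instead of letting it break the batch.
        echo "Skipping line with trailing backslash: $line\n";
        continue;
    }
    $good[] = $line;
}

// Illustrative wrapper; the real batched regex is built by the extension.
$batchRegex = '/https?:\/\/[a-z0-9\-.]*(' . implode( '|', $good ) . ')/Si';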
Brion Vibber d1cea7f463 Partial revert of r37016 (attempt to swap out old compat wfGetHTTP() with Http::get())
function_exists( 'Http::get' ) always returns false, since that's not a function name.
If the current state of the extension only works on modern versions of MW, then just call Http::get() without checking. If it still works on versions that predate Http::get(), it's probably best to keep the existing compat code.
2008-07-04 20:55:28 +00:00
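For context on the revert above: function_exists() only looks up plain functions, never static methods, so a capability check along these lines (a hedged sketch, not the code that was committed) would be needed instead:

<?php
// function_exists() knows nothing about static methods, so this is always false:
var_dump( function_exists( 'Http::get' ) ); // bool(false)

// method_exists() (or is_callable()) is the check that actually detects Http::get().
if ( class_exists( 'Http' ) && method_exists( 'Http', 'get' ) ) {
    // Modern MediaWiki: Http::get() is available.
} elseif ( function_exists( 'wfGetHTTP' ) ) {
    // Older MediaWiki: fall back to the legacy helper.
}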
Chad Horohoe 328c07db5f Use Http::get() rather than wfGetHttp() (which is otherwise unused and could stand removal) 2008-07-04 02:22:09 +00:00
Brion Vibber 0d1234fa65 apply live hacks: debug logging for spam regex & blacklist hits 2008-06-19 23:33:45 +00:00
Chad Horohoe 0909094e01 And now SpamBlacklist checks the edit summary field. 2008-06-19 03:14:34 +00:00
Aaron Schulz f6ecf15e42 Fix E_STRICT errors and a pass-by-ref error. Blank pages were getting thrown around. 2008-05-14 12:44:34 +00:00
Brion Vibber 45a4c9f03b * (bug 1505) Limit spam blacklist checks to newly added URLs, to reduce disruption when legitimate people make legitimate edits to existing pages that already happen to contain some spam.
Steals the load-existing-links function out of ConfirmEdit.
2008-05-13 23:31:33 +00:00
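The core idea of the commit above, in a very small sketch (the link arrays and the helper name are hypothetical):

<?php
// Hypothetical data: links already recorded for the page vs. links in the new text.
$existingLinks = [ 'http://old-spam.example/x', 'http://fine.example/page' ];
$linksInEdit   = [ 'http://old-spam.example/x', 'http://fine.example/page',
                   'http://new-link.example/y' ];

// Only links not already present on the page get run through the blacklist.
$addedLinks = array_diff( $linksInEdit, $existingLinks );

foreach ( $addedLinks as $url ) {
    // checkUrlAgainstBlacklist() would be a stand-in for the real filter call.
    echo "Would check: $url\n";
}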
Victor Vasiliev cc46cfa8bf * (bug 12896) A way to bypass Spam Blacklist 2008-02-03 18:58:27 +00:00
Siebrand Mazeland 360b0070bc (bug 12608) Unifying the spelling of getDBkey() in the extension code. 2008-01-14 10:09:08 +00:00
Tim Starling 1f195dbc54 * Optimised startup
* Use the new EditFilterMerged hook if available, for faster link finding
* Random bits of code were leaking out of the body file into the loader, poked them back in.
2007-11-12 07:44:17 +00:00
Brion Vibber f5b96bbd09 * (bug 11545) Don't let everything through if there's a bogus whitelist entry 2007-10-03 00:48:57 +00:00
Brion Vibber cc1ddc1162 Break spam blacklist log info out to a separate file 2007-10-03 00:19:36 +00:00
Brion Vibber ee8ba1d25f * (bug 11129) Hit spam blacklist on https: links 2007-09-11 18:13:40 +00:00
Brion Vibber 1783cc7627 suppress warnings 2007-08-08 15:42:36 +00:00
Brion Vibber bde084c272 Some polishing and refactoring on this monstrosity; it's been allowed to grow without a good snipping for a while. :)
* Handle bad regexes more gracefully:
 - The batched regexes are tested for validity, and if one is bad, the lines from that source are broken out line-by-line. This is slower, but the other lines in that source will still be applied correctly.
 - Suppress warnings and be more verbose in the debug log.
 - Check for bad regexes when a local blacklist page is edited, and prompt the user to fix the bad lines.
* Caching issues:
 - Cache the full regexes per-DB instead of per-site; this should be friendlier to shared environments where not every wiki has the same configuration.
 - Hopefully improve the recaching of local pages: it looked like the being-edited text was preemptively applied to the cache during the filter callback, even though something else might still stop the page from being saved. Now the cache is simply cleared after the save completes and re-loaded later.
* Split out some of the regex batch functions for clarity.

There are probably still issues with caching of HTTP bits, and in general the local DB loading looks verrrry fragile.
Test this a bit more before syncing. :)
2007-07-20 21:13:26 +00:00
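One way to implement the "test the batch, fall back to line-by-line" behaviour described in the commit above; a sketch with made-up line values and an illustrative regex wrapper, not the extension's exact code:

<?php
// Sketch: if the batched regex is invalid, fall back to one regex per source line
// so the good lines from that source still get applied.
function isValidRegex( string $regex ): bool {
    // preg_match() returns false (with a warning, suppressed here) on a bad pattern.
    return @preg_match( $regex, '' ) !== false;
}

$lines = [ 'spamdomain\.example', '(unbalanced', 'another\.example' ]; // made-up data
$batchRegex = '/https?:\/\/[a-z0-9\-.]*(' . implode( '|', $lines ) . ')/Si';

if ( isValidRegex( $batchRegex ) ) {
    $regexes = [ $batchRegex ];
} else {
    // Slower path: break the source out line by line, skipping the bad lines.
    $regexes = [];
    foreach ( $lines as $line ) {
        $single = '/https?:\/\/[a-z0-9\-.]*(' . $line . ')/Si';
        if ( isValidRegex( $single ) ) {
            $regexes[] = $single;
        } else {
            echo "Skipping invalid blacklist line: $line\n";
        }
    }
}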
Brion Vibber 72ca079b97 Add a local blacklist at MediaWiki:Spam-blacklist which can always be used, just like the local whitelist at MediaWiki:Spam-whitelist.
Should save some trouble for annoyed people. :)
The regular message cache behavior is used for this message, so it'll also update immediately, without waiting for the shared caches to time out.
Additionally, added a fix for configurations that don't hardcode the PHP include_path, by using $IP in the include for HttpFunctions.php.
2007-07-07 17:21:49 +00:00
Aryeh Gregor 740736ecd9 Extensions too! 2007-06-29 01:36:09 +00:00
Brion Vibber 8285ddcd0f * (bug 8375) Reduce spamblacklist's regex size quite a bit; the actual limit seems very hard to predict and may vary based on version, OS, architecture, or phase of the moon. Now breaking at 4096 bytes rather than the previous 20000; this makes 12 regexes for the current Wikimedia set. 2007-01-13 05:24:09 +00:00
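A simplified sketch of the batching idea from the commit above; the 4096-byte figure comes from the commit, while the function name and regex wrapper are illustrative:

<?php
// Sketch: group blacklist lines into alternation batches, starting a new batch
// whenever adding another line would push the regex past a conservative byte limit.
function buildBatches( array $lines, int $limit = 4096 ): array {
    $batches = [];
    $current = [];
    $size = 0;
    foreach ( $lines as $line ) {
        // +1 accounts for the "|" separator; the fixed wrapper adds a little more.
        if ( $current && $size + strlen( $line ) + 1 > $limit ) {
            $batches[] = '/https?:\/\/[a-z0-9\-.]*(' . implode( '|', $current ) . ')/Si';
            $current = [];
            $size = 0;
        }
        $current[] = $line;
        $size += strlen( $line ) + 1;
    }
    if ( $current ) {
        $batches[] = '/https?:\/\/[a-z0-9\-.]*(' . implode( '|', $current ) . ')/Si';
    }
    return $batches;
}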
Antoine Musso 55fbcdc9a6 remove some trailing whitespace 2007-01-06 20:56:46 +00:00
Brion Vibber 9bb2bc11fa Split giant regexes so PCRE stops screaming about them.
Haven't tested cleanup.php
2006-09-18 09:56:57 +00:00
Brion Vibber 18bd5bf9ef Apply pre-save transform for more thorough checks 2006-06-22 21:12:18 +00:00
Brion Vibber 5eb474a2f7 Run text through the parser and get the actual links recorded instead of trying to second-guess behavior 2006-06-22 20:35:49 +00:00
Brion Vibber 9036c0242b Add a local whitelist, editable by admins at [[MediaWiki:Spam-whitelist]] 2006-06-22 19:59:43 +00:00
Antoine Musso c92ee8cc03 allow '-' in database name 2006-05-21 11:04:56 +00:00
Rob Church 641a3f7bee (reopened bug 5185) Match on two or more slashes on the protocol to prevent another blacklist workaround 2006-04-28 23:18:47 +00:00
Rob Church 992a1ac684 (bug 5185) Strip out SGML comments before scanning the text for matches so some nutter can't circumvent the lot with a well-placed <!-- --> 2006-04-12 04:59:27 +00:00
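A minimal illustration of the countermeasure in the commit above (the text and pattern are examples, not the extension's actual stripping code):

<?php
// Strip SGML/HTML comments before scanning, so a URL broken up by (or hidden in)
// a comment can't dodge the blacklist match.
$text = 'Visit http://spam<!-- nothing to see here -->domain.example/ today';

$stripped = preg_replace( '/<!--.*?-->/s', '', $text );

echo $stripped; // "Visit http://spamdomain.example/ today"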
Tim Starling f3219927ae Updated DB: for the 1.5 schema, fixed a few bugs 2006-01-23 01:35:39 +00:00
Tim Starling 233eeb2262 some tweaks 2006-01-21 23:27:39 +00:00
Tim Starling 05a1bf5f1f split the regex fetching part of the filter into its own function 2006-01-19 17:14:10 +00:00
Tim Starling 25eaa74056 fixed blank line at end of file 2006-01-19 07:24:25 +00:00
Brion Vibber 0e53200a91 * (bug 3934) Check _ in hostname prefixes; it's illegal but seems to be accepted by browsers 2005-11-16 09:56:13 +00:00
Tim Starling 76e139b595 bug #2598: only one blacklist file is parsed by SpamBlacklist extension 2005-11-01 07:25:33 +00:00
Tim Starling f9b43ca259 fixed empty regex check 2005-07-13 22:41:14 +00:00
Tim Starling cd68beb218 More configuration settings, fixed URL 2005-07-08 16:29:22 +00:00
Tim Starling 0c5c457080 forgot to commit this 2005-07-02 09:03:53 +00:00
Tim Starling 31e30af1b2 Support for HTTP, including a working default, to load text from meta once per hour. Special attention paid to reducing load on meta, of course. 2005-06-25 15:49:21 +00:00
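The general shape of that once-per-hour loading, as a hedged standalone sketch (the function name, cache file, and use of file_get_contents() are all stand-ins for whatever the extension actually did):

<?php
// Sketch: fetch a remote blacklist at most once per hour, serving a locally
// cached copy in between so the remote host (e.g. meta) isn't hammered.
function fetchBlacklistCached( string $url, string $cacheFile, int $ttl = 3600 ): string {
    if ( file_exists( $cacheFile ) && ( time() - filemtime( $cacheFile ) ) < $ttl ) {
        return (string)file_get_contents( $cacheFile ); // still fresh, skip the fetch
    }
    $text = @file_get_contents( $url ); // simple stand-in for the real HTTP helper
    if ( $text === false ) {
        // On failure, fall back to a stale copy rather than dropping the blacklist.
        return file_exists( $cacheFile ) ? (string)file_get_contents( $cacheFile ) : '';
    }
    file_put_contents( $cacheFile, $text );
    return $text;
}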
Tim Starling 3aaededb3b and title 2005-03-09 14:20:35 +00:00
Tim Starling 588955efbd fixed DB name and table name 2005-03-09 14:17:22 +00:00
River Tarnell e6d30146eb and here 2005-02-20 09:09:29 +00:00
River Tarnell 7d68ef6e01 preg_match capture keys start at 1, not 0 2005-02-20 09:03:13 +00:00
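For reference on the commit above: in the $matches array filled by preg_match(), index 0 is the whole match and the parenthesised capture groups start at index 1 (the example URL is made up):

<?php
// Index 0 holds the full match; captures start at index 1.
preg_match( '!http://([a-z0-9.-]+)/!', 'http://spamdomain.example/page', $m );

echo $m[0] . "\n"; // "http://spamdomain.example/"
echo $m[1] . "\n"; // "spamdomain.example"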
River Tarnell 8d531d0726 discard backslashes prior to slash 2005-02-20 08:02:18 +00:00
Tim Starling 6d61a06c88 typo 2004-12-11 11:12:00 +00:00
Tim Starling 7b9d0425d5 from phase3/extensions 2004-12-11 09:59:06 +00:00