68 commits

Author SHA1 Message Date
Siebrand Mazeland e9874344aa Maintenance for SpamBlacklist extension.
* Replace deprecated methods. MediaWiki 1.19 required.
* Replace <tt> with <code>.
* Update documentation.
* Use WikiPage instead of Article for doEdit().
* Use __DIR__ instead of dirname( __FILE__ ).
* Remove superfluous newlines.

Change-Id: I3a0e42ca404638f7c7934c316735ad11cbc99d42
2012-09-03 16:50:18 +02:00
Platonides 85583cd4f4 (Bug 35023) The spam blacklist doesn't act on protocol-relative links.
Change-Id: Ibe15cdf62d0099f10fb73f56ce0dfee2abac7f35
2012-07-14 19:41:29 +02:00
jarry1250 20058848ab Other half of fix for bug #30332 ("API spamblocklist error should
provide all blocked URLs").

SpamBlacklist extension to provide all matched URLs to
spamPageWithContent() rather than just one. Performance
hit negligible and zero for all edits that don't hit the
SpamBlacklist (99.999%+).

DEPENDENT ON OTHER HALF OF FIX (now in core):
https://gerrit.wikimedia.org/r/3740

Change-Id: Ia951d5795c5cedb6c3876be89f8a08f110004102
2012-03-27 21:42:49 +01:00
Sam Reed 856be3bc29 Bug 35156 - Harmonise spelling of getArticleID() and getArticleId()
Mass change ->getArticleId() to ->getArticleID()
2012-03-11 19:04:37 +00:00
Sam Reed 8468534f8e Manually apply r110682 to trunk 2012-02-03 20:15:02 +00:00
John Du Hart aaf4d74d18 Adding Email blacklisting to the SpamBlacklist extension
This relies on r109111
2012-01-18 23:29:37 +00:00
John Du Hart 62b2bde146 Refactored SpamBlacklist to be extendable for other blacklist types
This is the groundwork for Bug 33761
2012-01-17 06:13:46 +00:00
Tim Starling 220ac94681 Match protocol-relative URLs. Patch by Anaconda. 2012-01-02 23:58:18 +00:00
Roan Kattouw 769640ee5c Fix misspelled constant in r95663 2011-08-31 19:07:44 +00:00
Roan Kattouw b98c200706 Last commit to make WMF-deployed extensions HTTPS-ready (hopefully): use wfExpandUrl() in a bunch of places
* SpamBlacklist: code is weird but I'm pretty sure this needs HTTP
* ContributionTracking: expand return URL to current protocol. Use HTTP in the test suite (PROTO_CURRENT makes no sense in tests since they run from the command line)
* GlobalUsage: remove URL expansion, not needed after r95651
* CentralNotice: expand URL because it gets fed to window.location indirectly via JS
* OpenSearchXml: use canonical URLs in XML output
* MobileFrontend: expand a URL that's used in a Location: header
2011-08-29 14:37:47 +00:00
Alexandre Emsenhuber b16bb18e5a Dropped pre-1.12 compatibility code 2011-05-27 19:26:00 +00:00
Sam Reed d6131ea82d Kill/update callers for some deprecated code 2011-05-06 23:52:52 +00:00
Mark A. Hershberger 9cc1d19d23 PLEASE TEST: Bug #26332 — Patch that I think should fix the problem
according to the comments, but needs more testing

* Also, a one line w/s fix up
2011-05-03 20:23:35 +00:00
Sam Reed b3de09a381 More undefined variables 2011-01-23 10:34:56 +00:00
Sam Reed 7e97019b2e Conditionals in loops to foreachs 2010-10-29 21:30:20 +00:00
Chad Horohoe e86cfdacb4 More php4-style constructors. I think that's most of them 2010-08-30 17:11:45 +00:00
Sam Reed 5df9b1cc11 Remove some more unused globals
Kill a couple of other unused variables
2010-07-25 17:12:50 +00:00
Chad Horohoe e3978dc584 Get rid of the last (I think) php4-style calls to wfGetDB() 2010-02-13 23:03:40 +00:00
Siebrand Mazeland e26cb735b1 (bug 21387) Make $ regex work for the URLs. Patch contributed by Platonides.
Bug comment: Set PCRE_MULTILINE on spamblacklist regexes. $ on spam blacklist regex should match the end of the url (not of the text) so it can be used to match only the mainpage. Since the candidate urls are already joined with a new-line separator, it's just setting PCRE_MULTILINE on the regex.
2010-01-09 18:43:34 +00:00
Chad Horohoe 2e1c0ed6d9 Remove getHttp() method and just call Http::get() directly. 2009-05-18 00:48:07 +00:00
Chad Horohoe b5f59b66c1 Kill a few unused $wgArticles. 2009-04-28 00:51:53 +00:00
Nicolas Dumazet ebfcc6bb37 Adding "ignoreEditSummary" configuration possibility, per http://thread.gmane.org/gmane.science.linguistics.wikipedia.technical/41993 2009-02-24 04:42:58 +00:00
Siebrand Mazeland 0ee9969f91 Replace "\n" in 'spam-invalid-lines' by hard coded "<br />". 2009-01-25 20:28:24 +00:00
Siebrand Mazeland fbbdf814d4 (bug 16120) Prevent death on Spam Blacklist trigger using API. Patch by Brad Jorsch.
An API edit attempt with Spam Blacklist firing will now output something instead of crashing:

<?xml version="1.0"?><api><edit spamblacklist="http://blacklistme.example.com"
result="Failure" /></api>
2008-11-02 22:40:02 +00:00
Aaron Schulz 51ebe6aab0 Fix mixed up params 2008-08-16 21:40:30 +00:00
Brion Vibber 2ca85c4af2 * (bug 15099) Bad regexes make at least some of the blacklist get ignored
Lines with "\" at the end would silently break both that and the following line in the batch, without triggering the overall parse errors.
Added a specific check for this case to skip the bad lines when building, and to check for them and report a warning during editing.
2008-08-11 01:32:38 +00:00
Brion Vibber d1cea7f463 Partial revert of r37016 (attempt to swap out old compat wfGetHTTP() with Http::get())
function_exists( 'Http::get' ) always returns false, since that's not a function name.
If the current state of the extension only works on modern versions of MW, then just call Http::get() without checking. If it still works on versions that predate it, it's best to keep the existing compat code probably.
2008-07-04 20:55:28 +00:00
Chad Horohoe 328c07db5f Use Http::get() rather than wfGetHttp() (which is otherwise unused and could stand removal) 2008-07-04 02:22:09 +00:00
Brion Vibber 0d1234fa65 apply live hacks: debug logging for spam regex & blacklist hits 2008-06-19 23:33:45 +00:00
Chad Horohoe 0909094e01 And now SpamBlacklist checks the edit summary field. 2008-06-19 03:14:34 +00:00
Aaron Schulz f6ecf15e42 Fix E_STRICT errors and pass-by-ref error. Blank pages getting thrown around. 2008-05-14 12:44:34 +00:00
Brion Vibber 45a4c9f03b * (bug 1505) Limit spam blacklist checks to new URLs to reduce disruption of existing pages being legitimately edited by legitimate people which happen to already have some spam on them.
Steals the load-existing-links function out of ConfirmEdit.
2008-05-13 23:31:33 +00:00
Victor Vasiliev cc46cfa8bf * (bug 12896) A way to bypass Spam Blacklist 2008-02-03 18:58:27 +00:00
Siebrand Mazeland 360b0070bc (bug 12608) Unifying the spelling of getDBkey() in the extension code. 2008-01-14 10:09:08 +00:00
Tim Starling 1f195dbc54 * Optimised startup
* Use the new EditFilterMerged hook if available, for faster link finding
* Random bits of code were leaking out of the body file into the loader, poked them back in.
2007-11-12 07:44:17 +00:00
Brion Vibber f5b96bbd09 * (bug 11545) Don't let everything through if there's a bogus whitelist entry 2007-10-03 00:48:57 +00:00
Brion Vibber cc1ddc1162 Break spam blacklist log info out to a sep file 2007-10-03 00:19:36 +00:00
Brion Vibber ee8ba1d25f * (bug 11129) Hit spam blacklist on https: links 2007-09-11 18:13:40 +00:00
Brion Vibber 1783cc7627 suppress warnings 2007-08-08 15:42:36 +00:00
Brion Vibber bde084c272 Some polishing and refactoring on this monstrosity, it's been allowed to grow without some good snipping in a while. :)
* Handle bad regexes more gracefully:
 - The batched regexes are tested for validity, and if one is bad, the lines from that source are broken out line-by-line. This is slower, but the other lines in that source will still be applied correctly.
 - Suppress warnings and be more verbose in the debug log.
 - Check for bad regexes when a local blacklist page is edited, and prompt the user to fix the bad lines.
* Caching issues:
 - Cache the full regexes per-DB instead of per-site; this should be friendlier to shared environments where not every wiki has the same configuration.
 - Hopefully improve the recaching of local pages, which looked like it would preemptively apply the being-edited text to the cache during the filter callback, even though something else might stop the page from being saved. Now just clearing the cache after save is complete, letting it re-load later.
* Split out some of the regex batch functions for clarity.

There are probably still issues with caching of HTTP bits, and in general the local DB loading looks verrrry fragile.
Test this a bit more before syncing. :)
2007-07-20 21:13:26 +00:00
Brion Vibber 72ca079b97 Add a local blacklist at MediaWiki:Spam-blacklist which can always be used, just as the local whitelist at MediaWiki:Spam-whitelist.
Should save some trouble for annoyed people. :)
The regular message cache behavior is used for this message, so it'll also update immediately, without waiting for the shared caches to time out.
Additionally, added a fix for configurations which don't hardcode the PHP include_path by using $IP in an include for HttpFunctions.php.
2007-07-07 17:21:49 +00:00
Aryeh Gregor 740736ecd9 Extensions too! 2007-06-29 01:36:09 +00:00
Brion Vibber 8285ddcd0f * (bug 8375) Reduce spamblacklist's regex size quite a bit; the actual limit seems very hard to predict and may vary based on version, os, architecture, or phase of the moon. Now breaking at 4096 bytes rather than the previous 20000; this makes 12 regexes for the current Wikimedia set. 2007-01-13 05:24:09 +00:00
Antoine Musso 55fbcdc9a6 remove some ending whitespaces 2007-01-06 20:56:46 +00:00
Brion Vibber 9bb2bc11fa Split giant regexes so PCRE stops screaming about them.
Haven't tested cleanup.php
2006-09-18 09:56:57 +00:00
Brion Vibber 18bd5bf9ef Apply pre-save transform for more thorough checks 2006-06-22 21:12:18 +00:00
Brion Vibber 5eb474a2f7 Run text through the parser and get the actual links recorded instead of trying to second-guess behavior 2006-06-22 20:35:49 +00:00
Brion Vibber 9036c0242b Add a local whitelist, editable by admins at [[MediaWiki:Spam-whitelist]] 2006-06-22 19:59:43 +00:00
Antoine Musso c92ee8cc03 allow '-' in database name 2006-05-21 11:04:56 +00:00
Rob Church 641a3f7bee (reopened bug 5185) Match on two or more slashes on the protocol to prevent another blacklist workaround 2006-04-28 23:18:47 +00:00