Mirror of https://gerrit.wikimedia.org/r/mediawiki/extensions/SpamBlacklist
Synced 2024-11-27 16:40:04 +00:00

Merge "Partially update README"
commit 1cb903fc09

README (52 lines changed)
@@ -3,33 +3,39 @@ MediaWiki extension: SpamBlacklist
 
 SpamBlacklist is a simple edit filter extension. When someone tries to save the
 page, it checks the text against a potentially very large list of "bad"
-hostnames. If there is a match, it displays an error message to the user and
+hostnames. If there is a match, it displays an error message to the user and
 refuses to save the page.
 
 To enable it, first download a copy of the SpamBlacklist directory and put it
-into your extensions directory. Then put the following at the end of your
+into your extensions directory. Then put the following at the end of your
 LocalSettings.php:
 
-require_once( "$IP/extensions/SpamBlacklist/SpamBlacklist.php" );
+wfLoadExtension ( 'SpamBlacklist' );
+
+To users running MediaWiki 1.24 or earlier:
+
+The instructions above describe the new way of installing this extension using wfLoadExtension().
+If you need to install this extension on these earlier versions (MediaWiki 1.24 and earlier),
+instead of `wfLoadExtension( 'SpamBlacklist' );`, you need to use:
+
+require_once "$IP/extensions/SpamBlacklist/SpamBlacklist.php";
 
 The list of bad URLs can be drawn from multiple sources. These sources are
 configured with the $wgSpamBlacklistFiles global variable. This global variable
 can be set in LocalSettings.php, AFTER including SpamBlacklist.php.
 
-$wgSpamBlacklistFiles is an array, each value containing either a URL, a filename
+$wgSpamBlacklistFiles is an array, each value containing either a URL, a filename
 or a database location. Specifying a database location allows you to draw the
 blacklist from a page on your wiki. The format of the database location
 specifier is "DB: <db name> <title>".
 
 Example:
 
-require_once( "$IP/extensions/SpamBlacklist/SpamBlacklist.php" );
-$wgSpamBlacklistFiles = array(
-"$IP/extensions/SpamBlacklist/wikimedia_blacklist", // Wikimedia's list
-
-// database title
-"DB: wikidb My_spam_blacklist",
-);
+wfLoadExtension ( 'SpamBlacklist' );
+$wgSpamBlacklistFiles = [
+"$IP/extensions/SpamBlacklist/wikimedia_blacklist", // Wikimedia's list
+"DB: wikidb My_spam_blacklist", // database (wikidb), title (My_spam_blacklist)
+];
 
 The local pages [[MediaWiki:Spam-blacklist]] and [[MediaWiki:Spam-whitelist]]
 will always be used, whatever additional files are listed.
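
Note on the hunk above: the README now documents two loading styles, one for MediaWiki 1.25 and later and one for 1.24 and earlier. A minimal sketch of a LocalSettings.php fragment that works on either is shown below. It is not part of this commit. It assumes that the presence of wfLoadExtension() is an acceptable way to tell the two cases apart, and it reuses the example sources (wikidb, My_spam_blacklist) from the README.

```php
// LocalSettings.php fragment (sketch only, not part of this commit).
// Assumption: wfLoadExtension() is defined on MediaWiki 1.25 and later,
// so its presence is used to choose the loading style.
if ( function_exists( 'wfLoadExtension' ) ) {
	wfLoadExtension( 'SpamBlacklist' );
} else {
	require_once "$IP/extensions/SpamBlacklist/SpamBlacklist.php";
}

// Extra sources, set after the extension is loaded as the README requires.
// array() is used instead of [] so the fragment also parses on old PHP.
$wgSpamBlacklistFiles = array(
	"$IP/extensions/SpamBlacklist/wikimedia_blacklist", // file on disk
	"DB: wikidb My_spam_blacklist",                     // "DB: <db name> <title>"
);
```

The local pages MediaWiki:Spam-blacklist and MediaWiki:Spam-whitelist are still consulted in addition to whatever is listed here.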
@@ -63,16 +69,16 @@ Internally, a regex is formed which looks like this:
 
 A few notes about this format. It's not necessary to add www to the start of
 hostnames, the regex is designed to match any subdomain. Don't add patterns
-to your file which may run off the end of the URL, e.g. anything containing
+to your file which may run off the end of the URL, e.g. anything containing
 ".*". Unlike in some similar systems, the line-end metacharacter "$" will not
 assert the end of the hostname, it'll assert the end of the page.
 
 Performance
 -----------
 
-This extension uses a small "loader" file, to avoid loading all the code on
-every page view. This means that page view performance will not be affected
-even if you are not running a PHP bytecode cache such as Turck MMCache. Note
+This extension uses a small "loader" file, to avoid loading all the code on
+every page view. This means that page view performance will not be affected
+even if you are not running a PHP bytecode cache such as Turck MMCache. Note
 that a bytecode cache is strongly recommended for any MediaWiki installation.
 
 The regex match itself generally adds an insignificant overhead to page saves,
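
Note on the hunk above: the format notes (subdomains match without "www", and "$" asserts the end of the page rather than the end of the hostname) are easier to see in a small experiment. The sketch below is not the extension's code. It assumes a combined pattern of the general shape "scheme, hostname characters, alternation of all blacklist lines" with case-insensitive matching, since the exact pattern is not shown in this hunk. The blacklist lines and the isBlocked() helper are hypothetical.

```php
<?php
// Hypothetical blacklist lines, one regex fragment per line.
$lines = array( 'example\.com', 'cheap-pills\.' );

// Assumed pattern shape: scheme, any run of hostname characters (this is
// what lets subdomains match), then an alternation of every line.
$regex = '!http://[a-z0-9\-.]*(' . implode( '|', $lines ) . ')!i';

// Returns true when $text contains a URL matched by $regex.
function isBlocked( $regex, $text ) {
	return preg_match( $regex, $text ) === 1;
}

var_dump( isBlocked( $regex, 'See http://example.com/buy' ) );     // bool(true)
var_dump( isBlocked( $regex, 'See http://www.example.com/buy' ) ); // bool(true), no "www" line needed
var_dump( isBlocked( $regex, 'See http://example.org/' ) );        // bool(false)

// "$" anchors to the end of the whole text, not to the end of the hostname.
$anchored = '!http://[a-z0-9\-.]*(example\.com$)!i';
var_dump( isBlocked( $anchored, 'See http://example.com/buy' ) );  // bool(false), the link is not caught
var_dump( isBlocked( $anchored, 'See http://example.com' ) );      // bool(true), only at the very end
```

In this sketch a line ending in "$" only blocks a link that happens to be the last thing on the page, which is why the README advises against relying on it.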
@@ -80,7 +86,7 @@ on the order of 100ms in our experience. However loading the spam file from disk
 or the database, and constructing the regex, may take a significant amount of
 time depending on your hardware. If you find that enabling this extension slows
 down saves excessively, try installing MemCached or another supported data
-caching solution. The SpamBlacklist extension will cache the constructed regex
+caching solution. The SpamBlacklist extension will cache the constructed regex
 if such a system is present.
 
 Caching behavior
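
Note on the hunk above: the README recommends MemCached or another supported cache so that the constructed regex is reused between saves. A minimal LocalSettings.php sketch for pointing MediaWiki at a memcached server follows. It is not part of this commit, and the server address is a placeholder for a local instance on the default port.

```php
// LocalSettings.php fragment (sketch only, not part of this commit).
// Use memcached as the main object cache.
$wgMainCacheType = CACHE_MEMCACHED;

// Placeholder address: a memcached instance on the same host, default port.
$wgMemCachedServers = array( '127.0.0.1:11211' );
```

On a single-server wiki without memcached, setting $wgMainCacheType to CACHE_ACCEL (an in-process cache such as APCu) is a possible alternative, provided the corresponding PHP extension is installed.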
@@ -103,7 +109,7 @@ Stability
 ---------
 
 This extension has not been widely tested outside Wikimedia. Although it has
-been in production on Wikimedia websites since December 2004, it should be
+been in production on Wikimedia websites since December 2004, it should be
 considered experimental. Its design is simple, with little input validation, so
 unexpected behavior due to incorrect regular expression input or non-standard
 configuration is entirely possible.
@@ -114,18 +120,18 @@ Obtaining or making blacklists
 The primary source for a MediaWiki-compatible blacklist file is the Wikimedia
 spam blacklist on meta:
 
-http://meta.wikimedia.org/wiki/Spam_blacklist
+https://meta.wikimedia.org/wiki/Spam_blacklist
 
-In the default configuration, the extension loads this list from our site
+In the default configuration, the extension loads this list from our site
 once every 10-15 minutes.
 
-The Wikimedia spam blacklist can only be edited by trusted administrators.
-Wikimedia hosts large, diverse wikis with many thousands of external links,
-hence the Wikimedia blacklist is comparatively conservative in the links it
+The Wikimedia spam blacklist can only be edited by trusted administrators.
+Wikimedia hosts large, diverse wikis with many thousands of external links,
+hence the Wikimedia blacklist is comparatively conservative in the links it
 blocks. You may want to add your own keyword blocks or even ccTLD blocks.
 You may suggest modifications to the Wikimedia blacklist at:
 
-http://meta.wikimedia.org/wiki/Talk:Spam_blacklist
+https://meta.wikimedia.org/wiki/Talk:Spam_blacklist
 
 To make maintenance of local lists easier, you may wish to add a DB: source to
 $wgSpamBlacklistFiles and hence create a blacklist on your wiki. If you do this,