Commit graph

36 commits

Author SHA1 Message Date
Umherirrender 9733b3b40d Break long lines
Prepare to make phpcs pass

Change-Id: I814715c63af5ed5db673786e1fdab9fc59441b67
2017-06-03 13:13:31 +02:00
Kunal Mehta 637a7435ce Trigger Schema:ExternalLinksChange logging on page deletion
If a page is being deleted, use the ArticleDelete hook to queue a list
of URLs that are being "removed" from the page. The
ArticleDeleteComplete hook then actually sends the logs, so if
something prevents the deletion, nothing is logged.

Bug: T115119
Change-Id: I32e357bb88305a46251b05714a4ff75b75ae37aa
2016-09-29 14:14:01 +00:00
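The queue-then-send pattern described in this commit can be sketched roughly as follows. This is an illustration in Python, not the extension's PHP code; the class name `DeletionLinkLogger`, its method names, and the event layout are all hypothetical.

```python
class DeletionLinkLogger:
    """Hypothetical stand-in for the extension's per-request logging state."""

    def __init__(self, send):
        self._send = send      # callable that delivers one log event
        self._pending = {}     # page id -> URLs queued for logging

    def on_delete(self, page_id, current_urls):
        # Pre-delete hook: only remember the URLs, send nothing yet.
        self._pending[page_id] = sorted(set(current_urls))

    def on_delete_complete(self, page_id):
        # Post-delete hook: the deletion went through, so emit the removals.
        for url in self._pending.pop(page_id, []):
            self._send({"action": "remove", "url": url})


sent = []
logger = DeletionLinkLogger(sent.append)
logger.on_delete(1, ["http://b.example/", "http://a.example/"])
sent_before_complete = list(sent)   # still empty: nothing is logged yet
logger.on_delete_complete(1)
```

If the deletion is aborted between the two hooks, `on_delete_complete` is never called and the queued URLs are simply dropped.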
Kunal Mehta 2cac3f9ecd Fix Schema:ExternalLinksChange logging if no links are left on page
If there are no links left on the page, we would avoid invoking the spam
blacklist filtering entirely, to avoid having to do blacklist lookups,
etc. However, since we want link removal data, explicitly check for this
scenario and mark all current links as removals, and avoid invoking the
rest of the spam blacklist code.

Bug: T115119
Change-Id: I0bcd5b55594e38c0508b21db2c45e5136123efa0
2016-09-29 14:11:33 +00:00
Aaron Schulz f051498dcc Fix links passed to filter() for stashing to match edit checks
Change-Id: I8f1b2c4d3033015de0e9a6d58776fe0ad32c4775
2016-09-08 17:47:09 -07:00
Matthias Mullie 21de00842b Add 'message' property to API output
Bug: T141492
Change-Id: I357a563397cbc13c1cfbe3cbefb876408021ecb6
2016-08-17 11:53:47 +02:00
Kunal Mehta fb4dcf5565 Set $wgBlacklistSettings in extension.json
So people can actually override it without live hacking...

Change-Id: Id3b18b5b255fa1df321d34c4ce39849e1b545eec
2016-07-28 21:24:14 -07:00
Matthias Mullie dee68e3ab1 Filter file uploads
Bug: T134453
Change-Id: I140e8fec71e05db9e4625400e9a9dfe9a42d9635
2016-07-22 16:02:50 +02:00
jenkins-bot e5227407a8 Merge "Improve use of edit stash hook to check links" 2016-07-10 23:33:02 +00:00
Aaron Schulz d29aca496a Improve use of edit stash hook to check links
In the common case where no banned links were found, cache this
information to skip the checks on save.

Change-Id: I5f936622bc62d9fc905edaa2a69f52388c047d10
2016-07-10 16:17:51 -07:00
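A minimal sketch of the idea, assuming the "no banned links" result is cached under a hash of the link list. It is written in Python for illustration only; `LinkCheckCache`, `find_banned`, and the in-memory set are hypothetical and stand in for whatever cache and key scheme the extension actually uses.

```python
import hashlib


class LinkCheckCache:
    """Remembers link lists that were already found to be clean."""

    def __init__(self, find_banned):
        self._find_banned = find_banned   # callable: links -> banned links
        self._known_clean = set()         # keys of link lists with no matches
        self.lookups = 0                  # how often the blacklist was consulted

    @staticmethod
    def _key(links):
        joined = "\n".join(sorted(set(links)))
        return hashlib.sha1(joined.encode("utf-8")).hexdigest()

    def on_stash(self, links):
        # Edit stash step: run the check early and cache a clean result.
        self.lookups += 1
        if not self._find_banned(links):
            self._known_clean.add(self._key(links))

    def on_save(self, links):
        # Save step: skip the check when the same links were already clean.
        if self._key(links) in self._known_clean:
            return []
        self.lookups += 1
        return self._find_banned(links)
```

Only the negative result is cached here, so a link list that did match is always re-checked on save.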
jenkins-bot 2e9ae98b8a Merge "Fix bugs in Schema:ExternalLinksChange code" 2016-07-06 02:04:41 +00:00
Brad Jorsch 93df3ed07a Use EditFilterMergedContent instead of APIEditBeforeSave hook
It was only needed for MediaWiki prior to 1.25
(09a5febb7b024c0b6585141bb05cba13a642f3eb).
We no longer support those versions after
5d882775f6.

Bug: T137832
Change-Id: I97f6a3c20476f1a42e3fadc701df5870a30c790c
2016-06-23 17:54:09 +00:00
Kunal Mehta a69fe26b94 Fix bugs in Schema:ExternalLinksChange code
* Have SpamBlacklist::doLogging() actually run
* Bump schema ID so userId property is an integer
* Don't try logging URLs that could not be parsed
* Make sure path/query/fragment are always strings

Bug: T115119
Change-Id: Ia81037e8939dd547f00e79c169fa84ca0a7b917e
2016-06-22 12:18:11 +02:00
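The last two bullets can be illustrated with a short Python sketch. The function name `url_to_event` and the field names are assumptions, not the schema's confirmed layout; the point is only that unparseable URLs are skipped and that missing components become empty strings.

```python
from urllib.parse import urlsplit


def url_to_event(url):
    """Return a dict of string fields for a URL, or None if it cannot be parsed."""
    try:
        parts = urlsplit(url)
    except ValueError:
        return None                      # malformed URL: do not log it
    if not parts.scheme or not parts.netloc:
        return None                      # no scheme or host: treat as unparseable
    return {
        "domain": parts.hostname or "",
        "path": parts.path or "",        # always a string, never None
        "query": parts.query or "",
        "fragment": parts.fragment or "",
    }
```

Callers would drop any `None` result before sending events.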
Kunal Mehta 0d9494cc45 Fix file permissions
Change-Id: I19de3ded6b17cdde7edce45ba3dee4dccfd29725
2016-06-09 16:18:31 -07:00
Gergő Tisza 303ba31639 Update for AuthManager
Needs I8b52ec8ddf494f23941807638f149f15b5e46b0c to
do anything useful.

Bug: T110467
Change-Id: Ifb6fea581a0d0ae8db46e82b6fa6d25239cf3d8e
2016-05-11 22:32:49 +00:00
jenkins-bot 211e88c042 Merge "Log URL changes to EventLogging if configured" 2016-05-02 15:49:29 +00:00
Kunal Mehta 5910bfd7ba Log URL changes to EventLogging if configured
If enabled, changes in URLs on a page will be logged to the
"ExternalLinksChange" schema. To avoid extra lookups, the diff of URLs is
calculated during the filter step of the SpamBlacklist and stored in
the SpamBlacklist instance state. Once the post-save hook is called, the
changes are queued to go to EventLogging.

Bug: T115119
Change-Id: I9a5378dca5ab473961f9fe8f7a6d929dc6d32bba
2016-04-25 17:54:48 +02:00
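As a rough illustration of the flow in this commit, the sketch below computes the URL diff in the filter step, keeps it in instance state, and hands it off after the save. It is Python rather than the extension's PHP, and `LinkChangeTracker`, its method names, and the event layout are hypothetical.

```python
class LinkChangeTracker:
    """Holds the URL diff between the filter step and the post-save step."""

    def __init__(self):
        self._pending = []   # list of (action, url) pairs awaiting the save

    def on_filter(self, old_urls, new_urls):
        # Filter step: both link lists are already at hand, so diff them here.
        old, new = set(old_urls), set(new_urls)
        self._pending = (
            [("add", u) for u in sorted(new - old)]
            + [("remove", u) for u in sorted(old - new)]
        )

    def on_save_complete(self, queue_event):
        # Post-save step: the edit was stored, so queue the recorded changes.
        for action, url in self._pending:
            queue_event({"action": action, "url": url})
        self._pending = []
```

Because the diff is a plain set difference, an edit that removes every link yields one "remove" entry per previously present URL.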
Aaron Schulz 2acfb30bfc Pre-cache the link list for external link filters
* This works via plugging into ApiStashEdit.
* The query is relatively slow per performance.wikimedia.org/xenon/svgs/daily/2016-02-15.index.svgz.

Change-Id: I0ad5289324b5482db7e2276f58fc1ac140250d47
2016-02-18 14:36:42 +00:00
Ori Livneh 55ce83b5e3 Don't check edits that don't contain links
If neither the edit body nor the edit summary contain any external links, bail
early rather than load and parse the blacklist.

Change-Id: I6863aa9618db4db05561253f7625fbe232222d3a
2015-11-27 20:13:27 -08:00
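The early bail-out might look like the following Python sketch. Detecting links with a simple regular expression is an assumption made for brevity, and `filter_edit` and `load_blacklist` are hypothetical names; the extension's actual link detection may differ.

```python
import re

# Assumption: a crude pattern is enough to decide whether any link is present.
URL_RE = re.compile(r"https?://\S+", re.IGNORECASE)


def filter_edit(body, summary, load_blacklist):
    """Return the blacklist patterns matched by the edit body or summary."""
    if not (URL_RE.search(body) or URL_RE.search(summary)):
        return []                      # no links: the blacklist is never loaded
    patterns = load_blacklist()        # expensive step, reached only if needed
    text = body + "\n" + summary
    return [p for p in patterns if re.search(p, text, re.IGNORECASE)]
```

The cost of loading and parsing the blacklist is paid only for edits that contain at least one link.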
Aaron Schulz abb5df87d3 Actually use clearCache() instead of copy-pasting key names
* Also follows up 7a02693e9b due to key name changes

Change-Id: I8c15a6fa0f6b85016ee3a7882a0d40eb761897c7
2015-07-31 14:02:14 -07:00
paladox 5d882775f6 Add extensions.json, empty PHP entry point, remove i18n shim
Bug: T88059
Change-Id: I730a2012609f7dfac3d49012ae14038e6bcac3ae
2015-05-20 19:19:31 +01:00
Aaron Schulz ca55c42a1e Conversion to using WAN cache
Bug: T93141
Change-Id: I67fa3e6e6d348953472a565bdbeccd8298c80f58
2015-04-30 01:32:35 +00:00
Anomie 047f7a2318 Revert "Revert "Use WikiPage::prepareContentForEdit in SpamBlacklistHooks::filterMergedContent""
Ie71ebdeb should fix the bug in Wikibase that prompted the original
reversion.

This reverts commit 2745442aec.

Change-Id: I085996669db7e0fcbf839b8d38d020c8d2e09220
2014-12-17 11:17:12 -05:00
aude e42ba056f2 Don't generate html when calling getParserOutput
all that is needed are the external links

Bug: 67361
Change-Id: I8c44c8c306f5f20cb5edd8314bf53e3483e314b7
2014-07-01 17:22:37 +02:00
Faidon Liambotis f9e2fed9bf Revert "Categorize pages containing blacklisted links"
This resulted in doubling the appserver-memcached traffic across the
Wikimedia cluster.

This reverts commit 32b546a223.

Change-Id: I03e96a1bb223360e62d47f98a505cc5b26e5aadf
2014-03-31 09:06:56 +03:00
Jackmcbarn 32b546a223 Categorize pages containing blacklisted links
Add pages containing links that match the spam blacklist to a tracking
category.

Change-Id: I694860bc77d05dccd81522efc23225481d51ee43
2014-03-11 11:23:38 -04:00
Aude 2745442aec Revert "Use WikiPage::prepareContentForEdit in SpamBlacklistHooks::filterMergedContent"
For Wikidata, this causes users to be unable to create items. (bug 59797)

In the case of Special:NewItem, $context->getWikiPage() calls
$context->getTitle() to get the special page and then calls
WikiPage::factory with the special page $title. A special page does not
make a valid WikiPage, so an exception is thrown.

Previously, $title was the special page and was used in getParserOutput.
$title is invalid there, but Wikibase never used that variable.

It would be good to make sure $title is actually related to the content.

This reverts commit 508a3706d6.

Change-Id: Ib24515be6e76d3f29e2f9048fbb81e5a25b5857a
2014-01-07 21:57:47 +00:00
Brad Jorsch 508a3706d6 Use WikiPage::prepareContentForEdit in SpamBlacklistHooks::filterMergedContent
Using WikiPage::prepareContentForEdit instead of
Content::getParserOutput allows us to share the cached parser output
with other hooks that run during the edit process.

Note SpamBlacklistHooks::filterAPIEditBeforeSave already does this.

Bug: 57026
Change-Id: I8c8b293af2842411fd95d3bc21e966a72b2a78b4
2013-12-14 10:24:33 -05:00
daniel a3defb8b91 (bug 51621) Make SBL aware of ContentHandler.
This changes SpamBlacklist to make use of the new
ContentHandler-aware hooks.

This change also includes some refactoring and cleanup which made
the migration to the new hooks easier.

Change-Id: I21e9cc8479f2b95fb53c502f6e279c8a1ea378a5
2013-08-24 19:55:55 +02:00
Siebrand Mazeland 2e6259f35f (bug 45461) Use email instead of e-mail
Change-Id: Ibad4c4c7c98f5b0c746d62f1b2b41a4a99360ee8
2013-02-27 12:50:47 +01:00
Siebrand Mazeland e9874344aa Maintenance for SpamBlacklist extension.
* Replace deprecated methods. MediaWiki 1.19 required.
* Replace <tt> with <code>.
* Update documentation.
* Use WikiPage instead of Article for doEdit().
* Use __DIR__ instead of dirname( __FILE__ ).
* Remove superfluous newlines.

Change-Id: I3a0e42ca404638f7c7934c316735ad11cbc99d42
2012-09-03 16:50:18 +02:00
jarry1250 20058848ab Other half of fix for bug #30332 ("API spamblocklist error should
provide all blocked URLs").

SpamBlacklist extension to provide all matched URLs to
spamPageWithContent() rather than just one. Performance
hit negligible and zero for all edits that don't hit the
SpamBlacklist (99.999%+).

DEPENDENT ON OTHER HALF OF FIX (now in core):
https://gerrit.wikimedia.org/r/3740

Change-Id: Ia951d5795c5cedb6c3876be89f8a08f110004102
2012-03-27 21:42:49 +01:00
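The change from reporting one match to reporting all of them can be sketched as below. This is an illustrative Python version with a hypothetical `find_all_matches` helper, not the extension's own matching code.

```python
import re


def find_all_matches(urls, patterns):
    """Return every URL that matches any pattern, in first-seen order."""
    regexes = [re.compile(p, re.IGNORECASE) for p in patterns]
    matched = []
    for url in urls:
        # Keep scanning after the first hit so the caller sees all offenders.
        if url not in matched and any(r.search(url) for r in regexes):
            matched.append(url)
    return matched
```

For an edit with no blacklisted links the loop still returns an empty list, so the extra work only shows up when something matches.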
Max Semenik 4a433ff423 Fix r109111: no point in aborting hook execution 2012-02-15 14:58:26 +00:00
Sam Reed f229bcf0fb Fix fixme on r109111 per Tbleher 2012-02-02 22:12:43 +00:00
Robin Pepermans ed81c5b979 Follow-up r109455: make it clear that it's about e-mail *addresses*, also fix consistency: email -> e-mail and E-mail -> e-mail. 2012-01-21 15:05:49 +00:00
John Du Hart aaf4d74d18 Adding Email blacklisting to the SpamBlacklist extension
This relies on r109111
2012-01-18 23:29:37 +00:00
John Du Hart 62b2bde146 Refactored SpamBlacklist to be extendable for other blacklist types
This is the groundwork for Bug 33761
2012-01-17 06:13:46 +00:00