If a page is being deleted, use the ArticleDelete hook to queue a list
of URLs that are being "removed" from the page. The
ArticleDeleteComplete hook then actually sends the logs, so if
something prevents the deletion, nothing is logged.
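The queue-then-send split can be sketched as follows. This is a minimal
Python illustration, not the extension's PHP code; the class, method, and
attribute names are hypothetical, and a plain list stands in for the real
log sink.

```python
class DeletionLinkLogger:
    """Hypothetical stand-in for the extension's per-request logging state."""

    def __init__(self):
        self.pending = []  # removals queued but not yet sent
        self.sent = []     # stands in for the real logging backend

    def on_article_delete(self, current_urls):
        # Pre-delete hook: only queue, because the deletion may still be aborted.
        self.pending = [("remove", url) for url in sorted(current_urls)]

    def on_article_delete_complete(self):
        # Post-delete hook: the deletion went through, so flush the queue.
        self.sent.extend(self.pending)
        self.pending = []


logger = DeletionLinkLogger()
logger.on_article_delete({"http://example.org/a", "http://example.com/b"})
sent_before_complete = list(logger.sent)  # still empty at this point
logger.on_article_delete_complete()
```

If the second hook never runs, the queue is simply dropped with the
request and nothing reaches the log.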
Bug: T115119
Change-Id: I32e357bb88305a46251b05714a4ff75b75ae37aa
Previously, if there were no links left on the page, we skipped the spam
blacklist filtering entirely to avoid having to do blacklist lookups,
etc. However, since we want link removal data, explicitly check for this
scenario, mark all current links as removals, and skip the rest of the
spam blacklist code.
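A minimal sketch of that special case, in Python rather than the
extension's PHP. The function name, the dictionary keys, and the
"run_filter" flag are assumptions made for illustration only.

```python
def diff_links(old_links, new_links):
    """Return added/removed links and whether the blacklist filter should run."""
    old, new = set(old_links), set(new_links)
    if not new:
        # No links remain: every existing link is a removal, and there is
        # nothing for the blacklist to check.
        return {"added": [], "removed": sorted(old), "run_filter": False}
    return {
        "added": sorted(new - old),
        "removed": sorted(old - new),
        "run_filter": True,
    }


emptied = diff_links(["http://a.example/", "http://b.example/"], [])
```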
Bug: T115119
Change-Id: I0bcd5b55594e38c0508b21db2c45e5136123efa0
In the common case where no banned links were found, cache this
information to skip the checks on save.
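One way to picture this is a cache of link sets already known to be
clean. The sketch below is Python, not the extension's code; the class,
the key derivation, and the in-memory set are all assumptions, since the
message does not say how or where the result is cached.

```python
import hashlib


class CleanLinkCache:
    """Hypothetical cache that remembers link sets with no banned links."""

    def __init__(self):
        self._clean_keys = set()

    @staticmethod
    def _key(links):
        joined = "\n".join(sorted(set(links)))
        return hashlib.sha1(joined.encode("utf-8")).hexdigest()

    def find_banned(self, links, is_banned):
        key = self._key(links)
        if key in self._clean_keys:
            return []  # known clean: skip the checks entirely
        matches = [link for link in sorted(set(links)) if is_banned(link)]
        if not matches:
            self._clean_keys.add(key)  # only the "nothing found" result is cached
        return matches


checks = []


def is_banned(link):
    checks.append(link)  # record every real check that runs
    return "spam" in link


cache = CleanLinkCache()
first = cache.find_banned(["http://ok.example/"], is_banned)
second = cache.find_banned(["http://ok.example/"], is_banned)
```

The second call returns without invoking is_banned at all.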
Change-Id: I5f936622bc62d9fc905edaa2a69f52388c047d10
It was only needed for MediaWiki prior to 1.25
(09a5febb7b024c0b6585141bb05cba13a642f3eb).
We no longer support those versions after
5d882775f6.
Bug: T137832
Change-Id: I97f6a3c20476f1a42e3fadc701df5870a30c790c
* Have SpamBlacklist::doLogging() actually run
* Bump schema ID so userId property is an integer
* Don't try logging URLs that were unable to be parsed
* Make sure path/query/fragment are always strings
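The last two points can be illustrated with a small normalization step.
This is a Python sketch using urllib.parse, not the extension's PHP; the
function name and the "domain" key are assumptions, and only
path/query/fragment are taken from the list above.

```python
from urllib.parse import urlsplit


def to_log_record(url):
    """Return a log record for url, or None if it cannot be parsed."""
    try:
        parts = urlsplit(url)
    except ValueError:
        return None  # unparseable: do not try to log it
    if not parts.scheme or not parts.netloc:
        return None  # not an absolute URL, nothing useful to log
    return {
        "domain": parts.hostname or "",
        # Coerce so these fields are strings even when a component is absent.
        "path": str(parts.path or ""),
        "query": str(parts.query or ""),
        "fragment": str(parts.fragment or ""),
    }


record = to_log_record("http://example.org")
```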
Bug: T115119
Change-Id: Ia81037e8939dd547f00e79c169fa84ca0a7b917e
If enabled, changes in URLs on a page will be logged to the
"ExternalLinkChange" schema. To avoid extra lookups, the diff of URLs is
calculated during the filter step of the SpamBlacklist and stored in
the SpamBlacklist instance state until the post-save hook is called, at
which point the changes are queued to go to EventLogging.
Bug: T115119
Change-Id: I9a5378dca5ab473961f9fe8f7a6d929dc6d32bba
* This works via plugging into ApiStashEdit.
* The query is relatively slow per performance.wikimedia.org/xenon/svgs/daily/2016-02-15.index.svgz.
Change-Id: I0ad5289324b5482db7e2276f58fc1ac140250d47
If neither the edit body nor the edit summary contains any external links, bail
early rather than load and parse the blacklist.
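The early exit looks roughly like this. The sketch is Python rather than
the extension's PHP; the URL pattern is deliberately simplified, and
find_links, filter_edit, and load_blacklist are hypothetical names.

```python
import re

# Simplified pattern for illustration; the real link detection is more involved.
URL_RE = re.compile(r"https?://[^\s\[\]<>\"]+", re.IGNORECASE)


def find_links(text):
    return URL_RE.findall(text)


def filter_edit(body, summary, load_blacklist):
    """Return the links that match the blacklist, loading it only if needed."""
    links = find_links(body) + find_links(summary)
    if not links:
        return []  # bail early: the blacklist is never loaded or parsed
    blacklist = load_blacklist()
    return [link for link in links if any(p.search(link) for p in blacklist)]


load_calls = []


def load_blacklist():
    load_calls.append(1)  # count how often the expensive load happens
    return [re.compile(r"spam\.example")]


no_links = filter_edit("plain text", "typo fix", load_blacklist)
calls_after_plain_edit = len(load_calls)
with_link = filter_edit("see http://spam.example/x", "", load_blacklist)
```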
Change-Id: I6863aa9618db4db05561253f7625fbe232222d3a
Ie71ebdeb should fix the bug in Wikibase that prompted the original
reversion.
This reverts commit 2745442aec.
Change-Id: I085996669db7e0fcbf839b8d38d020c8d2e09220
This resulted in doubling the appserver-memcached traffic across the
Wikimedia cluster.
This reverts commit 32b546a223.
Change-Id: I03e96a1bb223360e62d47f98a505cc5b26e5aadf
For Wikidata, this causes users to be unable to create items (bug 59797).
In the case of Special:NewItem, $context->getWikiPage() calls
$context->getTitle(), which returns the special page, and then calls
WikiPage::factory with that $title. A special page does not make a valid
WikiPage, so an exception is thrown.
Previously, $title was the special page and was used in getParserOutput.
$title is invalid there too, but Wikibase never used that variable.
It would be good to make sure $title is actually related to the content.
This reverts commit 508a3706d6.
Change-Id: Ib24515be6e76d3f29e2f9048fbb81e5a25b5857a
Using WikiPage::prepareContentForEdit instead of
Content::getParserOutput allows us to share the cached parser output
with other hooks that run during the edit process.
Note SpamBlacklistHooks::filterAPIEditBeforeSave already does this.
Bug: 57026
Change-Id: I8c8b293af2842411fd95d3bc21e966a72b2a78b4
This changes SpamBlacklist to make use of the new
ContentHandler-aware hooks.
This change also includes some refactoring and cleanup which made
the migration to the new hooks easier.
Change-Id: I21e9cc8479f2b95fb53c502f6e279c8a1ea378a5
provide all blocked URLs").
SpamBlacklist extension to provide all matched URLs to
spamPageWithContent() rather than just one. The performance
hit is negligible, and zero for all edits that don't hit the
SpamBlacklist (99.999%+).
DEPENDENT ON OTHER HALF OF FIX (now in core):
https://gerrit.wikimedia.org/r/3740
Change-Id: Ia951d5795c5cedb6c3876be89f8a08f110004102