See, this last part of the compiled regular expression is wrapped in
an (…)*, which means it is entirely optional. It does not make any
difference if this part is found or not. The compiled regular
expression matches with or without any of these "line ending"
fragments being present.
I cannot really figure out what the intention of this was. A line
ending anchor ($) is not missing – I'm pretty sure about this.
Otherwise it could not detect signatures that are wrapped in more
than a single HTML tag, for example.
Instead of fixing it I decided to remove it. The tests should show
this code was not needed.
The motivation for this patch is to improve performance. This part of
the regular expression is quite heavy and can cause a lot of
backtracking for literally zero benefit.
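A minimal sketch of why a trailing (…)* is a no-op, using made-up,
much shorter patterns than the real compiled expression (both pattern
strings are assumptions for illustration only). Without a trailing
anchor, a starred group at the end can always match zero times, so it
never changes whether the expression matches:

```php
<?php
// Hypothetical, simplified patterns; the real signature regex is far longer.
$withTail = '/\d{2}:\d{2}(?:\s*<\/\w+>)*/';
$withoutTail = '/\d{2}:\d{2}/';

$lines = [
	'12:34</span></small>', // "line ending" fragments present
	'12:34 and more text',  // fragments absent
	'no timestamp here',    // no match either way
];

$results = [];
foreach ( $lines as $line ) {
	// Both patterns agree on every input, because (…)* may match zero times.
	$results[] = [ preg_match( $withTail, $line ), preg_match( $withoutTail, $line ) ];
}
```

The only thing the tail changes is how much text ends up in the match
and how much work the regex engine does to find it.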
Bug: T203930
Bug: T204291
Change-Id: Ia5323b401b947edeb7094d7eec131ba6c80edf70
\h matches only horizontal whitespace, but no newlines. This is what
we want in all these cases, because nothing of this (headlines,
signatures, timestamps) is even allowed to span multiple lines in
wikitext. The tests should show this still succeeds.
The idea is to make these regular expressions more strict so they
don't run into so much expensive backtracking.
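A small sketch of the difference, with a made-up headline pattern
(not one of the expressions touched here): \s also accepts newlines,
while \h stays on the same line.

```php
<?php
// Hypothetical patterns, only for illustration.
$loose = '/^==\s*Title\s*==$/';
$strict = '/^==\h*Title\h*==$/';

$singleLine = "==  Title\t==";
$multiLine = "==\nTitle\n==";

$looseSingle = preg_match( $loose, $singleLine );   // spaces and tabs are fine
$strictSingle = preg_match( $strict, $singleLine ); // still fine with \h
$looseMulti = preg_match( $loose, $multiLine );     // \s happily crosses newlines
$strictMulti = preg_match( $strict, $multiLine );   // \h refuses to
```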
Bug: T203930
Bug: T204291
Change-Id: I805f8cb082edcd26713ef41d3ae5b61194c131e5
In double quoted strings PHP tries to understand all kinds of escape
sequences, but \A is not one of them. Such sequences are left untouched,
including the backslash.
In single quoted strings, only \\ and \' are treated as escape
sequences. Everything else is left untouched, which is what we want in
case of a regular expression.
TL;DR: The resulting string is the same in both cases. I'm touching this
because my IDE shows a warning about the unknown \A escape sequence.
It must be either turned into "\\A" or '\A'.
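A quick sketch to confirm that all spellings produce the same
two-character string (the variable names are made up for this
example):

```php
<?php
$double = "\A";   // unknown escape in double quotes: the backslash is kept
$escaped = "\\A"; // explicitly escaped backslash
$single = '\A';   // single quotes: left as written

$sameStrings = ( $double === $single ) && ( $escaped === $single );
$length = strlen( $single );

// Either way, \A anchors the pattern at the very start of the subject.
$atStart = preg_match( '/\Afoo/', "foo\nbar" );
$notAtStart = preg_match( '/\Afoo/', "bar\nfoo" );
```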
Change-Id: Ie1e84c67c344faf77bc86a0b28dc82d31c3a7dbe
This patch adds a few strict type hints on the language level, not
only on the PHPDoc level as my other patches do.
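To illustrate the difference, a minimal sketch with two hypothetical
functions (neither exists in this code base): a PHPDoc tag is only a
hint for humans and tools, while a type declaration is enforced by
PHP at call time.

```php
<?php
/**
 * PHPDoc only: nothing stops a caller from passing something else.
 * @param string $username
 * @return string
 */
function greetDocOnly( $username ) {
	return 'Hello ' . $username;
}

/**
 * Language level: PHP itself rejects arguments of the wrong type.
 */
function greetTyped( string $username ): string {
	return 'Hello ' . $username;
}

$docOnlyResult = greetDocOnly( 42 ); // silently accepted and converted

$caught = false;
try {
	greetTyped( [ 'not', 'a', 'string' ] );
} catch ( TypeError $e ) {
	$caught = true;
}
```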
Change-Id: Ie66f9ebf80317dcaf13e2e96a93332a1a93cebbe
This change requires MediaWiki 1.32 which is already required in
extension.json.
Change-Id: I61856796d864c9493c1a7a875cb2415f11f081a9
Depends-On: I193f5b9a95430b0a05573c361715e053e5411e32
Most modern IDEs as well as documentation generators understand the
keywords "false" and "true", when a bool can only be one of the two.
Change-Id: I83dd1f0cc0802fa74ee35e7ca7425615230a767f
There are about 200 such generic "array" type hints in this code base,
the majority in @param tags. I started with what I found most relevant:
@var and @return tags. I might continue working on this later, but
wanted to stop for now to keep this patch moderately small.
Change-Id: Iff0d9590a794ae0f885466ef6bb336b0b42a6cd3
Tested with the quick preview (Ctrl+Q) feature in PHPStorm.
I'm also updating a few type hints I could not split off into a separate
patch, because the lines are too close to each other.
Change-Id: I312ec601a5f443c2b12515e34c574b8889c4c128
Explaining that a variable named "$username" contains a "username" is
not helpful. One has to read this comment first to understand that it
does not add anything to what's already obvious from the variable name
and the type.
Change-Id: I9a43866498d0c94422caf16233f502320a8e36c9
See I2291c69d9df17c1a9e4ab1b7d4cbc73bc51d3ebb for the anticipated
hard-deprecation of this method in core.
Bug: T197492
Change-Id: I4687db09c27480147cfa7a648a886b1670812deb
See MediaWiki core change Ied5fe1a61. There's no need for a dependency
here, though, since it'll just ignore the extra parameter.
Change-Id: Iff28b00638c15de7307a130196bbb91cda91c3d1
User objects haven't been stubbed in a while, and language objects
aren't being stubbed anymore.
While we're here, swap a few MWException -> InvalidArgumentException
since they're more accurate :)
Change-Id: I7e2f2aa135b024fb653c3ec13181d7015383ff2f
The latter doesn't trim(), so add trim() calls in some cases.
User input is trimmed, parsed i18n messages are not.
Change-Id: I933a6a929bf7d3e2d1623ea537227dc8c731cb6f
When content is changed and the change contains the signature
of the user, the method checking for reasonable mentions in
that change did not consider multiple signatures.
The patch fixes that and adds a test for it.
Bug: T154406
Change-Id: I86303f42e97d16c68e3235b0e2d13542ceedf1fe
This sends out a notification when a user gets mentioned in a change as
long as a signature is added in the same section.
Bug: T138938
Change-Id: Ie183fbb8150bd9451a5b0a9fea0227e3241b26a0
This patch fixes mentions not being sent when multiple sections were added
in between sections.
Since we only want to send mentions when user links and a signature are
present in the same section, a new method was added that extracts sections
and their related content from an addition. Each section's content is then
checked for a signature to decide whether it might be relevant for mentions.
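The idea can be sketched roughly like this. Both functions are
hypothetical stand-ins, not the methods added by this patch, and the
signature check is deliberately naive (a user link followed by "(UTC)"
on the same line):

```php
<?php
/**
 * Hypothetical helper: split added wikitext into sections. Each entry holds
 * the headline ('' for text before the first headline) and its content.
 */
function splitIntoSectionsSketch( string $addition ): array {
	$sections = [];
	$current = [ 'header' => '', 'content' => '' ];
	foreach ( explode( "\n", $addition ) as $line ) {
		if ( preg_match( '/^(=+)\h*(.+?)\h*\1\h*$/', $line, $m ) ) {
			// A new headline closes the previous section, unless it is empty.
			if ( $current['header'] !== '' || trim( $current['content'] ) !== '' ) {
				$sections[] = $current;
			}
			$current = [ 'header' => $m[2], 'content' => '' ];
		} else {
			$current['content'] .= $line . "\n";
		}
	}
	$sections[] = $current;
	return $sections;
}

/** Naive stand-in for real signature detection. */
function hasSignatureSketch( string $content ): bool {
	return (bool)preg_match( '/\[\[User:[^\]]+\]\].*\(UTC\)/', $content );
}

$addition = "== First ==\nHi [[User:Alice]]!\n"
	. "== Second ==\nHello [[User:Bob]]. [[User:Carol|Carol]] 12:00, 1 May 2016 (UTC)\n";

// Only sections that contain a signature are relevant for mentions.
$relevant = array_filter( splitIntoSectionsSketch( $addition ), function ( array $section ) {
	return hasSignatureSketch( $section['content'] );
} );
$headers = array_column( $relevant, 'header' );
```

Here the link to Alice is ignored because the "First" section is
unsigned, while the "Second" section qualifies.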
Bug: T141863
Change-Id: I434c664552bbadbeef6e897e20703e813f5a4c52
This logs whenever a user gets mentioned in a change as
long as a signature is added in the same edit.
Bug: T138938
Change-Id: I2a775d1dcac6a947b353c8bd2f7be70b6384641f
This patch logs multiple section edits that could trigger mentions.
Since we only want to send mentions when user links and a signature are
present in the same section, a new method was added that extracts sections
and their related content from an addition. Each section's content is then
checked for a signature to decide whether it might be relevant for mentions.
Bug: T141863
Change-Id: Ib06cd855b2c7fbd51d8ab6602882cb38aadf8350
While manually rebasing the bundle patch, the wrong
line was removed.
Also improves tests to check for notifyAgent.
See I1069aeb5523db8710da4e8e21065bf447d031e3c
Change-Id: I33ddeccea153d6f6ae97e5c60e8b47dc24fb4833
Adds common bundling including messages and icons.
Bundling relates to revision now.
Changed the order in which notifications are generated. Now errors
will show first, since they are generated last.
Bug: T140224
Change-Id: I1069aeb5523db8710da4e8e21065bf447d031e3c
This will allow us to:
1) Fix a bug involving showing the sig in the snippets
   of mentions (Something catrope mentioned to me but
   I do not know of a bug number for it)
2) Send more accurate sameUser failure metrics to
   graphite as signature links would never be counted
   as a mention
Change-Id: I33677012673ae6e4665aaaf59d4f350602f7276a
Adds new notification type and icon for successful mentions.
Complements existing test to consider successful mentions.
Bug: T139623
Change-Id: I7a77b40e8b14c95cadb9023065ee916247feacf9
- Adds global "$wgEchoMentionStatusNotifications"
  to activate mention status notifications.
  (must be set before extension is loaded)
- Adds notification types and icon for some basic mention
  failures.
- Adds failure and stats for anonymous IP.
- Adds check for links to user subpages.
- Adds config var for max mention notifications allowed.
- Bundles notifications.
- Refactors test for the event generation and adds tests
  for unknown users, user links with subpages and failures
  for too many mentions.
Bug: T136326
Change-Id: I388bdc3714feb9a2865a5ad10dbeabb0a6a09a4f
It's probably not realistically possible for a revision to be oversighted
by the time generateMentionEvents() runs, but for consistency
we should be using RAW here.
Change-Id: If73b4abe5fbae5cadb75c5e09137299873f2a764
Right now we don't actually know how many times
each of these cases happens, so add some basic tracking
so we can make some informed decisions.
Bug: T135719
Change-Id: Id4d519aefe96ecca2e3c51dd1c8128de70d0caac
Formatters based on presentation models for
individual event emails and digest (daily, weekly)
plain text emails.
Bug: T121067
Change-Id: I4eceaf521315adab7429a8a73ffca70ebcddab86
formatSummary() was first parsing the summary using the
summary parser, then handing off the resulting HTML to
getTextSnippet() which parsed it again with the normal parser.
Bug: T131087
Change-Id: I2724ccb7c23579b3f02dea57d4fc833079169adf
The previous implementation did the following weird things:
* Stripped tags before parsing
* Stripped templates before parsing using a hacky while loop
  that bails after ten attempts
* Decoded entities using htmlspecialchars_decode(), while
  html_entity_decode() makes more sense
* ...which meant it had to manually convert &nbsp; back
  to spaces, which is not necessary if you use html_entity_decode()
* Removed any single braces ('{' and '}') from the output
* Rejected the entire output if there were any entities left,
  which is fairly likely since htmlspecialchars_decode()
  only decodes a few of them
Instead of all this, just parse, strip tags, decode entities
(all of them, not just a few), trim and truncate. In particular,
don't strip templates, because we use getTextSnippet() in mention
notifications, which look weird when {{ping}} templates are stripped.
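A rough sketch of the simplified pipeline, leaving out the parsing
step. textSnippetSketch() is a hypothetical stand-in written for this
example, not the actual getTextSnippet() implementation:

```php
<?php
// htmlspecialchars_decode() only knows a handful of entities,
// so named entities like &eacute; survive untouched.
$partial = htmlspecialchars_decode( '&eacute; &amp;' );

/**
 * Hypothetical stand-in: strip tags, decode all entities, trim, truncate.
 * Templates are deliberately left alone.
 */
function textSnippetSketch( string $html, int $length ): string {
	$text = strip_tags( $html );
	$text = html_entity_decode( $text, ENT_QUOTES | ENT_HTML5, 'UTF-8' );
	$text = trim( $text );
	// Truncate by characters, not bytes, so multibyte text is not cut in half.
	return mb_substr( $text, 0, $length, 'UTF-8' );
}

$snippet = textSnippetSketch( '<b>Caf&eacute;</b> &amp; {{ping|Bob}} more ', 19 );
```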
Bug: T129531
Change-Id: I956b2f6badc40d2f5bf90a0458ccab8b8fc6fefb
getTextSnippet() has a `Language` type hint that will fatal if $wgLang
is a StubUserLang object, so make sure we unstub it if nothing else
already has.
Bug: T118542
Change-Id: I847680074fbbf95bbe3b6002151d2a18c45ebe6e
To avoid using $wgLang directly. We still have to use it in
detectSectionTitleAndText for now though.
Change-Id: Ic901ed05d4e8f6291caa55d866ce58f7300880f5
This preference has been disabled since bug 47562, and doesn't make
sense to keep around given that the flyout is the main interaction most
users have with Echo.
Change-Id: I7e8ddf96dbde9a95ac01a0cc83bad396151d01bd
Pull out the logic that extracts usernames from links. This allows
it to be reused by the LQT->Flow import code.
Bug: T101979
Change-Id: Ib16a09cf1f388f56944cd1bb564384535728156e
* Do not default section to footer. If the section
  is not found, it is left empty and the notification
  message is simpler.
* Change notification-edit-talk-page-email-batch-body2
  Replace : at the end with . so it does not look
  incomplete.
Bug: T99989
Change-Id: Ic982a81eada388d750760787245dea8f72368147
Link the bottom of the talk page and use the edit summary as text
if the parser failed to find something. This is what core's enotif
does already.
Change-Id: Iadc7011ea2627e00f0c51472da7aad1355afeddb
* Parser generates signature to compare against
* Signature can be overwritten per wiki, in NS_MEDIAWIKI
* Such an overwritten default can differ depending on
  the page the signature is on[1]
* Our comparison signature generation was page-agnostic
  (always from Title::newMainPage)
* Signatures didn't match up on own talk pages, where
  the default signature is different
Also added 2 new test cases & improved tests by also
setting the page
1: https://en.wikipedia.org/w/index.php?title=MediaWiki%3ASignature&diff=176507985&oldid=176229132
Bug: T78424
Change-Id: Ice151d4d16236a5d1556ef62805b61310c7beb85
Previously, there were a couple of hacks in play.
It was also not picking up ~~~ (signature without timestamp).
And it relied on a nasty regular expression which, although
based on Parser, may some day get out of date.
And it relied heavily on a specific signature format, which
isn't guaranteed (it's an i18n msg).
This patch changes the approach: it will use a very simple
regex to match links, and will send those through Parser to
generate the signature anew. My reasoning is that that should
be exactly the same as what Echo just received (should've
also gone through parser)
The biggest discomfort of this approach is that it's much stricter.
It should still match whatever was generated from a ~~~ or ~~~~,
but no longer e.g. the not-real signatures we were using in
our tests. I also had to update our tests, because signatures
change depending on whether the user is anon, so I had to generate
all the users and fix some of the signature formats used in the tests.
Bug: T75426
Bug: T87852
Bug: T75366
Bug: T78424
Change-Id: Ibeff36397129fdd5d376f3668a23a45f9a014525
In some languages the \w+ does not match the characters used
when translating UTC and the regular expression attempting to
match the timezone fails. Testing in prod wikis where this fails
such as ne.wikipedia.org shows it still works, it just generates
a more generic regular expression.
Since the overall process still works acceptably on the wikis outputting
warnings this patch just adds a guard to prevent the warning and does
not attempt to fix the underlying issue.
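The kind of guard meant here can be sketched as follows.
timezonePatternSketch() and both patterns are assumptions made up for
this example, not the code being patched. The point is only that the
match array is not touched when preg_match() finds nothing:

```php
<?php
/**
 * Hypothetical helper: build a regex fragment for the timezone part of a
 * formatted timestamp such as "12:00, 1 May 2014 (UTC)".
 */
function timezonePatternSketch( string $formattedDate ): string {
	// \w+ may fail on translated abbreviations, depending on script and flags.
	if ( preg_match( '/\((\w+)\)$/', $formattedDate, $m ) ) {
		return '\(' . preg_quote( $m[1], '/' ) . '\)';
	}
	// Guard: $m[1] does not exist here, so fall back to a more generic
	// fragment instead of reading an undefined index.
	return '\([^)]+\)';
}

$specific = timezonePatternSketch( '12:00, 1 May 2014 (UTC)' );
$generic = timezonePatternSketch( '12:00, 1 May 2014' );
```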
Bug: T76558
Change-Id: If8e1ddd2d642b042cc24c51d5ba5aa8b34bc9552
There were two different circumstances that could trigger Echo's signature
detection to fail: multibyte characters in signature, and signatures near
$wgMaxSigChars limit that expanded past the limit due to wfEscapeWikiText().
This patch adjusts to use mb_substr to appropriately handle the multibyte
characters, and adds a couple extra characters to $wgMaxSigChars to allow
for wfEscapeWikiText(). This isn't perfect, but a stricter implementation
would require much more work than I think we should spend here.
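For illustration, a short sketch of the multibyte problem with a
made-up signature string (assuming the source file is saved as UTF-8):
substr() counts bytes and can cut a character in half, while
mb_substr() counts characters.

```php
<?php
$signature = 'Zoë ✍ talk';

// "ë" is two bytes in UTF-8, so three bytes end in the middle of it.
$byBytes = substr( $signature, 0, 3 );
$byChars = mb_substr( $signature, 0, 3, 'UTF-8' );

$bytesAreValid = mb_check_encoding( $byBytes, 'UTF-8' );
$charsAreValid = mb_check_encoding( $byChars, 'UTF-8' );
```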
Bug: 73426
Change-Id: Ic51c2bc2a08600f188db13a9a0537f1321c9a655
Currently Echo attempts to find a signature by looking for a series of
strings starting with what it thinks are the current aliases of NS_USER
and NS_USER_TALK. This has shown to be error prone, see the linked bug
for how a change to ru.wikipedia.org/wiki/Mediawiki:Signature broke
mention notifications.
Patch switches things around to pull wikilinks out of the text and run
them through the Title class. The results of this parsing are checked
for NS_USER and NS_USER_TALK, giving a much stronger guarantee of finding
translated namespaces.
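A standalone sketch of the approach. Since the Title class is not
available outside MediaWiki, a hard-coded, hypothetical alias table
stands in for the namespace lookup. The function name and the link
regex are assumptions as well:

```php
<?php
// Hypothetical alias table standing in for what Title resolves on a real wiki.
const USER_NAMESPACE_ALIASES_SKETCH = [
	'user', 'user talk', 'участник', 'обсуждение участника',
];

/** Returns the names from all links that point into a user or user talk namespace. */
function extractUserLinksSketch( string $wikitext ): array {
	// Capture the link target, ignoring an optional |label part.
	preg_match_all( '/\[\[([^\]|]+)(?:\|[^\]]*)?\]\]/u', $wikitext, $matches );
	$users = [];
	foreach ( $matches[1] as $target ) {
		$parts = explode( ':', $target, 2 );
		if ( count( $parts ) !== 2 ) {
			continue; // no namespace prefix at all
		}
		$namespace = mb_strtolower( trim( str_replace( '_', ' ', $parts[0] ) ), 'UTF-8' );
		if ( in_array( $namespace, USER_NAMESPACE_ALIASES_SKETCH, true ) ) {
			$users[] = trim( $parts[1] );
		}
	}
	return array_values( array_unique( $users ) );
}

$found = extractUserLinksSketch(
	'[[Участник:Иван|Иван]] ([[Обсуждение участника:Иван|обс.]]) [[Main Page]] [[User:Alice]]'
);
```

The check is done on the parsed link target, not on the raw text in
front of it, so a changed signature format does not break it.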
Bug: 71353
Change-Id: Ib0d0f4e068339d2fd28761087c05f5a1acb3c1fc
General code cleanup as reported by the PHPStorm static code
analysis. I hope it's not a problem that I made a lot of very
different (but all very tiny) changes in a single patch. If you
want to merge this but you think it's better to split it into
several patches first, please tell me.
Change-Id: I2e2c4bb47f8d20e038d28e236e2ff813b30504af
The code was looking at the [0] element for the matched position
of timestamps, while preg_match returns it in the [1] element.
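For reference, a minimal sketch of the structure returned with
PREG_OFFSET_CAPTURE (the pattern and the text are made up): each
match is a pair of the matched string at [0] and its byte offset
at [1].

```php
<?php
$text = 'Posted at 12:34 (UTC)';
preg_match( '/\d{2}:\d{2}/', $text, $matches, PREG_OFFSET_CAPTURE );

$matchedString = $matches[0][0];   // the text that matched
$matchedPosition = $matches[0][1]; // where it starts in $text

// Everything before the timestamp.
$before = substr( $text, 0, $matchedPosition );
```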
Bug: 53132
Change-Id: Ibfd3f2b86b007f28f73a137defb80276fb830d28
Follows-Up: I6c636b055bcd25760aee848aea71fe4044c7e1be