64 commits

Author SHA1 Message Date
Thiemo Kreuz 28dd373d24 Move misplaced ParserFirstCallInit hook handler to CiteHooks
All other hook handlers are in the dedicated CiteHooks class.

Main motivation here is to make the huge Cite class smaller,
especially by removing static code that does not rely on anything
else the class does.

Bug: T236260
Change-Id: If0b3f6c989e44283428cda4b2c4d8d5303385d22
2019-10-25 10:34:35 +02:00
jenkins-bot 88266ade91 Merge "Refine some workflow related comments" 2019-10-24 13:07:55 +00:00
WMDE-Fisch 9196ccead7 Refine some workflow related comments
Change-Id: Ib7a6c4cc085d91fe27c96cbfd9c7035465149319
2019-10-24 14:38:46 +02:00
Adam Wight 5e8d48b331 Minimal support for bookreferencing tag
Allows the "refines" attribute when the feature flag is set, but doesn't
render.  This is part of our rollback strategy, so that we aren't left
with invalid wikitext in case of undeployment.

Bug: T236257
Change-Id: I936be0e62dccb46caeb84162d2c5166956fd9916
2019-10-24 12:24:36 +00:00
Thiemo Kreuz 3e2d1a23e0 Fix all PHPCS issues and add missing array type hints
* I used https://codesearch.wmflabs.org to make sure the private
constants are indeed not used anywhere.

* The added type hints are safe, as far as I can tell. There is no way
one of these parameters can contain anything else. Otherwise the code
would fail already.

Change-Id: Iaa7615e9864805760fa652700b58b69680b4f17e
2019-10-17 09:23:20 +02:00
Adam Wight 741f5dcdaa Bundle tracking with another RL module
This is slightly more efficient because it saves on early page-load
bandwidth.

Bug: T234605
Change-Id: If83420a9b4e654fd790e810fa82f922a8ba06e50
2019-10-10 10:44:50 +02:00
Adam Wight c12150082c Baseline reference interaction tracking
Collect EventLogging metrics for footnote and reference link
interactions, so that we can compare behavior with and without
Reference Previews enabled.

This tracking will be reverted once analysis is complete.

A mostly arbitrary sample rate of 1/1000 is hardcoded here.  This is
loosely based on the latest tuning of Popups sampling at 1/100,
divided by a conservative factor of 10 to ensure headroom.

The sample is skewed by skipping clients without sendBeacon support,
but we're avoiding the mw.track synchronous fallback, which injects an
image tag and introduces lag any time the user clicks external links
in the references.

Bug: T231529
Change-Id: Iad32b64114f88675eecbb01712418c968e3cf661
2019-10-01 10:23:31 +02:00
Derick Alangi 35e6993ffc Avoid usage of deprecated $wgContLang global (dep in 1.32)
Change-Id: Ib3972dc0a7ef3dad4db64e268b613dfafb18a242
2019-09-02 19:53:34 +01:00
WMDE-Fisch e0602be6a9 Remove warning about hard-coded class name
Core now uses the extension name to check if the Cite extension is
loaded. Therefore the class name could be changed in that regard.

Change-Id: Ibdc0725045f7a0b0afcbf6cb94ccdab9509ad672
Depend-On: I35e5aa9955141b575de68a5be2c0d5b87585eb77
2019-08-14 19:17:24 +02:00
James D. Forrester bceb94ca2a build: Upgrade phan-taint-check-plugin from 1.5.x to 2.0.1
Change-Id: I909d3cfa726d7b68b5a580baf00f6746d4689404
2019-07-10 16:31:30 +00:00
Kunal Mehta 45c01a6b78 Upgrade to newer phan
Bug: T216911
Change-Id: Ib228ac26a9a87c51a107407b6162110681b5e75c
2019-03-17 16:46:06 -07:00
Thiemo Kreuz 3f22189998 Fix <ref> ignoring all parameters when there are more than two
We can resolve this bug by either replacing the bogus "return false"
with the intended "return [ false, … ]". Or rely on the code a few
lines below that also bails out with a "return [ false, … ]" when too
many parameters ($cnt is not 0 then) are present. The tests prove both
solutions are equally valid.

Bug: T211576
Change-Id: Iadd55c134dede7042cfd152c69bc8f27b59d8912
2018-12-11 20:49:40 +01:00
Thiemo Kreuz (WMDE) 06b821a451 Rewrite private Cite::refArg for readability
The biggest issue with this code was that it was tracking the exact
same state in two ways: Processed array elements got removed from the
$argv array, *and* the $cnt was decremented the same time. This is not
necessary and a potential source of confusion and errors.

I carefully transformed the code. I'm sure it still does exactly what
it did before. The tests should prove this.

Change-Id: I642d38e7944aa3e2239179fa58e1e231b4698263
2018-12-11 17:58:19 +01:00
jenkins-bot 9e981d28b6 Merge "Sanitize underscores as core does, to not create broken links" 2018-12-11 00:01:57 +00:00
jenkins-bot c6e13db74f Merge "Simplify weirdly complex [\n\t ] regex" 2018-11-30 23:50:55 +00:00
Thiemo Kreuz (WMDE) 8760fe5e62 Rearrange Cite::listToText for performance
The two messages are not needed in case there is only one element.
Since fetching messages is a little expensive, I feel it's worth
rearranging this code.

Or not, because it seems this method is never called with one
element only.

Change-Id: Ie915278b41f053afe0d14a29d2aec54c98e5185e
2018-11-21 18:16:55 +00:00
jenkins-bot 1fe6947693 Merge "Use \d shortcut in regular expressions" 2018-11-21 18:13:00 +00:00
jenkins-bot 7c7f258aac Merge "Remove unused parameter from two private Cite functions" 2018-11-21 18:03:30 +00:00
Thiemo Kreuz (WMDE) 436453e758 Use \d shortcut in regular expressions
This does not change anything, but is only for readability.

Change-Id: I7970df426246c16f5418665c9af069c13e0b9933
2018-11-21 17:59:42 +00:00
jenkins-bot ea5c84fdaf Merge "Use more specific type hints in PHPDoc tags" 2018-11-21 17:44:55 +00:00
Thiemo Kreuz (WMDE) 7c06347fc7 Simplify weirdly complex [\n\t ] regex
This change does have two consequences:

1. A few more whitespace characters act as separators. This should not
have any consequence in real life situations, and is mainly done to
make the code easier to read and less surprising.

2. Sequences of two or more whitespace characters previously resulted
in partly *empty* results. This was a potential source of errors. The
additional + fixes this.

Change-Id: Ib58326109c740dd0cbd05d8fddb4af2145f232fe
2018-11-21 17:33:25 +00:00
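
The two consequences described in the commit above can be illustrated with a minimal sketch. The input string and both patterns below are assumptions chosen for illustration; the log does not show the actual regex or call site in Cite.

```php
<?php
// Hypothetical input with two consecutive spaces and a tab.
$input = "a  b\tc";

// Character class without a quantifier: every single whitespace
// character is its own separator, so the double space produces
// an empty element in the result.
$old = preg_split( '/[\n\t ]/', $input );

// \s+ treats any run of whitespace characters as one separator,
// so no empty elements appear.
$new = preg_split( '/\s+/', $input );

var_export( $old );
echo "\n";
var_export( $new );
echo "\n";
```

Note that \s matches a few more characters than [\n\t ] (for example carriage returns), which corresponds to the first consequence listed in the commit message.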
Thiemo Kreuz (WMDE) 0f76d79169 Use more specific type hints in PHPDoc tags
Change-Id: Ib0cf532fa51ddce135914edf357638a0862a200f
2018-11-21 18:19:25 +01:00
jenkins-bot 1bc1d1e075 Merge "Prefer "=== null" for consistency and readability" 2018-11-21 02:41:18 +00:00
Thiemo Kreuz (WMDE) 2b34dede6c Sanitize underscores as core does, to not create broken links
Core sanitizes link targets and removes double spaces and underscores.
But the corresponding id="…" attributes are not sanitized the same
way. This results in broken links. This patch is not perfect (two
references with name="a_b" and name="a__b" will conflict), but the
best solution I can think of at the moment.

Bug: T184912
Change-Id: I9dbc916ad99269517d84c8ffb8581628d44a9f4e
2018-11-20 13:07:35 +01:00
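
The conflict mentioned in the commit above can be sketched as follows. The helper name normalizeRefKey and its pattern are hypothetical; they only approximate the idea of collapsing spaces and underscores and are not the code used by Cite or core.

```php
<?php
// Hypothetical helper, not the actual Cite or core implementation.
function normalizeRefKey( string $name ): string {
	// Collapse any run of spaces and underscores into a single underscore,
	// so that the id="…" attribute matches a sanitized link target.
	return preg_replace( '/[_ ]+/', '_', $name );
}

$first = normalizeRefKey( 'a_b' );
$second = normalizeRefKey( 'a__b' );
$third = normalizeRefKey( 'a  b' );

// All three names end up with the same key, which is the conflict
// the commit message accepts as a known limitation.
var_export( [ $first, $second, $third ] );
echo "\n";
```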
Thiemo Kreuz (WMDE) 30c6323e82 Prefer "=== null" for consistency and readability
Change-Id: I59a7d072ec859f59ff6692d7fa79708ec184322e
2018-11-20 11:48:19 +01:00
Thiemo Kreuz ee8da566e3 Highlight backreference jump marks by making them bold
The separate "ext.cite.a11y" module is kept for (temporary)
compatibility with cached HTML, and should be removed in about
a month.

Browser tests will be added in a separate patch.

Bug: T205270
Change-Id: I26fe41c328157233cc5b06d38d2ba0f7b036a853
2018-11-19 16:46:08 +01:00
Thiemo Kreuz (WMDE) a8da30d4fd Remove unused parameter from two private Cite functions
The idea is to make the code simpler and easier to read. If no
code uses this feature, all it does is make the code unnecessarily
complex.

Change-Id: I22747712a691443a29b57831d3a6926275ad986b
2018-11-19 16:28:24 +01:00
Pipix 978816149c Convert HTTP Links To HTTPS
1 link to www.mediawiki.org is converted from http:// to https://

Bug: T189687
Change-Id: I24e08db41b4d46abea356cd05c37916f740a1d7a
2018-10-30 20:10:17 +08:00
Gergő Tisza b8efcb0e1a Unstrip <ref> contents before comparing
When used as {{#tag:ref|...}}, references can contain strip markers,
and different strip markers can hide the same text, so unstrip before
comparing to avoid false warnings.

Bug: T205803
Change-Id: I059fd853d1eea07aa06cc85f80e463dd97fd171a
2018-10-15 12:10:59 -04:00
jenkins-bot 1c67723bfe Merge "Make Cite pass phan-taint-check" 2018-09-13 03:04:20 +00:00
Brian Wolff 9ce0be6a78 Make Cite pass phan-taint-check
Because of how arrays are handled, phan-taint-check thought all
return values from refArg() were escaped, where really only $dir
was. We also split the error method into the parse and noparse
case as separate functions so that phan can better analyse these calls.
In linkRef() we suppress the double escaping as the escaping used
is appropriate for inserting into wikitext.

Bug: T195009
Change-Id: I3e04c8cceae727e5470d4ae4fdb2404639f9bf33
2018-09-13 01:10:59 +00:00
Ed Sanders 3a2b025e07 Convert bugzilla numbers to phab task numbers
Change-Id: I30e8c8d9eaff47185a61a093787cdfd25b3889d8
2018-09-12 16:48:17 +00:00
Umherirrender 39aa50cb80 Remove @static doc annotations
@static is intended for use only when the language does
not support the concept of static methods natively

Change-Id: I9a0bf7db493d5667b22508e65a34034cefdbcbfa
2018-09-10 16:24:40 +00:00
Arlo Breault 1d687e23f3 Use the dir parameter only from the full definition of a named ref tag
Bug: T196827
Change-Id: Iaf84966e37cea730c9eca07c19a555971ffeadf3
2018-08-22 19:31:23 -04:00
Timo Tijhof e86ffeba3a Deal with <references/> inside a <ref> in automatic references list
The Cite extension already had a recursion guard around the parsing of
`<references/>`, to prevent another `<ref>` containing `<references/>`
from producing a weirdly nested references list.

When an explicit `<references/>` tag is not included in the page, or
`<ref>` tags exist after the last explicit `<references/>`, the extension
automatically adds a reference list at the end of the page, so that the
references are still displayed.

This automatic references list creation was bypassing the recursion
guard, causing the weirdly nested output *and* a PHP Notice from
`mRefs[$group]` becoming undefined. This commit sets the recursion guard
state during that automatic references list creation to prevent this.

Bug: T182929
Change-Id: I87737dcf39a4fc15e119a1090a9c34d6b9633c21
2018-07-25 15:39:20 +00:00
Umherirrender 2e4222bd04 Remove reference to archived InlineEditor
Change-Id: Ie549357942a321962e5886c767d36975074f6cb6
2018-05-17 19:24:39 +02:00
Thiemo Kreuz 0fe9dbb366 Don't expect objects by reference in hook handlers
The motivation for this patch is to make the code less complex, more
readable, and less brittle.

Example:

public function onExampleHook( Parser &$parser, array &$result ) {
    /* This is the hook handler */
}

In this example the $result array is meant to be manipulated by the
hook handler. Changes should become visible to the caller. Since PHP
passes arrays by value, the & is needed to make this possible.

But the & is misplaced in pretty much all cases where the parameter is
an object. The only reason we still see these & in many hook handlers
is historical: PHP 4 passed objects by value, which potentially caused
expensive cloning. This was prevented with the &.

Since PHP 5, objects are passed by reference. However, this did not
make the & entirely meaningless. Keeping the & means callees are
allowed to replace passed objects with new ones. The & makes it look
like a function might intentionally replace a passed object, which is
unintended and actually scary in cases like the Parser. Luckily all
Hooks::run I have seen so far ignore unintended out-values. So even if
a hook handler tries to do something bad like replacing the Parser
with a different one, this would not have an effect.

Removing the & does not remove the possibility to manipulate the
object. Changes done to public properties are still visible to the
caller.

Unfortunately these & cannot be removed from the callers as long as
there is a single callee expecting a reference. This patch reduces the
number of such problematic callees.

Change-Id: Ib3a9da257b50326d569ab1973b523c952963c16b
2018-05-17 17:09:55 +00:00
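
The difference between manipulating a passed object and replacing it, as described in the commit above, can be sketched like this. FakeParser and the three functions are hypothetical stand-ins written for illustration; they are not MediaWiki classes or hook handlers.

```php
<?php
// Hypothetical stand-in class, not MediaWiki's Parser.
class FakeParser {
	public $label = 'original';
}

// No &: changes to public properties are still visible to the caller.
function mutate( FakeParser $parser ) {
	$parser->label = 'mutated';
}

// No &: assigning a new object only rebinds the local variable.
function replaceWithoutReference( FakeParser $parser ) {
	$parser = new FakeParser();
	$parser->label = 'replaced';
}

// With &: assigning a new object also rebinds the caller's variable.
function replaceWithReference( FakeParser &$parser ) {
	$parser = new FakeParser();
	$parser->label = 'replaced';
}

$p = new FakeParser();

mutate( $p );
$afterMutate = $p->label;

replaceWithoutReference( $p );
$afterValueReplace = $p->label;

replaceWithReference( $p );
$afterReferenceReplace = $p->label;

var_export( [ $afterMutate, $afterValueReplace, $afterReferenceReplace ] );
echo "\n";
```

Only the variant with & lets the callee swap out the caller's object, which is the behavior the commit message calls unintended for parameters like the Parser.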
Thiemo Kreuz 8a42f61697 Remove all default "return true" from all hook handlers
This has been the default for many years now. Returning true does nothing. It's
identical to returning nothing (null). The only meaningful value a hook
handler can return is false, and even this is meaningful only for very
few hooks.

TL;DR: A "return true" in a hook handler is always meaningless, dead code.

I'm interested in this because we (WMDE) might start working on this
extension soon and I want the code to be small and easy to maintain.

Change-Id: If4f32a55cdc38a3cc8af286d1cca7c0089bbfc43
2018-05-15 10:43:23 +02:00
jenkins-bot cfd18814be Merge "Support directionality for reference" 2018-05-02 15:59:33 +00:00
Eranroz 1ca27aa0d8 Support directionality for reference
Adding option for dir attribute in ref tags. The value must be a valid
direction ('ltr' or 'rtl', case insensitive) or the direction will be
stripped out.

The directionality of the li element is set using a CSS class accordingly.

Bug: T15673
Change-Id: Iff480bc8cc4f81403b310e8efecd43e29d1d4449
2018-05-02 17:27:32 +02:00
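
A minimal sketch of the validation rule described in the commit above, assuming a hypothetical helper named normalizeDir that returns null when the value is stripped out. The actual implementation in Cite is not shown in this log and may differ.

```php
<?php
// Hypothetical helper; the name and the null-for-invalid convention
// are assumptions made for this sketch.
function normalizeDir( ?string $dir ): ?string {
	if ( $dir === null ) {
		return null;
	}
	// The comparison is case insensitive, so lower-case first.
	$dir = strtolower( $dir );
	// Only 'ltr' and 'rtl' are valid; anything else is dropped.
	return in_array( $dir, [ 'ltr', 'rtl' ], true ) ? $dir : null;
}

$upper = normalizeDir( 'RTL' );
$plain = normalizeDir( 'ltr' );
$invalid = normalizeDir( 'up' );

var_export( [ $upper, $plain, $invalid ] );
echo "\n";
```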
MGChecker 5ca090c67d Clean up backwards-compatibility code
Since Cite requires 1.25+ now, the checks for PPFrame::setVolatile(),
which was introduced in 1.24, can be removed.

Change-Id: I91df2e91b2f7a21b2b1147aa6af194980527f86b
2018-04-18 22:26:54 +02:00
Thiemo Mättig bbc1f2c91d Use standard form for @license tags
See https://spdx.org/licenses/

Change-Id: Ic091ebc3844abcd6de90b3241382fb4732200a6d
2018-03-20 03:18:37 +00:00
Tim Starling db85682b63 Remove failed experiment $wgCiteCacheReferences
This was briefly enabled in WMF production in 2009 and found not to work.
As far as I know, it's been disabled since then. Retaining it requires
maintaining the complex "half-parsed serialization" feature in the core
parser, which I'm deprecating in I838d7ac7f9a218. The core feature was
added solely to support this Cite caching experiment and is not used for
anything else.

Change-Id: I446e0c46913a390dbdf7b49b84040bf47ed6c2f9
2018-02-28 21:04:42 +11:00
Kunal Mehta 1e6ff5c2fc Address PhanUndeclaredClassMethod warning
Don't use the \Database alias, use the namespaced version when calling
Database::getCacheSetOptions.

And document why the remaining issue is suppressed.

Change-Id: I80a102f2e82efedcfa999d8e714bfe049263ffeb
2018-01-03 16:34:01 +00:00
libraryupgrader 1ca3dd57be build: Updating mediawiki/mediawiki-codesniffer to 15.0.0
Change-Id: I11e6d584932dbde52fc5e5d463029270976a47df
2017-12-29 23:28:16 +00:00
Thiemo Mättig 9c8ed18938 Remove some obscure comments
A good bunch of these comments literally repeat what the code already
says.

Change-Id: I9c128f748971bf20a61a85ed57d3261d27c465f0
2017-12-29 12:21:53 -08:00
Phantom42 67ed343ecc Add phan configuration for static analysis
Bug: T179554
Change-Id: I2bfd52c08aac1aa8f34e0664e6314835f79a0324
2017-12-29 11:50:01 -08:00
Umherirrender 817c8a95bd Change typehint from DatabaseBase to IDatabase
Change-Id: I34bde9717d7799406dd9a30e8f9b610da53f374c
2017-12-22 21:28:29 +01:00
Max Semenik 351a08d1b7 Don't break when reference names contain []
Bug: T29694
Bug: T179544
Depends-On: I189bdefbc9034cf8d221a89d7158195de1c0fa6c
Change-Id: Iec3439f76ecc2a3543b30b35f8735c92b0cfb711
2017-11-15 23:23:45 +00:00
Alexander Mashin 3023f55605 T177134: Nulls passed to preg_match in Cite
Line 334 of Cite/includes/Cite.php contains two preg_match() calls. Their subject strings are produced by Cite::refArgs() and are set to null or false when no name or follow attribute is provided in the <ref></ref> tag.

However, preg_match() is supposed to accept only strings as its subject, and nowhere in the documentation does it say that the subject is nullable.

At least in HHVM 3.12, this causes an exception.

The enclosed patch adds simple checks making sure that preg_match() is not called when $key or $follow is null or false.

Change-Id: I3e00d31d6bf216271ace7e851d88c68c4fd5ed00
2017-10-11 17:01:37 +00:00
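
The kind of guard described in the commit above might look like the following sketch. The helper name isNumericRefValue and the pattern are assumptions made for illustration; the log does not show the code at the line the commit message refers to.

```php
<?php
// Hypothetical helper; the name and the pattern are assumptions.
function isNumericRefValue( $value ): bool {
	// Only strings are handed to preg_match(); null and false
	// short-circuit before the regex engine is called.
	return is_string( $value ) && preg_match( '/^\d+$/', $value ) === 1;
}

$results = [
	isNumericRefValue( null ),
	isNumericRefValue( false ),
	isNumericRefValue( '12' ),
	isNumericRefValue( 'note' ),
];

var_export( $results );
echo "\n";
```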