I played around with a few options (see patchset 1) but ended
introducing new terminology:
* "Backlink" describes the ↑ button down in the list of <references>
that jumps back up into the article. The code was already using
"backlink" in some places.
* "Backlink target" is the id="…" attribute up there, visible as the
typical [1] in the article.
* I use "jump" to describe the idea that clicking the [1] jumps down
to the full reference.
* "Jump target" is the id="…" down there in the list of <references>.
* "Jump link" is the same id, but encoded to be used as the href="…"
attribute when clicking the [1].
I hope this makes sense. Suggestions welcome.
Another benefit is that "normalization" is really only normalization
now, not any URL and/or HTML encoding.
Bug: T298278
Change-Id: I5a64ac43aef895110b61df65b27f683b131886fb
This moves the actual parsing down to be done much later in the
process. This won't make any difference in production but makes it
easier to refactor the code further.
Note I tried to use a StatusValue object but couldn't because it
merges seemingly identical messages, while the plain array is fine
with containing duplicates. There is one parser test that covers
this. While we could change this it needs discussion and most
probably a PM decision.
Change-Id: I7390b688a33dace95753470a927bbe4de43ea03a
The "parser marker" placeholders are case-sensitive, e.g. for a tag
that's written like <rEf> the placeholder will also say …-rEf-…. This
was really just a mistake.
The error is as old as this code is. Added in commit 75004e33 in
2009.
Note we shouldn't use /i at the end because the marker itself should
not be case-insensitive. Only the tag name.
Instead of adding more (slow) test cases I update two that are
exactly about this part of Cite (nested tags) anyway.
Bug: T64335
Change-Id: I44c7a42a0da682a1082952fd1af817bf7d45378c
Two problems:
1. Manipulating globals directly affects all following tests. They
are not independent from each other. This problem can be seen in
CiteTest.
2. Some test cases in testValidateRef don't test what you think.
For example, the test for a conflicting "extends" + "follow" was not
failing because of the conflict but because "extends" was disabled
and disallowed.
Change-Id: Iaa4e1f3f3222155d59984e577cba3f0b8dec40c3
This error message really always meant nothing but "there is an
unknown parameter in your <ref> tag". It's unnecessarily confusing
only for historical reasons. See T299280#9384546 for a long
explanation.
Bug: T299280
Change-Id: Ic224d5828f7b7ac0928c44f526c61654ccf3425e
Note how this currently behaves. The user input is
<ref name="… …">
But what we get in the end is
<li id="… …">
This implies that the is decoded and re-encoded with a
slightly different entity encoding. (Note that and  
and   are all the same character.)
Also note how there is only an underscore in the href="…", but the
non-breaking space is gone. This is identical to what happens in
links and headlines. Try for example [[a _a]]. Multiple
underscores, non-breaking spaces, and normal spaces will be
normalized. We just do the same in the id="…" attributes.
Note this fixes only one of the issues listed in T298278.
Bug: T298278
Change-Id: Ia01f2fdd3b3e9ee6aaa9da60ca3386dcd5d6b1a0
This patch makes only sense together with I5a64ac4 where it is split
from. See I5a64ac4 for details.
The idea is that this patch just re-arranges the code without making
any changes to how the code behaves. This leaves a minimal change
behind that's much easier to revert, if needed.
Bug: T298278
Change-Id: Ie78313b7f3ac1ec7bce5ac7512e60a3bb011480a
This patch does two things:
1. The "normalization" function was never only doing normalization,
but also all the necessary HTML encoding. This is now more visible
and split into two separate functions.
2. To make this easier we change the order slightly. Because of this
the normalization step must now consider spaces. Before spaces have
been converted to underscores by escapeIdForLink.
The results are all the exact same as before.
This is split from I5a64ac4 to make that easier to review.
Bug: T298278
Change-Id: I9435a2ddaa21559e29587c58b7523103141467f7
User-options related classes are being moved to
the MediaWiki\User\Options namespace in MediaWiki Core;
reflect that change here.
Bug: T352284
Depends-On: I42653491c19dde5de99e0661770e2c81df5d7e84
Change-Id: I22ff2effcf9b7f2162f5d57608d8ec3651b48dd7
User-options related classes are being moved to the MediaWiki\User\Options namespace in MediaWiki Core; reflect that change here.
Bug: T352284
Depends-On: I9822eb1553870b876d0b8a927e4e86c27d83bd52
Change-Id: I04337d15d32c1e8bdfdcfa272f1750ffecc8e47e
This uses semantics consistent with the DOM spec (and any future
spec-compliant PHP DOM library, like Dodo or a native one bundled
with PHP). It also helps avoid certain classes of errors caused
by conflating empty-string values and missing attributes.
A number of minor bugs have been fixed in the process, mostly
involving the string value "0" (and similar) which are falsey
and thus were sometimes treated the same as a missing attribute.
The API for MediaStructure was slightly tweaked to conform to
this "return null for missing attribute" convention.
No one outside Parsoid appears to use MediaStructure yet:
https://codesearch.wmcloud.org/search/?q=MediaStructure
And Parsoid itself doesn't usethe MediaStructure::get* methods
which were tweaked here.
This patch squashes (and replaces):
Ib272bd2a9a2da06efc4a4d962cd191107a70f84c
Iae47c9f3c80f1642426ef985f1af6b44a9c807fb
Ic3a43516e2a0c9bbd62f9519e5663b545fd975e7
I7c45552a21ba49c76b1fc56f023d7937d9f41340
Ic231a53dba9527c6b811cba990ae35f7fc7bf153
I5209338d221ebf738a505d85fe1a019ea621fcc5
Icb6b13c2b49edb3cd725999f09f8cb7e3a4c0f99
I7eca3e616a66cac5b2a792435889cced2e2c9cd9
Change-Id: I11e7efb546c8cf1aac6b49c3361aacbd4eb5cef1