Now the brackets can be hidden with `display: none;` or made
invisible but appear in copy-pasted content with `font-size: 0;`.
Bug: T370512
Change-Id: I21accb6bf9cdd7f96144cc9a762ff889e48efca4
We want it to be possible to turn a <ref> into an extended one after
it was re-used for the first time, not knowing if it later turns out
to be an extended ref.
This should work: <ref name=x/> is a short re-use of a ref that later
turns out to be a <ref extends=… name=x>…</ref>.
<ref name=x></ref> is just another syntax that should behave
identical.
However, it should probably not be possible to turn
<ref name=x>foo</ref> into a subref later because it really, really
looks like a normal ref. Even if the content matches with a later
<ref extends=… name=x>foo</ref> and we usually ignore identical
content, I suggest to block this with a dedicated error message. But
this is for a later patch. This patch here just documents the status
quo.
This patch also contains minor code cleanups that will be useful in
Ia752a7d.
Bug: T367749
Change-Id: Ie38769b36e5c476b96e7af7f03b0fc800b32ba97
This patch adds 'mw-cite-backlink' to the linkback span for both
named and unnamed refs. This requires us to add a span wrapper
for the unnamed refs case.
Verified in local testing that this causes aria attributes to be
added to the linkback tags in Parsoid HTML.
This should likely fix other gadgets and code that rely on this
class name to do their work.
Strictly speaking, this is a breaking change since we add an
extra span wrapper for the unnamed ref backlinks which *could*
break anyone using a li > a[rel="mw:referencedBy"] selector.
But, given the specificity of the a[rel] selector, the "li >"
part is unnecessary and might not be used. So, if we wanted to
push our luck (and break process), we could get this in.
Alternatively, we could:
- do this in the the read views OutputTransformPipeline.
- do a real major version bump -- we would be exercising that
functionality and have to fix and implement any missing pieces
that may have broke as part of the RESTBase sunsetting.
- not add the span wrapper and fix gadgets to explicitly look for
both named and unnamed refs with their selectors.
Bug: T328695
Change-Id: Icbd325ebd12cb42186c5b5220dc016835eb18b64
This adds the 'reference-text' class where Parsoid added
'mw-reference-text'.
If we don't care about the "mw-" prefix, since there are very
few wiki and code references to 'mw-reference-text', it might
seem like we could update all those references and rip out
'mw-reference-text' from Parsoid output.
But, Parsoid HTML is also exposed via the REST API which means
there are likely many users out there analyzing Parsoid HTML.
https://github.com/search?q=%22mw-reference-text%22+NOT+language%3AHTML&type=code
says there are 512 references to this string - so looks like
we are probably going to rely on a major HTML version bump in
Parsoid in the future and then rip out all the duplicate
classes (mw-ref, mw-references, mw-reference-text OR
reference, references, reference-text).
Bug: T328695
Change-Id: I04b18ac75863a0e3e61bdd47b34508e5547dc872
There is currently no test coverage for recursively parsing the
contents of a <ref>…</ref> together with an incomplete follow="…".
This is critical because that's an entirely separate, special code
path (the one that creates a <p> instead of an <li>). Without this
test we could return unparsed wikitext and never notice.
I discovered this while playing with I0b0e358.
Bug: T245549
Change-Id: Ie65c6bf6bf75db26e0fff733c93cfa28ee7bd228
The message was part of the original patch that introduced the group
feature in 2009, see https://phabricator.wikimedia.org/rECIT75004e33.
Notice how there was never a test scenario for this message. A test
was added in 2020 via I07738cc.
The message appears only in a rare edge-case when a group is entirely
unused in the text, and only when the group is not empty. The shortest
possible example is:
<references group=g>
<ref group=g name=a>a</ref>
</references>
Just adding something unrelated like `<ref group=g>x</ref>` to the
text changes the error message. Now the group is "used". But this
notion is confusing to begin with. References can be part of a group,
and we can use references, but we can't use groups as if they are a
separate entity.
A better error message already exists.
Notice how this special error message doesn't appear anywhere in the
Parsoid code path. That was already using the other, more generic
error message.
Bug: T269531
Change-Id: I63f663d76e45e6c3d664f145d9a564ee00ff53cd
This is about the error message that currently says:
»Cite error: <ref> tag with name "a" defined in <references> has
group attribute "" which does not appear in prior text.«
This is a special error message that appears only when a group name
does not appear anywhere in the text. In all other cases a simpler
error message is shown:
»Cite error: <ref> tag with name "a" defined in <references> is
not used in prior text.«
While the first error message is not wrong in the edge-case
described in T269531, it's very confusing for a multitude of
reasons. For example:
* There is no group attribute in the wikitext.
* Just adding something completely unrelated like `<ref>x</ref>` to
the text shows the other error message.
The reason for this behavior is that the assumed default is an empty
`group=""`. The error message changes the moment any other <ref> in
the same group appears in the text vs. when the group is entirely
unused.
We can probably remove this error message entirely, but should at
least not use it when there is no group.
Notice how the Parsoid code path was already using the other error
message.
Bug: T269531
Change-Id: Ifa2e97254f4cda72233a057d8760fb1116143552
* This is to worka round some confusing html2html failures in CI
for these tests (that are not reproducible locally for me).
Change-Id: I07725155ef5e04eb4346a90c34cbacbd70e88ea6
Intentionally no other change is made (yet). This is for a later,
separate patch.
Intentionally not touching the huge list of per-language
ext.cite.style.*.css files for the moment. Again, I would prefer to
do this in a separate patch.
Change-Id: I4e392c7bd1c69849a6c7946676a64c749ddbcd60
This commit also moves certain parser tests involving <ref> from
the Parsoid repo to citeParserTests.txt in this repo.
Bug: T354215
Change-Id: Ie5b211d2af01a56684473723c68a9ab2775542e3
Some interesting stuff is happening, seems to have revealed bugs:
* Rolled-back warnings are still present on the ref
* Subref reuse numbering starts at 0 instead of 1, and formatting is cringe.
But subref rollback does seem to work!
Change-Id: If6321b34d27370553ba85e63dd1e2ae6a3b7c099
In this case, there was never a ref with this name in the article so
no backlinks should be rendered.
TODO:
* test case with empty parent backlink and LDR parent
Bug: T353451
Change-Id: I8a7abd05a48ce83da3beb92b15e894d53252bd33
This is another improvement after I7390b68. Status objects are made
to keep track of multiple errors. The only difference is: The merge
method skips duplicates when the message and all parameters are
identical. This causes a minor user-facing change. One of the
shortest possible examples is:
<references>
<ref />
<ref />
</references>
This showed two identical, indistinguishable error messages before,
but will only show one now. We argue this is fine. The duplicates
are confusing and of (almost) no value to the user. In case the
information is relevant the correct solution is to make the error
messages distinguishable, or introduce a message like "multiple
<ref> tags defined in <references> have the same error". This is
something for a later patch, if needed.
Bug: T353266
Change-Id: I444105462ed24d5ba37b057622b4dc847b40f8d8
This fixes a minor issue introduced in I294b59f. Two identical
dir="…" with different capitalizations should not be reported as an
error.
Turns out the implementation in the Cite extension doesn't care
about this capitalization at all. That's why I suggest to do the
normalization as early as possible. This is slightly different in
the Parsoid implementation.
Bug: T202593
Change-Id: I96b4a281d6020d61d1f36ec027cf833bbb244f03
* Since Cite development happens in two repos (here and the Parsoid
repo), integrated tests ensures that changes don't fall too far
out of sync.
CI runs Parsoid-integrated-mode tests in extensions repo with the
vendor-released Parsoid.
Parsoid CI runs Parsoid-standalone-mode tests in the Parsoid repo
which also has a copy of the citeParserTests.txt file found here.
But, that CI run uses the Parsoid patch itself.
This difference makes for unnecessrily laborious test syncing
while making changes to the two repos. It is manageable for one-off
changes but when making lots of updates that changes tests a lot,
this quickly becomes painful.
* For now, we can break this coupling temporarily by disabling
Parsoid-integrated-mode test runs. This simplifies the test syncing
by letting patches in Cite repo to be merged in a chain and then
doing a single test sync to the Parsoid repo (otherwise, Parsoid's
CI will be broken since the html/php sections in Parsoid's cite
test copy will be out of date).
* Filed T354215 to move Parsoid's Cite implemntation to this repo
which eliminates this complexity altogether.
Change-Id: Id5727381b0e23058d098180c308797b2555ad02f
This classifies as a "warning" because we still show everything,
just with an error message appended.
Disabling the Parsoid tests right away hopefully makes it easier to
do the same change in Parsoid.
Bug: T202593
Depends-On: If14acd1070617ca8c4d15be6b1759bd47ead4926
Change-Id: I294b59f989f553932b40d08308906dd72d92d2cd
* This now aligns with Parsoid commit 0fab92ba453d424aedeadaaa9e1514c42bbd94d1
* Disabled the newly added tests because that Parsoid fixes for the
tests haven't been released to vendor to let CI pass these tests.
* Re-enabled a previously disabled test.
Change-Id: I4ab87d2d486b7a1fef652c50c4f1e79ddfe83ce6
This reverts commit b163add15b.
Reason for revert: This was my mistake. I forgot that reverting this
would break Parsoid CI once the Parsoid Cite patch merged. So, I have to
wait till the Parsoid Cite change is released to vendor before I sync
the test change here.
Change-Id: Icaecee1e56907980681aae01be377b6906bd93a6
I played around with a few options (see patchset 1) but ended
introducing new terminology:
* "Backlink" describes the ↑ button down in the list of <references>
that jumps back up into the article. The code was already using
"backlink" in some places.
* "Backlink target" is the id="…" attribute up there, visible as the
typical [1] in the article.
* I use "jump" to describe the idea that clicking the [1] jumps down
to the full reference.
* "Jump target" is the id="…" down there in the list of <references>.
* "Jump link" is the same id, but encoded to be used as the href="…"
attribute when clicking the [1].
I hope this makes sense. Suggestions welcome.
Another benefit is that "normalization" is really only normalization
now, not any URL and/or HTML encoding.
Bug: T298278
Change-Id: I5a64ac43aef895110b61df65b27f683b131886fb
The "parser marker" placeholders are case-sensitive, e.g. for a tag
that's written like <rEf> the placeholder will also say …-rEf-…. This
was really just a mistake.
The error is as old as this code is. Added in commit 75004e33 in
2009.
Note we shouldn't use /i at the end because the marker itself should
not be case-insensitive. Only the tag name.
Instead of adding more (slow) test cases I update two that are
exactly about this part of Cite (nested tags) anyway.
Bug: T64335
Change-Id: I44c7a42a0da682a1082952fd1af817bf7d45378c
This error message really always meant nothing but "there is an
unknown parameter in your <ref> tag". It's unnecessarily confusing
only for historical reasons. See T299280#9384546 for a long
explanation.
Bug: T299280
Change-Id: Ic224d5828f7b7ac0928c44f526c61654ccf3425e
Note how this currently behaves. The user input is
<ref name="… …">
But what we get in the end is
<li id="… …">
This implies that the is decoded and re-encoded with a
slightly different entity encoding. (Note that and  
and   are all the same character.)
Also note how there is only an underscore in the href="…", but the
non-breaking space is gone. This is identical to what happens in
links and headlines. Try for example [[a _a]]. Multiple
underscores, non-breaking spaces, and normal spaces will be
normalized. We just do the same in the id="…" attributes.
Note this fixes only one of the issues listed in T298278.
Bug: T298278
Change-Id: Ia01f2fdd3b3e9ee6aaa9da60ca3386dcd5d6b1a0
This parser test is a bit obscure, in my opinion. We added it in
I8c4de96 to make sure we don't get thousand separators in most
places.
We continued reworking the code since then. By now it's effectively
impossible to "accidentally" get thousand separators. The
problematic methods from the Language class are not even accessible
any more from this code.
To make the tests more robust we now use createNoOpMock (done via
the previous patch) where it matters, specifically for all Language
and Parser mocks. This proves the problematic Language methods are
never called.
Bug: T253743
Bug: T238187
Change-Id: I9bfe1f4decfaf699996da63e19473c2c0d581d9d
html/php sections are added since otherwise it complains that the
"Test lacks html or metadata section on lines"
Change-Id: Ib1c47be09bdbe1e84b595373ad71772f2a983fc9