Instead of outputting a <ref>'s HTML in both data-mw and in
<references>, output it only in the latter and point to it from
data-mw.body.id.
Also preserve data-parsoid for the <ref>'s text in <references>, as
that is now its only representation.
To correctly do html2wt when there are <ref>s inside <references>
we need access to the main document DOM when serializing, so also
ensure that env.page.dom is correctly set (it was only set in v2
before).
Updated test results and blacklist (some tests now pass).
Change-Id: I0fa7ad692585af19136909bfec39db9868b137c5
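The data-mw.body.id indirection described above can be sketched as follows. This is a minimal illustration, not Parsoid's code: the id string, the sample HTML, and both helper names are hypothetical, and only the "body.id" key comes from the commit message.

```python
import json

# Hypothetical store of ref bodies as they would appear inside <references>.
# The id format below is made up for illustration.
references_html = {
    "mw-reference-text-cite_note-1": "Some <i>cited</i> text",
}


def make_ref_data_mw(body_id):
    """Build a data-mw value that points at the ref body instead of embedding it."""
    return json.dumps({"name": "ref", "attrs": {}, "body": {"id": body_id}})


def resolve_ref_body(data_mw, bodies):
    """Follow data-mw.body.id to the HTML kept under <references>."""
    body = json.loads(data_mw).get("body", {})
    return bodies.get(body.get("id"))


data_mw = make_ref_data_mw("mw-reference-text-cite_note-1")
resolved = resolve_ref_body(data_mw, references_html)
```

Because the HTML lives in one place only, a serializer needs the whole document to resolve the pointer, which is why the commit also makes sure the main document DOM is available.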
Recent core update to gallery display includes srcset support; one of the
parser test cases for Cite uses a gallery and needs updating to match.
Bug: T64709
Change-Id: I6283415e2f7608d9a5c53bc94804fd95a79d3793
* Fixed parser test output of one test to add unique ids. Output for
other parser tests modified in f528f508 still needs fixing up;
that is left for a separate patch.
Change-Id: I04546c2a590930121d960239a1954b26771e9c80
* <references /> was not appearing on its own line and was
instead getting tacked onto the previous line of wikitext output.
* Change in blacklisted wt2wt test shows that the new output
is better.
Change-Id: Ie82401a3bc6082b733339e2456810b6b1c87529a
This patch emits a reflist for all ref groups that still have
<ref>s in them at the end of the document. Currently Cite.php only
does so for the default group. See also T88290.
On html2wt the missing <references> are added to the wikitext,
which makes the wikitext correct. Selser catches this if not part
of the edit.
Change tests to include an explicit <references /> tag, and add
one explicitly testing that they do get added. This last one
has to be blacklisted as the new <references /> tags don't appear
with selser.
Change-Id: I79af2c34481cadbf0d68d9571928979adf559b58
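The end-of-document behaviour described above (one reflist per group that still has unemitted refs) might look roughly like this. It is a sketch under assumptions: the function name and the dict-of-lists bookkeeping are hypothetical, and the default group is represented here by the empty string.

```python
import html


def missing_references_tags(pending_refs_by_group):
    """Return one <references /> tag per group that still has pending refs.

    pending_refs_by_group maps a group name ("" for the default group,
    an assumption of this sketch) to the refs not yet emitted in a reflist.
    """
    tags = []
    for group, refs in pending_refs_by_group.items():
        if not refs:
            continue  # nothing left to list for this group
        if group == "":
            tags.append("<references />")
        else:
            tags.append('<references group="%s" />' % html.escape(group, quote=True))
    return tags


tags = missing_references_tags({"": ["a"], "note": [], "lower-alpha": ["b"]})
```

Groups whose refs were already consumed by an explicit <references /> produce nothing, so documents that list every group themselves are unaffected.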
From prod error logs:
Undefined index: 0 in Cite_body.php on line 396
Undefined index: 1 in Cite_body.php on line 396
Undefined index: 2 in Cite_body.php on line 396
Undefined index: 3 in Cite_body.php on line 396
Undefined index: follow in Cite_body.php on line 396
Change-Id: Id727f2fd7e72d8c4ceb74fdac42885d5c030b4af
One particular case is that Cite.php considers a name and its
entity encoding equal, i.e. "a & b" === "a &amp; b". Added a new
test for this case, but blacklisted it on html2wt, wt2wt and
html2html due to a different problem with how Parsoid encodes
entities. This will be investigated separately, as a simple fix
could break unrelated cases.
Also updated tests and blacklist to the new ids.
Change-Id: I87637a1dc812a3a8f29327b9e6c0040b22a651c4
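A minimal sketch of the name comparison described above, assuming that decoding HTML entities on both sides before comparing is enough to reproduce the "a & b" vs. "a &amp; b" case. The helper name is hypothetical and this is not how Cite.php or Parsoid implement it.

```python
import html


def same_ref_name(a, b):
    """Treat two ref names as equal if they match after entity decoding."""
    return html.unescape(a) == html.unescape(b)


plain_vs_encoded = same_ref_name("a & b", "a &amp; b")
different_names = same_ref_name("a & b", "a and b")
```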
Also encode cite ids properly as now they can contain arbitrary
text. Change in blacklist due to this.
TODO: Investigate if it would be better to do this directly in
the tokenizer.
Change-Id: Ic112124e90d256d73a351d0d57fe3c7546fa065f
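To illustrate why ids built from arbitrary ref names need encoding, here is a sketch that percent-encodes the name before using it in an id. Both the "cite_note-" prefix and the choice of percent-encoding are assumptions made for this example; the commit does not say which encoding Parsoid applies.

```python
from urllib.parse import quote, unquote

ID_PREFIX = "cite_note-"  # hypothetical prefix, for illustration only


def encode_cite_id(name):
    """Percent-encode a ref name so it is safe to embed in an id attribute."""
    return ID_PREFIX + quote(name, safe="")


def decode_cite_id(cite_id):
    """Recover the original ref name from an id built by encode_cite_id."""
    return unquote(cite_id[len(ID_PREFIX):])


cite_id = encode_cite_id('a "b" & c')
```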
* Although this resolves the crashes, I'm unsatisfied with it as a
proper fix to the underlying issue. There are many places throughout
the codebase where we serialize and then parse document fragments
that should be instrumented to store and unpack data-* attributes.
Bug: T76518
Change-Id: Idca1b0a37ec924a71cb51160d000c7de9717d422
The coding conventions suggest avoiding ==,
and for this condition definedness is actually more relevant
than whether the string has any text, but since
the string can also be '0', checking for !$text doesn't work.
Similar to I15b422d3345bf4522e68a17dce9682ff28484559.
Change-Id: Ib823678b639bf4f1a92dffcd9e41c780b56ab128
The coding conventions suggest avoiding ==,
and for this condition definedness is actually more relevant
than whether the string has any text.
Change-Id: I15b422d3345bf4522e68a17dce9682ff28484559
* Currently, this mimics Cite.php behavior where "a b" and "a_b"
are considered identical ids.
* Added new parser test.
* Fixed output of another test.
* Fixed section name of a commented out test.
Change-Id: I0c51404c3e659bbddfe9a8909aa6a109d368b762
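The "a b" / "a_b" equivalence above can be sketched with a small normalizer. This assumes that mapping runs of whitespace to a single underscore is the only normalization involved; the function name is hypothetical and the real Cite.php rules may do more.

```python
def normalize_ref_name(name):
    """Map whitespace runs to underscores so "a b" and "a_b" share one id."""
    return "_".join(name.split())


spaced = normalize_ref_name("a b")
underscored = normalize_ref_name("a_b")
```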
In this function $text can be both false and empty string.
It is more intuitive to use a boolean operator here than
to rely on the fact that comparing to '' using == happens
to give the correct result.
Change-Id: I08248a3fcade7744287e9b9f3bc176d29ac1ecde
* This is both faster and consistent with how we're accessing other
parsoid attributes. It's also a step towards not having this data in
the html output.
* Changes to parserTests and the blacklist are for attribute order.
* Requires upgrading domino to 1.0.18
https://github.com/fgnass/domino/pull/48
Change-Id: I1edbc260887d480adf04763b15043c374e27cceb