* Process <ref> and <references> tags on the top-level DOM only
and skip the generateRefs pass when processing other content.
* This required a few fixes:
- ensure that DOMPostProcessor knows about the top-level.
- ensure that DOMVisitor knows about the top-level.
- ensure that the cleanup pass leaves behind the ref-marker metas
in DOMs from non-top-level content.
- process nested references content.
* One of the references tests had incorrect parsed output. That test
has been updated to reflect the correct output from this patch.
* References on the Barack Obama page now seem to be numbered correctly.
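The top-level-only rule described above can be sketched roughly as follows. This is a minimal illustration, not Parsoid's actual code: `Env`, `generate_refs`, `post_process`, and the dict-based nodes are all hypothetical stand-ins for the DOMPostProcessor pass and its state.

```python
from dataclasses import dataclass, field


@dataclass
class Env:
    """Hypothetical per-parse state shared by all pipelines."""
    refs: list = field(default_factory=list)


def generate_refs(nodes, env):
    """Number every ref marker in document order (hypothetical pass)."""
    for node in nodes:
        if node.get("type") == "ref-marker":
            env.refs.append(node["content"])
            node["index"] = len(env.refs)


def post_process(nodes, env, top_level):
    """Run the ref pass only when handling the top-level DOM."""
    if top_level:
        generate_refs(nodes, env)
    # Nested content keeps its ref markers untouched so that the
    # top-level pass can number them later, in document order.
    return nodes


env = Env()
nested = post_process([{"type": "ref-marker", "content": "a"}], env, top_level=False)
page = post_process(nested + [{"type": "ref-marker", "content": "b"}], env, top_level=True)
```

Because the nested call leaves its marker in place, numbering happens exactly once, on the full page.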
Change-Id: I5465721d2fc715f2168f267e773a446bc37d198b
* Keep track of table nesting in token stream patcher and use it to
convert <td>, <tr>, and <th> tags to plain strings.
* This fix is only enabled on the top-level token stream.
To support this, fixed the resetState function in the parser
construction code to pass in a toplevel flag which lets the
token stream patcher know the context it is in.
* Fixes 29 tests (across wt2html, wt2wt, html2html, and selser modes)
and improves the results of 1 previously blacklisted test. The
failing selser test is actually a false failure because selser is
more accurate than non-selser wts.
* Consolidated a few separate tests into a single test that covers
all this functionality.
- This new test fails in wt2wt and html2wt modes because the
serializer uses tokenizer information, which continues to return
table tokens and results in <nowiki> wrappers.
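As a rough illustration of the nesting-tracking idea, the sketch below walks a token stream, counts open tables, and rewrites cell and row tags as literal text. It assumes the conversion applies only to tags that appear outside any table, which the message above does not spell out. The `(kind, name)` tuple format and the `patch_table_tokens` name are hypothetical, not the token stream patcher's real interface.

```python
def patch_table_tokens(tokens, top_level=True):
    """Turn stray <td>/<tr>/<th> tokens into plain strings.

    Tokens are hypothetical (kind, name) tuples, where kind is
    "open", "close", or "text".
    """
    if not top_level:
        # The fix is only enabled on the top-level token stream.
        return list(tokens)

    cell_tags = {"td", "tr", "th"}
    depth = 0  # current table nesting level
    out = []
    for kind, name in tokens:
        if kind == "open" and name == "table":
            depth += 1
        elif kind == "close" and name == "table":
            depth = max(0, depth - 1)
        elif kind in ("open", "close") and name in cell_tags and depth == 0:
            # Assumption: outside any table, emit the tag as literal text.
            slash = "/" if kind == "close" else ""
            out.append(("text", f"<{slash}{name}>"))
            continue
        out.append((kind, name))
    return out


stream = [
    ("open", "td"), ("text", "x"),
    ("open", "table"), ("open", "tr"), ("close", "tr"), ("close", "table"),
    ("close", "th"),
]
patched = patch_table_tokens(stream)
```

Passing `top_level=False` returns the stream unchanged, mirroring the flag that resetState hands to the patcher.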
Bug: 66489
Bug: 66498
Change-Id: I9f42354ea9efb0f8adfc96c23760012220d00dd4
The failure looks like it was caused by
Ib7aa9449bbd994cb23b83b3f23cff944b1cddadf in core. It's not a regression,
just different-looking but equally valid results.
Change-Id: Icfe02eaf7f2a2d8273e973442e04006b6024684d
We've had Parser::recursiveTagParse since MediaWiki 1.8, back in 2006.
Remove code that only gets used if it's not available.
Change-Id: I76eed5570a675a14cf70ab10981661e0bc8bda99
The Cite extension does not currently handle resetState calls in
sub-pipelines, and relies on sharing a single Cite instance between all
pipelines. Fixing this is a longer project, so this patch works around the
issue for now by passing a flag indicating resetState calls in sub-pipelines
and ignoring the call in Cite in that case.
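The workaround amounts to a guarded reset. A minimal sketch follows, with a hypothetical `Cite` class and a hypothetical `from_sub_pipeline` flag standing in for the real extension and the flag this patch passes:

```python
class Cite:
    """Hypothetical stand-in for the single shared Cite instance."""

    def __init__(self):
        self.refs = []

    def add_ref(self, text):
        """Record a reference and return its 1-based number."""
        self.refs.append(text)
        return len(self.refs)

    def reset_state(self, from_sub_pipeline=False):
        # All pipelines share this instance, so a reset issued by a
        # sub-pipeline would wipe refs collected by the main pipeline.
        if from_sub_pipeline:
            return
        self.refs = []


cite = Cite()
cite.add_ref("first")
cite.reset_state(from_sub_pipeline=True)   # ignored, refs are kept
kept = list(cite.refs)
cite.reset_state()                         # top-level reset clears state
```

The longer-term fix mentioned above would give each pipeline its own state instead of relying on this flag.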
Change-Id: If3d426a5311a55d1c1530860d2b665d3681f1aa9
A performance issue was fixed in the shim(s) generated by
generateJsonI18n.php, so it needed to be updated.
Change-Id: I1f0ddf131ded163fa38afdf95fd92ce8c71f22b2
* Entities in ref names weren't expected
* Fixes the crash from arwiki:تأثير_الدمعة_السوداء
* Makes use of the fix from de3642b8dd4a804ac654f2943a900496f2c8b3f3
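One plausible way to tolerate entities in a ref name is to decode them before the name is used as a lookup key. The sketch below illustrates only that general idea and is not necessarily what this patch does; `normalize_ref_name` is a hypothetical helper.

```python
import html


def normalize_ref_name(name):
    """Decode HTML entities so equivalent names share one key."""
    return html.unescape(name).strip()


key = normalize_ref_name("Smith &amp; Jones")
```

With this, `Smith &amp; Jones` and `Smith & Jones` resolve to the same reference group key.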
Bug: 63790
Change-Id: Icb8781b4d9decc5a8b115d0b11def4d18f5d5025
* Thus far, <references> tag content was being parsed to
stage 2 and merged into the main pipeline. This patch takes
it all the way to DOM. This required some tweaks to the
handling of <ref>s nested inside <references>.
* Fixed up a buggy parser test in the bargain -- the old Parsoid
result was buggy as well. I verified the output in the enwp
sandbox.
Change-Id: Iff6c528066b71ce1b00dd769910a04ee66623340