mediawiki-extensions-Cite/lib
Subramanya Sastry 6c503e8973 (Bug 64901) Fix paragraph-wrapping to match PHP parser + Tidy combo
General changes
---------------
* Replaced the hacky 'inBlockNode' parser pipeline option with
  a cleaner 'noPWrapping' option that suppresses paragraph wrapping
  in sub-pipelines (e.g. recursive link content, ref tags, attribute
  content, etc.).

Changes to wt2html pipeline
---------------------------
* Fixed paragraph-wrapping code to ensure that there are no bare
  text nodes left behind, but without removing the line-based block-tag
  influences on p-wrapping. Some simplifications as well.

  TODO: There are still some discrepancies around <blockquote>
  p-wrapping behavior. These will be investigated and addressed
  in a future patch.

* Fixed foster parenting code to ensure that fostered content is
  added in p-tags where necessary rather than span-tags.
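
A minimal sketch of the "no bare text nodes" rule together with the
noPWrapping option, for illustration only. It assumes a simplified model
in which nodes are plain objects rather than DOM nodes; pWrap, isBlock
and BLOCK_TAGS are hypothetical names, not Parsoid code. The line-based
block-tag influences mentioned above are not modeled here.

```javascript
// Hypothetical model: nodes are plain objects, not real DOM nodes.
// { name: '#text', text: '...' } is a text node; other nodes carry a tag name.
const BLOCK_TAGS = new Set(['p', 'div', 'ul', 'ol', 'table', 'blockquote', 'pre']);

function isBlock(node) {
  return BLOCK_TAGS.has(node.name);
}

// Wrap every run of consecutive non-block nodes in a <p> so that no bare
// text node is left at the top level. With noPWrapping set (as a
// sub-pipeline might do), the nodes are returned unwrapped.
function pWrap(nodes, options) {
  if (options && options.noPWrapping) {
    return nodes.slice();
  }
  const out = [];
  let openP = null;
  for (const node of nodes) {
    if (isBlock(node)) {
      openP = null; // a block node ends the current paragraph
      out.push(node);
    } else {
      if (openP === null) {
        openP = { name: 'p', children: [] };
        out.push(openP);
      }
      openP.children.push(node);
    }
  }
  return out;
}

const nodes = [
  { name: '#text', text: 'a' },
  { name: 'div', children: [] },
  { name: '#text', text: 'b' },
  { name: 'i', children: [] },
];
const wrapped = pWrap(nodes, {});
const unwrapped = pWrap(nodes, { noPWrapping: true });
console.log(wrapped.map((n) => n.name).join(' '));   // p div p
console.log(unwrapped.map((n) => n.name).join(' ')); // #text div #text i
```

In this model the text "a" ends up in its own p-wrapper, and "b" shares
a p-wrapper with the inline <i> that follows it.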

Changes to html2wt/selser pipeline
----------------------------------
* Fixed DOMDiff to tag mw:DiffMarker nodes with an is-block-node
  attribute when the deleted node is a block node. This is used
  during selective serialization to discard original separators
  between adjacent p-nodes if either of their neighbors is a
  deleted block node.

* Fixed serialization to account for changes to p-wrapping.
  - Updated tag handlers for the <p> tag.
  - Updated separator handling to deal with deleted block tags
    and their influence on separators around adjacent p-tags.
  - Updated selser output code to test whether a deleted block
    tag forces nowiki escaping on unedited content from adjacent
    p-tags.
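
The DiffMarker tagging and the separator check described above can be
sketched roughly as follows. This is an illustration under assumptions,
not the actual DOMDiff or serializer code: siblings are modeled as plain
objects, and makeDiffMarker, isDeletedBlockMarker and
mustDiscardOriginalSeparator are hypothetical names. Only the
mw:DiffMarker type and the is-block-node attribute come from the
description above; how the attribute is stored on the marker is assumed.

```javascript
// Hypothetical set of block-level tag names used by this sketch.
const BLOCK_TAGS = new Set(['p', 'div', 'ul', 'ol', 'table', 'blockquote', 'pre']);

// Build a marker that stands in for a deleted node. The marker is tagged
// with is-block-node only when the deleted node was a block node.
function makeDiffMarker(deletedNode) {
  const marker = { name: 'meta', attrs: { typeof: 'mw:DiffMarker' } };
  if (BLOCK_TAGS.has(deletedNode.name)) {
    marker.attrs['is-block-node'] = 'true';
  }
  return marker;
}

function isDeletedBlockMarker(node) {
  return Boolean(node) &&
    node.name === 'meta' &&
    node.attrs.typeof === 'mw:DiffMarker' &&
    node.attrs['is-block-node'] === 'true';
}

// For the <p> at `index`, report whether either neighbor is a deleted
// block node. In that case the original separator is not reused.
function mustDiscardOriginalSeparator(siblings, index) {
  const node = siblings[index];
  if (!node || node.name !== 'p') {
    return false;
  }
  return isDeletedBlockMarker(siblings[index - 1]) ||
    isDeletedBlockMarker(siblings[index + 1]);
}

// <p>a</p><div>x</div><p>b</p> with the <div> deleted.
const afterBlockDelete = [
  { name: 'p', attrs: {} },
  makeDiffMarker({ name: 'div' }),
  { name: 'p', attrs: {} },
];
// Same shape, but the deleted node was an inline <span>.
const afterInlineDelete = [
  { name: 'p', attrs: {} },
  makeDiffMarker({ name: 'span' }),
  { name: 'p', attrs: {} },
];
console.log(mustDiscardOriginalSeparator(afterBlockDelete, 0));  // true
console.log(mustDiscardOriginalSeparator(afterInlineDelete, 0)); // false
```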

Changes to parser tests / test setup
------------------------------------
* Tweaked selser test generation to ensure that text nodes are always
  inserted in p-wrappers where necessary.

* Updated parser test output for several tests to introduce p-tags
  instead of span-tags or missing p-tags, add html/parsoid section,
  or in one case, add missing HTML output.

Parser Test Result changes
--------------------------
Newly passing
- 12 wt2html
- 1 wt2wt
- 3 html2html
- 3 html2wt

Newly failing
- 1 html2wt

  "3. Leading whitespace in indent-pre suppressing contexts should not be escaped"
  This is just normalization of output where multiple HTML forms
  serialize to the same wikitext with a newline difference. It is not
  worth the complexity to fix this.

- 1 wt2wt

  "Trailing newlines in a deep dom-subtree that ends a wikitext line"
  This is again normalization during serialization where an extra
  unnecessary newline is introduced.

- A bunch of selser test changes.
  182 failures added, 188 removed => 6 fewer selser failures

  - That is a lot of changes to sift through, and I didn't look at every
    one of those, but a number of changes seem to be harmless, and just
    a change to previously "failing" tests.

  - "Media link with nasty text" test seems to have a lot of selser
    changes, but the HTML generated by Parsoid seems to be "buggy" with
    interesting DSR values as well. That needs investigation separately.

  - "HTML nested bullet list, closed tags (bug 5497) [[3,3,4,[0,1,4],3]]"
    has seen a degradation where a dirty diff got introduced.
    Haven't investigated carefully why that is so.

Change-Id: Ia9c9950717120fbcd03abfe4e09168e787669ac4
2014-09-16 11:59:55 -05:00
ext.Cite.js (Bug 64901) Fix paragraph-wrapping to match PHP parser + Tidy combo 2014-09-16 11:59:55 -05:00