These changes ensure that DiscussionTools is independent of DOM
library choice, and will not break if/when Parsoid switches to an
alternate (more standards-compliant) DOM library.
We run `phan` against the Dodo standards-compliant DOM library,
so this ends up flagging uses of non-standard PHP extensions to
the DOM. These will be suppressed for now with a "Nonstandard DOM"
comment that can be grepped for, since they will eventually
will need to be rewritten or worked around.
Most frequent issues:
* Node::nodeValue and Node::textContent and Element::getAttribute()
can return null in a spec-compliant implementation. Add `?? ''` to
make spec-compliant results consistent w/ what PHP returns.
* DOMXPath doesn't accept anything except DOMDocument. These uses
should be replaced with DOMCompat::querySelectorAll() or similar
(which end up using DOMXPath under the covers for DOMDocument any way,
but are implemented more efficiently in a spec-compliant
implementation).
* A couple of times we have code like:
`while ($node->firstChild!==null) { $node = $node->firstChild; }`
and phan's analysis isn't strong enough to determine that $node is still
non-null after the while. This same issue should appear with DOMDocument
but phan doesn't complain for some reason.
One apparently legit issue:
* Node::insertBefore() is once called in a funny way which leans on
the fact that the second option is optional in PHP. This seems to be
a workaround for an ancient PHP bug, and can probably be safely
removed.
Bug: T287611
Bug: T217867
Change-Id: I3c4f41c3819770f85d68157c9f690d650b7266a3
A TreeWalker ends up walking potentially every single subsequent
node in the document looking for a target node. Instead use upwards
traversal to find a common ancestor, then sibling traversal to
compare document order.
This makes calling cloneContents on every comment on a 300k talk page
significantly faster, going from >30s to 500ms locally.
Change-Id: I28a2b8c11d4098d9bc44d19b98e19ccc02273098
If A follows B, then we can assume that B does not follow A.
Calling the function recursively computes that twice,
we can instead make some simple changes to "invert" the result.
Change-Id: I709aca7cb997dd2fe3980468a8c6bde6f366fb5b
It's an expensive method, and we previously called it for
every child of the common ancestor, completely unnecessarily.
These changes follow from two observations:
* If there is a $firstPartiallyContainedChild, then the
first fully contained child must follow it; similarly,
if there is a $lastPartiallyContainedChild, then the
last fully contained child must precede it.
* All nodes between the first and last fully contained
children are also fully contained.
Maybe it can be made cleverer still, but it's a lot better.
Change-Id: I4e596c62274c2c0be115f0ddec42629115b430a4
We can check whether a node is a child of another node directly,
without iterating over all its children.
Change-Id: I3a26df89365bf765348d96b477c983ec9c4e43fe