We always do our processing in the parser now, so we don't need the
marker comment to detect whether we've already processed the page.
Bonus: include the time taken by our processing in the limit report.
Bug: T291831
Change-Id: Ife7ddffbad1b1495b004739212002a98fdebe6c0
This DOMDocument property has no effect, because we do not use
DOMDocument methods for parsing HTML, but rather DOMUtils::parseHTML()
provided by Parsoid.
Change-Id: I1d9e73e53f2d44f41cf9dcda4f06ac8647671096
Use `DOMCompat::getBody( ... )` as a nicer getter than
`->getElementsByTagName( 'body' )->item( 0 )`.
Remove overly defensive checks and redundant annotations on its
return value. Since we're dealing with HTML documents throughout,
the document body is guaranteed to exist.
We previously needed some of them to convince Phan when it thought
the body may be null, but this seems to no longer be needed.
Change-Id: If7aee7b6adbfa78269c7ba28b26a6eaa21fe935b
These changes ensure that DiscussionTools is independent of DOM
library choice, and will not break if/when Parsoid switches to an
alternate (more standards-compliant) DOM library.
We run `phan` against the Dodo standards-compliant DOM library,
so this ends up flagging uses of non-standard PHP extensions to
the DOM. These will be suppressed for now with a "Nonstandard DOM"
comment that can be grepped for, since they will eventually
will need to be rewritten or worked around.
Most frequent issues:
* Node::nodeValue and Node::textContent and Element::getAttribute()
can return null in a spec-compliant implementation. Add `?? ''` to
make spec-compliant results consistent w/ what PHP returns.
* DOMXPath doesn't accept anything except DOMDocument. These uses
should be replaced with DOMCompat::querySelectorAll() or similar
(which end up using DOMXPath under the covers for DOMDocument any way,
but are implemented more efficiently in a spec-compliant
implementation).
* A couple of times we have code like:
`while ($node->firstChild!==null) { $node = $node->firstChild; }`
and phan's analysis isn't strong enough to determine that $node is still
non-null after the while. This same issue should appear with DOMDocument
but phan doesn't complain for some reason.
One apparently legit issue:
* Node::insertBefore() is once called in a funny way which leans on
the fact that the second option is optional in PHP. This seems to be
a workaround for an ancient PHP bug, and can probably be safely
removed.
Bug: T287611
Bug: T217867
Change-Id: I3c4f41c3819770f85d68157c9f690d650b7266a3
For compatibility with Parsoid's document abstraction (Parsoid may
switch to an alternate DOM library in the future), don't explicitly
create a new document object using `new DOMDocument`; instead use
the Parsoid wrapper `DOMCompat::newDocument()`. This ensures that
the Document object created will be compatible with Parsoid.
There are a number of other subtle dependencies on the PHP `dom`
extension in DiscussionTools, like explicit `instanceof` tests; those
will be tweaked in a follow-up patch
(I3c4f41c3819770f85d68157c9f690d650b7266a3) since they do not affect
correctness so long as Parsoid is aliasing Document to a subclass of
the built-in DOMDocument. Similarly, the Phan warnings we suppress
do not cause runtime errors (because of the fixes included in
c5265341afd9efde6b54ba56dc009aab88eff83c) but phan will be happier
once the follow-up patch lands and aligns all the DOM types.
Bug: T287611
Depends-On: If0671255779571a91d3472a9d90d0f2d69dd1f7d
Change-Id: Ib98bd5b76de7a0d32a29840d1ce04379c72ef486
The regexps needs to be non-greedy, otherwise it could swallow up a
large chunk of the page (up to the next comment).
I noticed this when adding tests for this code, in the 'unclosed-font'
test case (Ief9648b8805fadcc170c54b627eb669cc8b907b6).
Change-Id: I5f67a9599b0cb07bdd53abeebac9ada221181b66
We added it because the initial designs for the subscribe action were
much easier to implement like this, and topic "containers" (T269950)
would have required it.
However, the latest design of the subscribe action will not need it
(T279149), and topic containers are still very far away, so let's
remove it for now.
Bug: T280433
Change-Id: I21a23e9bea43f24d265750926fbd62b99038d3f1
Longer, but follows the style guide and less likely to conflict.
We need to account for init classes in the cache being around for
a while.
Change-Id: I738bc93393850db320fdbda2b003ca8ac40556da
This code expected $container->firstChild to be a
<div class="mw-parser-output">, but that element is not present
when we're running on HTML to be saved in parser cache.
We ended up inserting the marker inside whatever node was the
first on the page, and if it was a <style> element, both our
marker and the styles would be lost when serializing, like in
6c7a0ca9a2.
When we're running on final HTML, the marker will now be outside
of <div class="mw-parser-output">, but that seems to be fine. Only
early versions of I4e60fdbc098c1a74757d6e60fec6bcf8e5db37c1 had
problems with that (see comments on patchset 41), but it works now.
The added test case also covers the fix for T274709.
Bug: T275440
Change-Id: I38d45dd8686919be51e1d307ded12b0afe185eb5
* HookUtils:FEATURES lists all features
* CommentFormatter::USE_WITH_FEATURES are all features
which require the comment formatter
Change-Id: Idbbe8bdd910b9c7b23c7fee76af7bb7ee13c2759
Going forward this will allow us to remove the parser
cache split, and toggle features just using CSS.
The CSS will be modified in a later commit to give the
anon caches time to clear.
Bug: T273072
Change-Id: I83c84b8bc63e1881e07b49acd8499b811adfccd4
As CommentFormatter no longer needs HTMLFormatter, remove
the inheritance and make addReplyLinks a static method.
Testing locally this is marginally slower, going from 2.55s
to 2.9s for the CommentFormatterTest case.
Bug: T266317
Bug: T267973
Change-Id: If69749cae678a1647a138d782a32032189f55cec
Use the same logic for marking ranges in the document, and ensure
that the heading range does not include section edit links or
section numberings.
Change-Id: I782caafc34fee2a822b0a17b24dd6b9528202eca
Previously, only comments could have IDs, because we only needed IDs
for replying. But we might also use them for notifications soon.
Bug: T264478
Change-Id: I1bcad02bf17ab54bc5028a959543c10f0430836b
The output of CommentFormatter::addReplyLinks() and consequently
ThreadItem::jsonSerialize() can end up in the HTTP cache (Varnish) on
Wikimedia wikis. We need to consider that when changing that code.
Introduce a concept of legacy ID (generated by the older algorithm
after it changes), add some placeholder code that will generate them
in the future, and update some code to find comments by either normal
or legacy IDs.
Add dire comments in a bunch of places (as if that ever helps).
Bug: T264478
Change-Id: I4368f366800ab21b8b184b09378037614fdecd33