This avoids a duplicate parse with DiscussionTools (T376325) and also
reduces some redundancy by using the metrics-gathering code from
ParserOutput instead of having to clone it here. Finally, it allows
the parse to use the output of a previous parse for selective
update.
Bug: T376325
Follows-Up: I64a4556a74da4f735a5b562070c21310ecda36d1
Change-Id: I11386e307caaa9fce34870b08bd4dce4c5e6eb25
This ensures that all parsoid parses are accounted for in our
statistics. In the future we might want to query the cache for
an existing 'dirty' parse in this codepath to potentially allow
for selective update, but for now assume that selective updates
are not possible here.
Bug: T371713
Depends-On: I5b8c7ab48d5a1d6c1e311149fcac6abdc523aa13
Change-Id: I391e928175f60a1ff2e5c181e20ed72efe4dfd66
We don't need to directly handle the ParsoidParserFactory in the
LintUpdate job; use the existing ContentHandler pathways to reduce
dependencies.
Change-Id: I64a4556a74da4f735a5b562070c21310ecda36d1
I015fbe2ab613619c8805d12bfd397cc08450ef24 falsely assumed these were
ending up in logstash already but, unless explicitly asked, logstash
doesn't go below "info", regardless of the channel's level.
Change-Id: I55884a2535e839ca92d5d679cc4dc7911050f298
When RESTBase is turned off, Parsoid runs will no longer be triggered
on template changes. This creates a new mechanism to do that, based on
the RevisionDataUpdates hook called by DerivedPageDataUpdater. The new
behavior is controlled by a feature flag, LinterParseOnDerivedDataUpdate,
which is enabled per default. In WMF production, this should be
turned off as long as we are still triggering Parsoid parses through
the pregeneration mechanism in RESTBase.
Note that this will not write ParserOutput to the ParserCache. On edits,
pages will get parsed with Parsoid twice, once to trigger the lint data
update, and once by ParsoidCachePrewarmJob to populate the ParserCache.
Both parses will trigger the ParserLogLinterData hook, the lint data
from the second parse is redundant.
However, while ParsoidCachePrewarmJob and RevisionDataUpdates get
triggered together on edits, they also get triggered separately:
ParsoidCachePrewarmJob by page views with parser cache misses; and
RevisionDataUpdates when pages get invalidated due to template changes.
Because ParsoidCachePrewarmJob and RevisionDataUpdates generally get
triggered in different situations, it seems cleaner to keep the two
mechanisms independent of each other, and live with the duplicate parse
on edit.
Bug: T361013
Change-Id: If53841ee583ce240dd245d640b9ea9c97e1eaa55