mediawiki-extensions-Cite/src/Parsoid/RefProcessor.php
Arlo Breault 2f09cdb732 One document to rule them all
The description in T179082 suggests that, by using one document for the
entire parse, we'd probably see some performance gains from not having
to import nodes when we reach the top-level pipeline, and we'd avoid
the validation errors from 19a9c3c.

However, the spec seems to suggest creating a new document when parsing
an HTML fragment,
https://html.spec.whatwg.org/#html-fragment-parsing-algorithm

And, indeed, domino implements it that way,
12a5f67136/lib/htmlelts.js (L84-L96)
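
For context, that separate document is exactly what forces the import
step mentioned above: nodes owned by one document can't be attached to
another without being imported first. A minimal sketch using PHP's
stock DOM extension rather than Parsoid's DOM abstraction:

// Sketch only: plain DOMDocument, but the ownership rule is the same.
$fragmentDoc = new DOMDocument();
$fragmentDoc->loadHTML( '<p>built in a fresh document</p>' );
$node = $fragmentDoc->getElementsByTagName( 'p' )->item( 0 );

$topLevelDoc = new DOMDocument();
$topLevelDoc->loadHTML( '<html><body></body></html>' );
$body = $topLevelDoc->getElementsByTagName( 'body' )->item( 0 );

// Appending $node directly throws a "Wrong Document" DOMException;
// it has to be copied into the owning document first.
$body->appendChild( $topLevelDoc->importNode( $node, true ) );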

So, the request in T217705 may be a little misguided.

What, then, is this patch good for? In T221790 the ask is that
sub-pipelines produce DocumentFragments, which make for cleaner
interfaces and less confusion when migrating children.

The general outline here is that a document is created when the
environment is constructed, which gives us the 1-1 correspondence.
Sub-pipelines still create their own documents for the purpose of tree
building, as in the fragment parsing algorithm, but their contents are
then immediately imported into DocumentFragments to be used for the
rest of the post-processing passes.
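
Roughly, and with hypothetical names rather than Parsoid's actual
interfaces, that outline looks like:

// Illustrative sketch of the outline above; not Parsoid's real API.
$envDoc = new DOMDocument(); // one document, created with the environment

// A sub-pipeline still builds its tree in its own document, per the
// fragment parsing algorithm...
$pipelineDoc = new DOMDocument();
$pipelineDoc->loadHTML( '<span>sub-pipeline output</span>' );
$parsed = $pipelineDoc->getElementsByTagName( 'span' )->item( 0 );

// ...but the result is immediately imported into a DocumentFragment
// owned by the environment's document, and all later post-processing
// passes see only that fragment.
$df = $envDoc->createDocumentFragment();
$df->appendChild( $envDoc->importNode( $parsed, true ) );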

Bug: T221790
Bug: T179082
Bug: T217705
Change-Id: Idf856d4e071d742ca38486c8ab402e39b3c8949f
2020-09-29 22:36:33 +00:00

<?php
declare( strict_types = 1 );

namespace Wikimedia\Parsoid\Ext\Cite;

use DOMNode;
use Wikimedia\Parsoid\Ext\DOMProcessor;
use Wikimedia\Parsoid\Ext\ParsoidExtensionAPI;

/**
 * wt -> html DOM PostProcessor
 */
class RefProcessor extends DOMProcessor {

	/**
	 * @inheritDoc
	 */
	public function wtPostprocess(
		ParsoidExtensionAPI $extApi, DOMNode $node, array $options, bool $atTopLevel
	): void {
		if ( $atTopLevel ) {
			// Run only once the full page DOM is assembled: collect the
			// <ref>s and build the reference lists...
			$refsData = new ReferencesData();
			References::processRefs( $extApi, $refsData, $node );
			// ...then emit a references list for any refs that weren't
			// consumed by an explicit <references/> section.
			References::insertMissingReferencesIntoDOM( $extApi, $refsData, $node );
		}
	}

	// FIXME: should implement an htmlPreprocess method as well.
}
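
For reference, a processor like this gets picked up through the
extension's Parsoid module configuration. A sketch assuming the
standard Wikimedia\Parsoid\Ext\ExtensionModule convention; the actual
Cite module class may be laid out differently:

<?php
// Sketch only: assumes Parsoid's ExtensionModule convention.
use Wikimedia\Parsoid\Ext\ExtensionModule;

class Cite implements ExtensionModule {
	/** @inheritDoc */
	public function getConfig(): array {
		return [
			'name' => 'Cite',
			// RefProcessor::wtPostprocess() runs as one of these passes.
			'domProcessors' => [ RefProcessor::class ],
		];
	}
}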