* Tokens are now immutable. The progress of transformations is tracked on
chunks instead of tokens. Tokenizer output is cached and can be directly
returned without a need for cloning. Transforms are required to clone or
newly create tokens they are modifying.
* Expansions per chunk are now shared between equivalent frames via a cache
stored on the chunk itself. Equivalence of frames is not yet ideal though,
as right now a hash tree of *unexpanded* arguments is used. This should be
switched to a hash of the fully expanded local parameters instead.
* There is now a vastly improved maybeSyncReturn wrapper for async transforms
that either forwards processing to the iterative transformTokens if the
current transform is still ongoing, or manages a recursive transformation if
needed.
* Parameters for parser functions are now wrapped in abstract Params and
ParserValue objects, which support some handy on-demand *value* expansions.
Keys are always expanded. Parser functions are converted to use these
interfaces, and now properly expand their values in the correct frame.
Making this expansion lazier is certainly possible, but would complicate
transformTokens and other token-handling machinery. Need to investigate if
it would really be worth it. Dead branch elimination is certainly a bigger
win overall.
* Complex recursive asynchronous expansions should now be closer to correct
for both the iterative (transformTokens) and recursive (maybeSyncReturn
after transformTokens has returned) code paths.
* Performance degraded slightly. There are no micro-optimizations done yet
and the shared expansion cache still has a low hit rate. The progress
tracking on chunks is not yet perfect, so there are likely a lot of unneeded
re-expansions that can be easily eliminated. There is also more debug
tracing right now. Obama currently expands in 54 seconds on my laptop.
Change-Id: I4a603f3d3c70ca657ebda9fbb8570269f943d6b6
* less verbose logging in noinclude processing and template expansion
* Give priority to the processing of templates transcluded from transclusions
to get closer to depth-first processing. This serves to minimize memory
usage from queued-up tokens.
* Increase the maximum outstanding requests per template retrieval. 10000
amazingly proved too low a limit on some big pages.
* Only process a single template request callback at a time for now
* Add a debug print in the treebuilder wrapper
* Don't treat multiple comments on a single line as a single comment to match
the PHP parser's behavior
Change-Id: I9a86b6d7bec3b9e1f17415daf1bf74170240721a
* add past paths for empty arguments etc
* cache attribute token transform pipelines
* fix bugs in TokenCollector and NoIncludeOnly handler, and improve its
efficiency by only registering for 'end' tokens on demand
* Remove empty reset methods from a few handlers
* Add a simple 'ap' debug print function that makes it easy to only print some
debug prints by temporarily changing 'dp' to 'ap'
* Improvements and bug fixes in AttributeExpander
Change-Id: Ie69729c8f62d48bba922712e44ebce484c621c50
* Convert isNoInclude logic to positive isInclude throughout and set it
properly on attribute pipelines. Also don't cache non-include pipelines.
* Add a --pagename parameter to parse.js, which sets the page name in the
environment. This is then returned by {{PAGENAME}}. Not the final solution,
but useful for taxobox testing as taxons are selected based on PAGENAME.
* Add rudimentary pagenamebase parser function
Change-Id: If9c0be4c255200d0f2a30f02e5619437b4fd8f12
serialized into a single data-mw-rt attribute if present. Update parserTests
to ignore this attribute for comparisons with expected parser output.
A few more tweaks and notes are thrown into this commit too. 233 tests are
passing now.
improvements to parser functions on the way to support the cite extensions.
Preparation for generic template and template arg in attribute support. 222
parser tests now passing.