wikimedia/mediawiki-extensions-VisualEditor - fanwikis.org Git Server

mirror of https://gerrit.wikimedia.org/r/mediawiki/extensions/VisualEditor synced 2024-09-27 20:26:46 +00:00

Author	SHA1	Message	Date
Gabriel Wicke	d918fa18ac	Big token transform framework overhaul part 2 * Tokens are now immutable. The progress of transformations is tracked on chunks instead of tokens. Tokenizer output is cached and can be directly returned without a need for cloning. Transforms are required to clone or newly create tokens they are modifying. * Expansions per chunk are now shared between equivalent frames via a cache stored on the chunk itself. Equivalence of frames is not yet ideal though, as right now a hash tree of unexpanded arguments is used. This should be switched to a hash of the fully expanded local parameters instead. * There is now a vastly improved maybeSyncReturn wrapper for async transforms that either forwards processing to the iterative transformTokens if the current transform is still ongoing, or manages a recursive transformation if needed. * Parameters for parser functions are now wrapped in abstract Params and ParserValue objects, which support some handy on-demand value expansions. Keys are always expanded. Parser functions are converted to use these interfaces, and now properly expand their values in the correct frame. Making this expansion lazier is certainly possible, but would complicate transformTokens and other token-handling machinery. Need to investigate if it would really be worth it. Dead branch elimination is certainly a bigger win overall. * Complex recursive asynchronous expansions should now be closer to correct for both the iterative (transformTokens) and recursive (maybeSyncReturn after transformTokens has returned) code paths. * Performance degraded slightly. There are no micro-optimizations done yet and the shared expansion cache still has a low hit rate. The progress tracking on chunks is not yet perfect, so there are likely a lot of unneeded re-expansions that can be easily eliminated. There is also more debug tracing right now. Obama currently expands in 54 seconds on my laptop. Change-Id: I4a603f3d3c70ca657ebda9fbb8570269f943d6b6	2012-05-15 17:05:47 +02:00
Adam Wight	0a7f0b7630	List markup is created during the sync23 phase. This makes it possible to transclude list items from a template. Note: "5 quotes" test is broken by this patch, it appears that ListHandler newline processing is changing some state which mysteriously affects the QuoteTransformer. This is ominous, hopefully there's a simple explanation... gwicke: fix a bug in tokenizer triggered by definition lists like this: **; foo : bar Change-Id: I4e3a86596fe9bffcbfc4bf22895362c3bf742bad	2012-05-08 11:39:36 +02:00
Gabriel Wicke	6e21f6bb27	Forward-port Cite extension * Adapted Cite extension to use current interfaces and token formats * Improved TokenCollector Change-Id: I20419b19edd9bbad2c2abf17a2ff1411b99c0c04	2012-05-03 13:22:01 +02:00
Gabriel Wicke	027d77e0c9	Fix --wikidom and --linearmodel parse.js options; retry on template fetch failures Change-Id: I444397936fd87971fe085df4b467089367e9ffa6	2012-04-26 19:51:00 +02:00
Gabriel Wicke	8ff810659a	Rename text/wiki and tokens/wiki to text/x-mediawiki and similar Change-Id: I70113629f4633685cd6db3914303a15e4c79a50a	2012-04-25 20:19:43 +02:00
Gabriel Wicke	814511f523	Remove dead parser pipeline code Change-Id: I802f1798d5163c1ce82d648f739c2e79b17eda41	2012-04-25 17:12:32 +02:00
Gabriel Wicke	8368e17d6a	Biggish token transform system refactoring * All parser pipelines including tokenizer and DOM stuff are now constructed from a 'recipe' data structure in a ParserPipelineFactory. * All sub-pipelines of these can now be cached * Event registrations to a pipeline are directly forwarded to the last pipeline member to save relatively expensive event forwarding. * Some APIs for on-demand expansion / format conversion of parameters from parser functions are added: param.to('tokens/expanded', cb) param.to('text/wiki', cb) (this does not work yet) All parameters are additionally wrapped into a Param object that provides methods for positional parameter naming (.named()) or conversion to a dict (.dict()). * The async token transform manager is now separated from a frame object, with the frame holding arguments, an on-demand expansion method and loop checks. * Only keys of template parameters are now expanded. Parser functions or template arguments trigger an expansion on-demand. This (unsurprisingly) makes a big performance difference with typical switch-heavy template systems. * Return values from async transforms are no longer used in favor of plain callbacks. This saves the complication of having to maintain two code paths. A trick in transformTokens still avoids the construction of unneeded TokenAccumulators. * The results of template expansions are no longer buffered. * 301 parser tests are passing Known issues: * Cosmetic cleanup remains to be done * Some parser functions do not support async expansions yet, and need to be modified. Change-Id: I1a7690baffbe8141cadf67270904a1b2e1df879a	2012-04-25 16:51:36 +02:00
Gabriel Wicke	bf84638bc0	Add tokenizer cache and clone token state on mutation * Added an LRU cache (using the lru-cache node module) for tokenizer output * Mutation of nested attributes now replaces the containers. A shallow copy of tokens is sufficient to isolate token transformations. Need to investigate if we can actually get away without isolation and re-transformation for most ordinary tokens. Change-Id: I9136b1d7a1fbcc538183a319d4ecaa290d616fdf	2012-04-18 14:40:47 +02:00
Gabriel Wicke	c688b039de	Collected tweaks * less verbose logging in noinclude processing and template expansion * Give priority to the processing of templates transcluded from transclusions to get closer to depth-first processing. This serves to minimize memory usage from queued-up tokens. * Increase the maximum outstanding requests per template retrieval. 10000 amazingly proved too low a limit on some big pages. * Only process a single template request callback at a time for now * Add a debug print in the treebuilder wrapper * Don't treat multiple comments on a single line as a single comment to match the PHP parser's behavior Change-Id: I9a86b6d7bec3b9e1f17415daf1bf74170240721a	2012-04-16 15:47:03 +02:00
Gabriel Wicke	5bb2d96869	Token stream transform improvements * add fast paths for empty arguments etc * cache attribute token transform pipelines * fix bugs in TokenCollector and NoIncludeOnly handler, and improve its efficiency by only registering for 'end' tokens on demand * Remove empty reset methods from a few handlers * Add a simple 'ap' debug print function that makes it easy to only print some debug prints by temporarily changing 'dp' to 'ap' * Improvements and bug fixes in AttributeExpander Change-Id: Ie69729c8f62d48bba922712e44ebce484c621c50	2012-04-12 15:42:09 +02:00
Gabriel Wicke	3124deca2c	Track inclusion status on CachedTokenPipeline Non-include attribute pipelines are not cached for now. Adding separate caching for non-include attribute pipelines is very likely worth it, but deferred for now. Change-Id: I13f949d9f0a04536f9ccfcb73a2be69c5c08be01	2012-04-12 10:21:50 +02:00
Gabriel Wicke	efa41370d3	Set inclusion flag for attribute transform managers too Change-Id: Ice15d8fde6de4a3e850a028db9917e976218fc43	2012-04-11 21:55:52 +02:00
Gabriel Wicke	9ae572cca0	Fixes to template expansion / token transform managers, 296 tests passing. * Convert isNoInclude logic to positive isInclude throughout and set it properly on attribute pipelines. Also don't cache non-include pipelines. * Add a --pagename parameter to parse.js, which sets the page name in the environment. This is then returned by {{PAGENAME}}. Not the final solution, but useful for taxobox testing as taxons are selected based on PAGENAME. * Add rudimentary pagenamebase parser function Change-Id: If9c0be4c255200d0f2a30f02e5619437b4fd8f12	2012-04-11 16:34:27 +02:00
Adam Wight	a85ed36efa	"magic words" are tokenized and used to set parser.environment flags behavior switches are converted to tokens which set parser.environment flags during the async transformation stage. The next step would be for handlers in the sync23 stage to generate the TOC, section edit links, and so on according to these directives. No tests written, because the switches are consumed and don't appear in rendered html. We can test the magic word layout controls individually, once they're implemented. Another small change was to store option flags directly in the environment object, not that it makes much difference. Change-Id: I863fbf4be1a17d2f6c31158298dd301f19ae1137	2012-04-04 11:25:29 -07:00
Catrope	8dc994f037	Add HTML DOM -> linear model converter Also, in ParserPipeline: * Import the LM converter and expose it through getLinearModel() * Fix getWikiDom() to actually work (still unused) In parse.js: * Add --help option that prints usage information (was unreachable) * Add --linearmodel option to output linear model JSON instead of HTML Change-Id: Ic534e03ff40a7c9117bb63f0c635a4213d5e3406	2012-03-29 12:47:14 -07:00
Gabriel Wicke	f157093a41	Delegate responsibility for resetting the token rank to transforms, if full re-processing in a phase is wanted. By default, after a token type change or the return of multiple tokens only the remaining transforms with higher ranks are applied. Updated a few comments as well.	2012-03-07 19:29:53 +00:00
Gabriel Wicke	1f8c43b9e2	A few minor documentation updates.	2012-03-07 18:42:26 +00:00
Gabriel Wicke	af03eb4f29	Improve generic attribute expansion before external link processing, and make wgUploadPath configurable. Also change the hard-coded fall-back image sizes to sensible defaults. This breaks three parser tests until image size retrieval from the wiki is implemented.	2012-03-06 18:02:35 +00:00
Gabriel Wicke	7f7202e89c	A few improvements to external link and image handling. 264 tests passing.	2012-03-05 15:34:27 +00:00
Gabriel Wicke	4b9bd45b82	Start to move wikilink expansion to a separate async token transformer.	2012-02-29 13:56:29 +00:00
Gabriel Wicke	b8bb503199	Actually commit onlyinclude, as already announced in r112592.	2012-02-28 13:24:35 +00:00
Gabriel Wicke	491ad5ffef	Cleanup and commenting.	2012-02-22 13:13:18 +00:00
Gabriel Wicke	ffec77273a	Comment and minor code tweaks.	2012-02-21 11:24:20 +00:00
Gabriel Wicke	5806705733	Push transformer setup a bit further into the attribute pipeline.	2012-02-20 12:56:00 +00:00
Gabriel Wicke	71e95bd54b	Set up token stream transformers from a map of phases per input content type. Not yet applied to attribute pipeline creation. 249 tests passing.	2012-02-20 11:07:21 +00:00
Gabriel Wicke	001194b140	Replace console.log with console.warn in all debug statements	2012-02-14 20:56:14 +00:00
Gabriel Wicke	6983481561	Move attribute expansion back to separate handler, as this makes it easier to only expand used branches selected by parser functions. Template (and -argument) expansion is simply registered before general expansion. Additionally, a few more simple time-based magic words are added in ParserFunctions.	2012-02-09 13:44:20 +00:00
Gabriel Wicke	1f6db903e9	Pluck a few low-hanging fruit in external link tokenization, and add a simple localurl parser function implementation. 230 parser tests now passing.	2012-02-07 10:28:23 +00:00
Gabriel Wicke	53bf4f2bd0	Temporarily disable the sanitizer and start to support preprocessor functionality (comments, templates, template arguments) in arbitrary attributes. The grammar for this is still quite rough, will need to consolidate that area.	2012-02-06 19:15:44 +00:00
Gabriel Wicke	14a8a13678	A few more debug helpers including a --trace mode for light debugging. Some improvements to parser functions on the way to support the cite extensions. Preparation for generic template and template arg in attribute support. 222 parser tests now passing.	2012-01-31 16:50:16 +00:00
Gabriel Wicke	7cd94df47d	A few minor tweaks to reduce memory usage	2012-01-27 13:32:44 +00:00
Gabriel Wicke	1a6546fbca	Support empty template arguments and default values in arg expansion	2012-01-21 03:03:33 +00:00
Gabriel Wicke	145df2655c	* NoInclude and IncludeOnly improvements * Tokenizer support for templates and template args in template arguments and titles * Async attribute expansion fixes	2012-01-20 22:02:23 +00:00
Gabriel Wicke	348cac6cf0	Fix a bug in TokenCollector, and misc tweaks for template expansions.	2012-01-20 18:47:17 +00:00
Gabriel Wicke	fc2088bb21	Add some rudimentary noinclude / includeonly support and fix up TokenCollector.	2012-01-20 01:46:16 +00:00
Gabriel Wicke	2233d0a488	Eventify parser tests and parse.js commandline wrapper to actually allow async template fetching. Async expansion is not yet fully debugged, but at least the preconditions for that are now there.	2012-01-18 23:46:01 +00:00
Gabriel Wicke	14e6728cc4	Add the start of a minimal sanitizer stage, that only strips IDN ignored characters from host portions of link hrefs for now. This module needs to be filled up with pretty much everything Sanitizer.php does, including tag and attribute whitelists and attribute value sanitation (especially for style attributes). We'll also need to think about round-tripping of sanitized tokens.	2012-01-18 01:42:56 +00:00
Gabriel Wicke	e7381da5b8	Trim whitespace off template titles and argument names. 209 parser tests now passing.	2012-01-17 23:18:33 +00:00
Gabriel Wicke	f50fecf1e3	Fix template argument expansion. 200 parser tests now passing.	2012-01-17 22:29:26 +00:00
Gabriel Wicke	6bd7ca1e75	Misc improvements, now 196 parser tests passing. * Add handler for post-expand paragraph wrapping on token stream, to handle things like comments on their own line post-expand * Add general Util module * Fix self-closing tag handling in HTML5 tree builder	2012-01-17 18:22:10 +00:00
Gabriel Wicke	f4081bef08	First template expansion tests start working, and a bug fix in DOMPostProcessor paragraph wrapper. 187 parser tests now passing.	2012-01-14 00:58:20 +00:00
Gabriel Wicke	196d704e8e	Template expansion now enabled and somewhat working, but template fetching still fails all the time.	2012-01-13 18:48:25 +00:00
Gabriel Wicke	32c9bccd7c	Results of early template expansion debugging. Still disabled by default, but getting closer.	2012-01-11 19:48:49 +00:00
Gabriel Wicke	6b6ec2933d	More work towards template expansion. * Created AttributeTokenTransformManager for generic attribute conversion, and removed { title, template argument {key, value} } expansion from TemplateHandler. * Added caching for attribute and input sub-pipelines. Especially attribute pipelines would otherwise be recreated for each attribute value and key.	2012-01-11 00:05:51 +00:00
Gabriel Wicke	5ec30252f1	More token transform and pipeline setup refactoring to support template expansion better.	2012-01-10 01:09:50 +00:00
Gabriel Wicke	287604c422	A bit of cleanup in ParserPipeline, with better and more consistent support for multiple input types.	2012-01-09 19:33:49 +00:00
Gabriel Wicke	e99d7a2a55	Two batteries worth of token transform manager refactoring. * TokenTransformDispatcher is now renamed to TokenTransformManager, and is also turned into a base class * SyncTokenTransformManager and AsyncTokenTransformManager subclass TokenTransformManager and implement synchronous (phase 1,3) and asynchronous (phase 2) transformation stages. * Communication between stages uses the same chunk / end events as all the other token stages. * The AsyncTokenTransformManager now supports the creation of nested AsyncTokenTransformManagers for template expansion. The AsyncTokenTransformManager object takes on the responsibilities of a preprocessor frame. Transforms are newly created (or potentially resurrected from a cache), so that transforms do not have to worry about concurrency. * The environment is pushed through to all transform managers and the individual transforms.	2012-01-09 17:49:16 +00:00
Gabriel Wicke	6cd95fea37	Fix up constructors in EventEmitter inheritance and tweak a few more comments.	2012-01-04 12:28:41 +00:00
Gabriel Wicke	e3ae9a702b	Fix JSHint warnings (mostly about comment indentation) from r108012.	2012-01-04 11:06:24 +00:00
Gabriel Wicke	4c4a24f0a0	Hook up the DOMPostProcessor using events as well, and rename the subscription methods to tell a story. Also document idea on how to dynamically configure the pipeline depending on event registrations in comment.	2012-01-04 11:00:54 +00:00
