wikimedia/mediawiki-extensions-VisualEditor - fanwikis.org Git Server

mirror of https://gerrit.wikimedia.org/r/mediawiki/extensions/VisualEditor synced 2024-11-15 18:39:52 +00:00

Author	SHA1	Message	Date
Gabriel Wicke	d918fa18ac	Big token transform framework overhaul part 2 * Tokens are now immutable. The progress of transformations is tracked on chunks instead of tokens. Tokenizer output is cached and can be directly returned without a need for cloning. Transforms are required to clone or newly create tokens they are modifying. * Expansions per chunk are now shared between equivalent frames via a cache stored on the chunk itself. Equivalence of frames is not yet ideal though, as right now a hash tree of unexpanded arguments is used. This should be switched to a hash of the fully expanded local parameters instead. * There is now a vastly improved maybeSyncReturn wrapper for async transforms that either forwards processing to the iterative transformTokens if the current transform is still ongoing, or manages a recursive transformation if needed. * Parameters for parser functions are now wrapped in abstract Params and ParserValue objects, which support some handy on-demand value expansions. Keys are always expanded. Parser functions are converted to use these interfaces, and now properly expand their values in the correct frame. Making this expansion lazier is certainly possible, but would complicate transformTokens and other token-handling machinery. Need to investigate if it would really be worth it. Dead branch elimination is certainly a bigger win overall. * Complex recursive asynchronous expansions should now be closer to correct for both the iterative (transformTokens) and recursive (maybeSyncReturn after transformTokens has returned) code paths. * Performance degraded slightly. There are no micro-optimizations done yet and the shared expansion cache still has a low hit rate. The progress tracking on chunks is not yet perfect, so there are likely a lot of unneeded re-expansions that can be easily eliminated. There is also more debug tracing right now. Obama currently expands in 54 seconds on my laptop. 
Change-Id: I4a603f3d3c70ca657ebda9fbb8570269f943d6b6	2012-05-15 17:05:47 +02:00
Catrope	c256ea7d71	Fix fatal error in parse.js Trying something trivial like echo 'Hello world' | node parse.js would throw TypeError: Function.prototype.apply: Arguments list has wrong type Change-Id: Ia0a1154b0f3edbfb1f228a1d2072fced1b147141	2012-05-10 12:04:57 -07:00
Gabriel Wicke	6e21f6bb27	Forward-port Cite extension * Adapted Cite extension to use current interfaces and token formats * Improved TokenCollector Change-Id: I20419b19edd9bbad2c2abf17a2ff1411b99c0c04	2012-05-03 13:22:01 +02:00
Gabriel Wicke	2291fe8364	Reduce the need for token cloning slightly Change-Id: I31c71bddca4855afdffc3fe5c8d759cfa1994d86	2012-04-27 23:12:25 +02:00
Gabriel Wicke	5fb2c46073	Clone cached tokens, and fix switch for empty needle Change-Id: I63946e5a56f6fd7dd30d00b12d36032dd1dd0017	2012-04-27 15:59:01 +02:00
Gabriel Wicke	ed8cb54831	Simplify transformToken slightly, and fix JSHint warnings Change-Id: I95769ed063ea855a9109148f5db83ea43f423e56	2012-04-27 15:31:30 +02:00
Gabriel Wicke	2d7b4a2a59	Make .to more consistent and add optional parentCB arg * parentCB (if set) is called with { async: true } if expansion is going to be asynchronous. * Strings are handled efficiently * all value parameter chunks can now be converted using .to(). Change-Id: Ib013e1bc3d8e7f692009038209db6a056887326e	2012-04-27 13:57:23 +02:00
Gabriel Wicke	fd1a67aa16	Add .to('text/plain/expanded', cb) support and convert ifeq to use it Change-Id: I99c78de12fed41ba36811402f7ecacb420391d70	2012-04-27 12:18:30 +02:00
Gabriel Wicke	3be4992782	'Obama finally expands' ;) Misc fixes and documentation updates * [[:en:Barack Obama]] can now be expanded in 77 seconds using 330MB RAM, while it would previously run out of RAM after ~30 minutes. Wohoooo! The token transform framework rework really paid off. * 303 parser tests are passing in the new record time of 5.5 seconds. Two more tests are passing since these tests expect the day of the week to be Thursday. Won't be the case tomorrow. Change-Id: I56e850838476b546df10c6a239c8c9e29a1a3136	2012-04-26 18:18:08 +02:00
Gabriel Wicke	8ff810659a	Rename text/wiki and tokens/wiki to text/x-mediawiki and similar Change-Id: I70113629f4633685cd6db3914303a15e4c79a50a	2012-04-25 20:19:43 +02:00
Gabriel Wicke	8368e17d6a	Biggish token transform system refactoring * All parser pipelines including tokenizer and DOM stuff are now constructed from a 'recipe' data structure in a ParserPipelineFactory. * All sub-pipelines of these can now be cached * Event registrations to a pipeline are directly forwarded to the last pipeline member to save relatively expensive event forwarding. * Some APIs for on-demand expansion / format conversion of parameters from parser functions are added: param.to('tokens/expanded', cb) param.to('text/wiki', cb) (this does not work yet) All parameters are additionally wrapped into a Param object that provides methods for positional parameter naming (.named()) or conversion to a dict (.dict()). * The async token transform manager is now separated from a frame object, with the frame holding arguments, an on-demand expansion method and loop checks. * Only keys of template parameters are now expanded. Parser functions or template arguments trigger an expansion on-demand. This (unsurprisingly) makes a big performance difference with typical switch-heavy template systems. * Return values from async transforms are no longer used in favor of plain callbacks. This saves the complication of having to maintain two code paths. A trick in transformTokens still avoids the construction of unneeded TokenAccumulators. * The results of template expansions are no longer buffered. * 301 parser tests are passing Known issues: * Cosmetic cleanup remains to do * Some parser functions do not support async expansions yet, and need to be modified. Change-Id: I1a7690baffbe8141cadf67270904a1b2e1df879a	2012-04-25 16:51:36 +02:00
Gabriel Wicke	e2ca8c24c7	Delay some token duplication until actual mutation happens This is a bit better than cloning tokens wholesale, but not by much. There is a lot of potential for much better per-token caching with reduced token cloning. Need to map out all dependencies besides token attributes expanded from template parameters or other scoped state. Even if tokens themselves don't need transformation, they might still need to be considered for other token transformers, so simply keeping the final rank won't quite work even if the token itself is fully transformed. As a minimum, a shallow clone would need to be made and the rank reset (as in env.cloneTokens). Change-Id: I4329113bb21750bae9a635229ed1b08da75dc614	2012-04-18 17:53:04 +02:00
Gabriel Wicke	bf84638bc0	Add tokenizer cache and clone token state on mutation * Added an LRU cache (using the lru-cache node module) for tokenizer output * Mutation of nested attributes now replaces the containers. A shallow copy of tokens is sufficient to isolate token transformations. Need to investigate if we can actually get away without isolation and re-transformation for most ordinary tokens. Change-Id: I9136b1d7a1fbcc538183a319d4ecaa290d616fdf	2012-04-18 14:40:47 +02:00
Gabriel Wicke	c688b039de	Collected tweaks * less verbose logging in noinclude processing and template expansion * Give priority to the processing of templates transcluded from transclusions to get closer to depth-first processing. This serves to minimize memory usage from queued-up tokens. * Increase the maximum outstanding requests per template retrieval. 10000 amazingly proved too low a limit on some big pages. * Only process a single template request callback at a time for now * Add a debug print in the treebuilder wrapper * Don't treat multiple comments on a single line as a single comment to match the PHP parser's behavior Change-Id: I9a86b6d7bec3b9e1f17415daf1bf74170240721a	2012-04-16 15:47:03 +02:00
Gabriel Wicke	5bb2d96869	Token stream transform improvements * add fast paths for empty arguments etc * cache attribute token transform pipelines * fix bugs in TokenCollector and NoIncludeOnly handler, and improve its efficiency by only registering for 'end' tokens on demand * Remove empty reset methods from a few handlers * Add a simple 'ap' debug print function that makes it easy to only print some debug prints by temporarily changing 'dp' to 'ap' * Improvements and bug fixes in AttributeExpander Change-Id: Ie69729c8f62d48bba922712e44ebce484c621c50	2012-04-12 15:42:09 +02:00
Gabriel Wicke	9ae572cca0	Fixes to template expansion / token transform managers, 296 tests passing. * Convert isNoInclude logic to positive isInclude throughout and set it properly on attribute pipelines. Also don't cache non-include pipelines. * Add a --pagename parameter to parse.js, which sets the page name in the environment. This is then returned by {{PAGENAME}}. Not the final solution, but useful for taxobox testing as taxons are selected based on PAGENAME. * Add rudimentary pagenamebase parser function Change-Id: If9c0be4c255200d0f2a30f02e5619437b4fd8f12	2012-04-11 16:34:27 +02:00
Roan Kattouw	29f416937e	Fix some usages of splice.apply in the data model to use ve.batchedSplice(). Added FIXME comments for occurrences outside of DM	2012-03-10 00:31:28 +00:00
Gabriel Wicke	f02ff95aa3	Token representation clean-up. Now all tokens are differentiated using constructors instead of type attributes.	2012-03-07 20:06:54 +00:00
Gabriel Wicke	f157093a41	Delegate responsibility for resetting the token rank to transforms, if full re-processing in a phase is wanted. By default, after a token type change or the return of multiple tokens only the remaining transforms with higher ranks are applied. Updated a few comments as well.	2012-03-07 19:29:53 +00:00
Gabriel Wicke	e5a1116817	Start re-transformation as soon as possible in TokenAccumulator._returnTokens to maximize IO concurrency. Signal that all tokens are fully transformed to callbacks called from TokenAccumulator._returnTokens. The result should be a single re-transformation when entering the callback chain, and only if the transform does not signal that it took care of full transformation itself. Template expansion would set this flag, as the nested transform pipeline processes all tokens to the end of phase async12.	2012-03-07 16:29:06 +00:00
Gabriel Wicke	656524dbbc	Fixes for multi-transformer expansion in AsyncTransformManager. Added argument to callback which lets transforms indicate if their returned tokens are fully processed for their phase. If not, the callback re-processes them so that any remaining transforms are applied.	2012-03-07 15:39:18 +00:00
Gabriel Wicke	af03eb4f29	Improve generic attribute expansion before external link processing, and make wgUploadPath configurable. Also change the hard-coded fall-back image sizes to sensible defaults. This breaks three parser tests until image size retrieval from the wiki is implemented.	2012-03-06 18:02:35 +00:00
Gabriel Wicke	7f7202e89c	A few improvements to external link and image handling. 264 tests passing.	2012-03-05 15:34:27 +00:00
Gabriel Wicke	b8bb503199	Actually commit onlyinclude, as already announced in r112592.	2012-02-28 13:24:35 +00:00
Gabriel Wicke	d7da324272	Basic fall-through support for #switch parser function	2012-02-22 14:57:50 +00:00
Gabriel Wicke	491ad5ffef	Cleanup and commenting.	2012-02-22 13:13:18 +00:00
Gabriel Wicke	8dde1f77b4	Reduce debug print overhead, roughly a 10% speed-up on parserTests.	2012-02-21 18:49:43 +00:00
Gabriel Wicke	ffec77273a	Comment and minor code tweaks.	2012-02-21 11:24:20 +00:00
Gabriel Wicke	5806705733	Push transformer setup a bit further into the attribute pipeline.	2012-02-20 12:56:00 +00:00
Gabriel Wicke	001194b140	Replace console.log with console.warn in all debug statements	2012-02-14 20:56:14 +00:00
Gabriel Wicke	0b8d1b0387	* Add custom toString methods for tokens to aid debugging * Convert all attributes into strings in Sanitizer * Use strict comparison against empty string in tokenizer * Add very simple sitename parserfunction * 138 tests passing	2012-02-13 17:02:23 +00:00
Gabriel Wicke	6e33255503	Improve support for preprocessor functionality in attributes; Support multi-line xmlish tags with preprocessor stuff in attributes.	2012-02-09 16:36:29 +00:00
Gabriel Wicke	6983481561	Move attribute expansion back to separate handler, as this makes it easier to only expand used branches selected by parser functions. Template (and -argument) expansion is simply registered before general expansion. Additionally, a few more simple time-based magic words are added in ParserFunctions.	2012-02-09 13:44:20 +00:00
Gabriel Wicke	3f7c1499cd	Enable support for general preprocessor functionality in attribute keys and values. This includes comments, templates and template arguments. This also replaces the specialized expansion logic in the TemplateHandler. The removal of link validation lets one more parser test fail for now. External link target validation will need to be implemented in the token stream handler for links. This is noted as TODO in https://www.mediawiki.org/wiki/Future/Parser_development#Token_stream_transforms.	2012-02-08 15:10:30 +00:00
Gabriel Wicke	b4892102a4	Clean up transform callback interface	2012-02-07 11:53:29 +00:00
Gabriel Wicke	8c75aa1a7a	Remove type attribute for tag tokens.	2012-02-01 18:37:48 +00:00
Gabriel Wicke	689f697a93	Push token format conversion a bit further along, and add defines that were missing in last commit.	2012-02-01 17:03:08 +00:00
Gabriel Wicke	a5cc10a06b	Change token format to plain strings for text tokens, and specific objects for other tokens. This is only the first half of the conversion. The next step is to drop the type attribute on most tokens and match on the constructor in the token transform machinery.	2012-02-01 16:30:43 +00:00
Gabriel Wicke	14a8a13678	A few more debug helpers including a --trace mode for light debugging. Some improvements to parser functions on the way to support the cite extensions. Preparation for generic template and template arg in attribute support. 222 parser tests now passing.	2012-01-31 16:50:16 +00:00
Gabriel Wicke	4e6a54560a	* Emit token chunks for top-level block elements by patching the source of the tokenizer * Fix a bug uncovered by this * Increase the number of outstanding listeners on a single download to 10000	2012-01-22 23:21:53 +00:00
Gabriel Wicke	7ea4d7d3db	A few parser function fixes and maximum template expansion in environment config.	2012-01-22 19:32:28 +00:00
Gabriel Wicke	561cf3c237	Bug fixes and a first stab at a #time parser function. You can expand the main page like this: cd extensions/VisualEditor/modules/parser echo '{{:Main Page}}' | node parse.js echo '{{:Main Page}}' | node parse.js --html echo '{{:Main Page}}' | node parse.js --debug Even the date-based includes work somewhat, although they don't yet accept passed-in dates.	2012-01-22 07:07:16 +00:00
Gabriel Wicke	60e45bb739	A bit of template expansion bug fixing and parser function documentation	2012-01-22 01:27:22 +00:00
Gabriel Wicke	785a4af76f	Implement a few parser functions. 220 parser tests now passing.	2012-01-21 20:38:13 +00:00
Gabriel Wicke	1a6546fbca	Support empty template arguments and default values in arg expansion	2012-01-21 03:03:33 +00:00
Gabriel Wicke	145df2655c	* NoInclude and IncludeOnly improvements * Tokenizer support for templates and template args in template arguments and titles * Async attribute expansion fixes	2012-01-20 22:02:23 +00:00
Gabriel Wicke	fc2088bb21	Add some rudimentary noinclude / includeonly support and fix up TokenCollector.	2012-01-20 01:46:16 +00:00
Gabriel Wicke	d0ece16c86	Fix async template expansion, so we can now render simple pages with templates directly to WikiDom from enwiki using a commandline like this: echo '{{User:GWicke/Test}}' | node parse.js Wohoo! Complex pages with templates won't render properly yet, as noinclude / includeonly and parser functions are not yet implemented. As a result, the parser will run out of memory or hit the currently low expansion depth limit as it tries to expand documentation for all templates.	2012-01-19 23:43:39 +00:00
Gabriel Wicke	f50fecf1e3	Fix template argument expansion. 200 parser tests now passing.	2012-01-17 22:29:26 +00:00
Gabriel Wicke	f4081bef08	First template expansion tests start working, and a bug fix in DOMPostProcessor paragraph wrapper. 187 parser tests now passing.	2012-01-14 00:58:20 +00:00
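
Commit d7da324272 above mentions basic fall-through support for the #switch parser function. As a rough illustration of what fall-through means there (a case listed without a value takes the value of the next case that has one), here is a minimal sketch. It is not taken from the parser sources: the function name `switchFallThrough` and the `{ key, value }` case format are hypothetical, and `#default` handling is deliberately left out.

```javascript
// Hypothetical helper, not the VisualEditor parser's implementation.
// `cases` is an ordered array of { key, value } entries. An entry with
// an undefined value stands for a case written without "=", which falls
// through to the next entry that does carry a value.
function switchFallThrough(needle, cases, defaultValue = '') {
    let matched = false;
    for (const { key, value } of cases) {
        if (key === needle) {
            matched = true;
        }
        // Once a key has matched, return the first value we encounter.
        if (matched && value !== undefined) {
            return value;
        }
    }
    // No key matched, or the match had no following value.
    return defaultValue;
}

// Roughly corresponds to {{#switch: a | a | b = first | c = second}}
const cases = [
    { key: 'a' },
    { key: 'b', value: 'first' },
    { key: 'c', value: 'second' }
];
console.log(switchFallThrough('a', cases));
```

Here the needle `a` matches a case without a value, so the lookup continues to `b` and yields its value. An unmatched needle yields the supplied default.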