wikimedia/mediawiki-extensions-VisualEditor

mirror of https://gerrit.wikimedia.org/r/mediawiki/extensions/VisualEditor synced 2024-11-24 22:35:41 +00:00

Author	SHA1	Message	Date
Gabriel Wicke	3be4992782	'Obama finally expands' ;) Misc fixes and documentation updates * [[:en:Barack Obama]] can now be expanded in 77 seconds using 330MB RAM, while it would previously run out of RAM after ~30 minutes. Wohoooo! The token transform framework rework really paid off. * 303 parser tests are passing in the new record time of 5.5 seconds. Two more tests are passing since these tests expect the day of the week to be Thursday. Won't be the case tomorrow. Change-Id: I56e850838476b546df10c6a239c8c9e29a1a3136	2012-04-26 18:18:08 +02:00
Gabriel Wicke	8ff810659a	Rename text/wiki and tokens/wiki to text/x-mediawiki and similar Change-Id: I70113629f4633685cd6db3914303a15e4c79a50a	2012-04-25 20:19:43 +02:00
Gabriel Wicke	8368e17d6a	Biggish token transform system refactoring * All parser pipelines including tokenizer and DOM stuff are now constructed from a 'recipe' data structure in a ParserPipelineFactory. * All sub-pipelines of these can now be cached * Event registrations to a pipeline are directly forwarded to the last pipeline member to save relatively expensive event forwarding. * Some APIs for on-demand expansion / format conversion of parameters from parser functions are added: param.to('tokens/expanded', cb) param.to('text/wiki', cb) (this does not work yet) All parameters are additionally wrapped into a Param object that provides methods for positional parameter naming (.named()) or conversion to a dict (.dict()). * The async token transform manager is now separated from a frame object, with the frame holding arguments, an on-demand expansion method and loop checks. * Only keys of template parameters are now expanded. Parser functions or template arguments trigger an expansion on-demand. This (unsurprisingly) makes a big performance difference with typical switch-heavy template systems. * Return values from async transforms are no longer used in favor of plain callbacks. This saves the complication of having to maintain two code paths. A trick in transformTokens still avoids the construction of unneeded TokenAccumulators. * The results of template expansions are no longer buffered. * 301 parser tests are passing Known issues: * Cosmetic cleanup remains to do * Some parser functions do not support async expansions yet, and need to be modified. Change-Id: I1a7690baffbe8141cadf67270904a1b2e1df879a	2012-04-25 16:51:36 +02:00
Adam Wight	b234edba88	As much as I have loved writing Makefiles... I've replaced its functionality with package.json, mostly so we can avoid non-node dependencies. This is one of the recommended practices. We should consider moving tests/parser into modules/parser/tests, other node projects keep all module code in one directory. Explained in the README how to use npm to load the dependencies and run tests. Too bad about NODE_PATH... Don't try to find parserTests.txt in assorted places--if it isn't present, fetch from gerrit. You can symlink from core if you're developing on both parsers, and the fetch script will not overwrite. Use __dirname in parserTests.js to allow the script to run independent of current working directory. Change-Id: I4c8b884e91f4fdeae385c7697aff768bdd199dd5	2012-04-04 11:02:58 -07:00
Gabriel Wicke	f662690d02	Shorten data-mw-rt to data-mw and clean up whitelist Instead of a proliferation of data-mw-* attributes, it should be easier to stash all private / non-semantic round-trip information in a JSON object stored in data-mw. Change-Id: Id200a6a8789fa152f29ea530e5a24b6ee7b4b285	2012-04-02 18:12:49 +02:00
Gabriel Wicke	5ef3438ee5	Change path to parserTests from phase3 to core after switch to git. Change-Id: Ie13f678eaa81447e98db5c8c394ab103caad8454	2012-04-02 17:10:06 +02:00
Gabriel Wicke	af03eb4f29	Improve generic attribute expansion before external link processing, and make wgUploadPath configurable. Also change the hard-coded fall-back image sizes to sensible defaults. This breaks three parser tests until image size retrieval from the wiki is implemented.	2012-03-06 18:02:35 +00:00
Gabriel Wicke	7b0c807710	Change wikilink tokenization strategy to split on pipes. This makes it possible to support template / template argument expansion in image options, and causes little trouble for wikilinks. Non-image wikilinks with multiple text pipes are quite rare in the dumps, and concatenating description tokens with a plain '\|' is quite easy. 261 parser tests passing.	2012-03-05 12:00:38 +00:00
Gabriel Wicke	7daeb34d4d	Implement onlyinclude transformer. 254 tests passing.	2012-02-28 13:21:01 +00:00
Gabriel Wicke	2e41b19af8	Green two more parser tests by implementing some parser functions.	2012-02-22 16:39:50 +00:00
Gabriel Wicke	3568dfee14	Add some support for functionhooks in test parser and parserTests.js, and tweak a few parser functions.	2012-02-22 15:59:11 +00:00
au	0360e62da7	* Locally apply the HTML5.Marker.type patch. This is needed until https://github.com/aredridel/html5/issues/44 is merged into the upstream "html5" module.	2012-02-18 17:28:35 +00:00
Gabriel Wicke	025f9cddb3	Prefix all internal data- attributes with data-mw- and adjust the whitelist and test output normalization accordingly. 235 tests passing.	2012-02-13 13:54:07 +00:00
Gabriel Wicke	a122e51eec	Move data-* annotations into separate object on tokens, that is then serialized into a single data-mw-rt attribute if present. Update parserTests to ignore this attribute for comparisons with expected parser output. A few more tweaks and notes are thrown into this commit too. 233 tests are passing now.	2012-02-11 16:43:25 +00:00
Gabriel Wicke	1f6db903e9	Pluck a few low-hanging fruit in external link tokenization, and add a simple localurl parser function implementation. 230 parser tests now passing.	2012-02-07 10:28:23 +00:00
Gabriel Wicke	d321d96bab	Fix parserTests summary with filtering enabled	2012-02-07 09:27:47 +00:00
Gabriel Wicke	a5b7ea7bcd	Add --debug and --trace options to parserTests as well.	2012-02-01 17:02:37 +00:00
Gabriel Wicke	7cd94df47d	A few minor tweaks to reduce memory usage	2012-01-27 13:32:44 +00:00
Gabriel Wicke	348cac6cf0	Fix a bug in TokenCollector, and misc tweaks for template expansions.	2012-01-20 18:47:17 +00:00
Gabriel Wicke	2233d0a488	Eventify parser tests and parse.js commandline wrapper to actually allow async template fetching. Async expansion is not yet fully debugged, but at least the preconditions for that are now there.	2012-01-18 23:46:01 +00:00
Gabriel Wicke	f4081bef08	First template expansion tests start working, and a bug fix in DOMPostProcessor paragraph wrapper. 187 parser tests now passing.	2012-01-14 00:58:20 +00:00
Gabriel Wicke	5ec30252f1	More token transform and pipeline setup refactoring to support template expansion better.	2012-01-10 01:09:50 +00:00
Gabriel Wicke	29362cc53c	Rename ParseThingy to ParserPipeline and fix up broken WikiDom generation and commandline runner.	2012-01-04 08:39:45 +00:00
Gabriel Wicke	bd98eb4c5a	Land big TokenTransformDispatcher and eventization refactoring. The TokenTransformDispatcher now actually implements an asynchronous, phased token transformation framework as described in https://www.mediawiki.org/wiki/Future/Parser_development/Token_stream_transformations. Additionally, the parser pipeline is now mostly held together using events. The tokenizer still emits a lame single event with all tokens, as block-level emission failed with scoping issues specific to the PEGJS parser generator. All stages clean up when receiving the end tokens, so that the full pipeline can be used for repeated parsing. The QuoteTransformer is not yet 100% fixed to work with the new interface, and the Cite extension is disabled for now pending adaptation. Bold-italic related tests are failing currently.	2012-01-03 18:44:31 +00:00
Gabriel Wicke	b3a0270d69	Remove env and load grammar in tokenizer constructor. Re-add property hack to keep parserTests running for now. Really need a different pipeline for html serialization or a reference to the HTML DOM.	2011-12-28 17:04:16 +00:00
Neil Kandalgaonkar	8fbf36e63e	add terminal token inside tokenize method (will pull it out again for streaming interface)	2011-12-28 01:37:15 +00:00
Neil Kandalgaonkar	6103646ec8	remove need to add newline at end of input	2011-12-28 01:37:11 +00:00
Neil Kandalgaonkar	4158f82d7e	refactor parser to ParseThingy in different module, can be invoked with command line utility parse.js	2011-12-28 01:37:06 +00:00
Neil Kandalgaonkar	aedc6751ae	made parseThingy, temp class for refactoring all thingies related to parsing	2011-12-28 01:36:58 +00:00
Neil Kandalgaonkar	5ff2b4d475	make peg src path outside of peg tokenizer	2011-12-28 01:36:50 +00:00
Neil Kandalgaonkar	962d1262fc	create tokenizer without need to modify namespace with PEG source	2011-12-28 01:36:36 +00:00
Gabriel Wicke	1c7fe0eb34	Refactor table productions to support table fragments in templates (table start / row / table end). The old productions are not deleted yet to make it easy to compare the output on more complex articles. 181 tests passing after adding two table tests with whitespace-only differences to the whitelist.	2011-12-22 11:43:55 +00:00
Gabriel Wicke	574abd9774	A collection of small bug fixes to the grammar, Cite, the Token format converter and the HTML DOM -> WikiDom converter. The tokenizer now digests all parserTests.	2011-12-14 23:38:46 +00:00
Gabriel Wicke	dc77d73ad5	Add ability to pass through JSON data to WikiDom in data-json-* attributes, and fix parser to actually parse the Barack Obama article except for one table with nested templates at the start-of-line.	2011-12-14 17:25:09 +00:00
Gabriel Wicke	a09aa4d599	Add rough HTML DOM to WikiDom conversion. You can see serialized WikiDom of parser tests using 'node parserTests.js --wikidom'.	2011-12-14 15:15:41 +00:00
Gabriel Wicke	5f80d30428	Clean up access to document and body after building the tree.	2011-12-14 09:40:49 +00:00
Gabriel Wicke	feee9ded9f	Convert the Cite extension to a token stream transformer. This required a few further additions to the TokenTransformDispatcher. In particular, there is now an 'any' token match whose callbacks are executed before more specific callbacks. This is used by the Cite extension to eat all tokens between ref and /ref tags. This need is very common, so should be broken out to an intermediate layer in the future. In general, the requirements for the TokenTransformDispatcher API are now clearer, and the API should likely be cleaned up / simplified.	2011-12-13 14:48:47 +00:00
Gabriel Wicke	8e55e79b67	Rename TokenTransformer to TokenTransformDispatcher.	2011-12-13 11:45:12 +00:00
Gabriel Wicke	815c63ba6c	Disabled es* inclusion for now as the serializers are not currently used, and the recent addition of references to window is not compatible with node.js.	2011-12-13 11:17:33 +00:00
Gabriel Wicke	a8fa9433c4	Convert quote handling (italic/bold) to a core extension operating on the token stream. This is the first token transformation exercising the TokenTransformer class as its dispatcher. Template expansions, wiki link formatting, tag sanitation and extensions should be able to use the same dispatcher by registering for specific token types. The parser performance is very slightly improved as the token stream is only traversed once.	2011-12-12 20:53:14 +00:00
Gabriel Wicke	752b0990b2	Refactor parserTests somewhat into a class-like structure, and wire up the TokenTransformer.	2011-12-12 14:03:54 +00:00
Gabriel Wicke	d616f07a79	Don't re-build the wiki tokenizer for each test. This speeds up the full parserTests.js run slightly from 7-8 minutes to about 14 seconds ;) A few very minor tweaks to the grammar are also thrown into this commit.	2011-12-12 10:47:42 +00:00
Gabriel Wicke	abc2254110	A bit of comment clean-up and wrapping of tree building into try/catch block to actually count failures.	2011-12-08 11:40:59 +00:00
Gabriel Wicke	92fdf99384	Further renaming, this time from pegParser to pegTokenizer.	2011-12-08 10:59:44 +00:00
Gabriel Wicke	76bc477038	Rename html5TokenEmitter to HTML5TreeBuilder, and the contained Tokenizer to TreeBuilder.	2011-12-08 10:37:18 +00:00
Gabriel Wicke	1d299f6aa9	Also print out options for failing tests.	2011-12-07 11:45:05 +00:00
Gabriel Wicke	a922d595cf	Really minor: Add a newline after whitelist printout.	2011-12-06 13:16:43 +00:00
Gabriel Wicke	1bd3f8321e	Minor beautification of whitelist entry print-out header.	2011-12-06 12:35:32 +00:00
Gabriel Wicke	228fccd0c1	Strip toc and edit sections from expected html for now.	2011-12-06 11:39:53 +00:00
Antoine Musso	350d1e8978	Use util.inspect to dump tokens. It gives better output than JSON.stringify since inspect nicely indents the object/array dump. Makes it easier to read for humans.	2011-12-06 10:23:58 +00:00