wikimedia/mediawiki-extensions-VisualEditor

mirror of https://gerrit.wikimedia.org/r/mediawiki/extensions/VisualEditor synced 2024-11-28 16:20:52 +00:00

Author	SHA1	Message	Date
Gabriel Wicke	5ec30252f1	More token transform and pipeline setup refactoring to support template expansion better.	2012-01-10 01:09:50 +00:00
Gabriel Wicke	2e35171fd1	Fix quote handling and tweak the whitelist a bit. 'any' token registrations are now merged with specific registrations by rank. Not yet clear if that is a good idea overall, need to check use cases when implementing template expansion and other functionality. 183 parser tests now passing.	2012-01-04 14:09:05 +00:00
Gabriel Wicke	29362cc53c	Rename ParseThingy to ParserPipeline and fix up broken WikiDom generation and commandline runner.	2012-01-04 08:39:45 +00:00
Gabriel Wicke	bd98eb4c5a	Land big TokenTransformDispatcher and eventization refactoring. The TokenTransformDispatcher now actually implements an asynchronous, phased token transformation framework as described in https://www.mediawiki.org/wiki/Future/Parser_development/Token_stream_transformations. Additionally, the parser pipeline is now mostly held together using events. The tokenizer still emits a lame single event with all tokens, as block-level emission failed with scoping issues specific to the PEGJS parser generator. All stages clean up when receiving the end tokens, so that the full pipeline can be used for repeated parsing. The QuoteTransformer is not yet 100% fixed to work with the new interface, and the Cite extension is disabled for now pending adaptation. Bold-italic related tests are currently failing.	2012-01-03 18:44:31 +00:00
Gabriel Wicke	8e00a72d0a	Improvements to link trail handling, and two tweaks to the whitelist. 182 tests now passing. Link trails depend on language-dependent positive character classes in the PHP parser. These classes all seem to disallow punctuation implicitly and list differing plain text characters instead, so it might be possible to get away with identifying a common class of non-trail punctuation instead. This would help to keep the tokenizer independent of configurations, which is very desirable for caching and simplified external parsing.	2011-12-30 12:47:06 +00:00
Gabriel Wicke	b3a0270d69	Remove env and load grammar in tokenizer constructor. Re-add property hack to keep parserTests running for now. Really need a different pipeline for html serialization or a reference to the HTML DOM.	2011-12-28 17:04:16 +00:00
Neil Kandalgaonkar	8fbf36e63e	put adding of the terminal token inside tokenize method (will pull it out again for streaming interface)	2011-12-28 01:37:15 +00:00
Neil Kandalgaonkar	6103646ec8	remove need to add newline at end of input	2011-12-28 01:37:11 +00:00
Neil Kandalgaonkar	4158f82d7e	refactor parser to ParseThingy in different module, can be invoked with command line utility parse.js	2011-12-28 01:37:06 +00:00
Neil Kandalgaonkar	aedc6751ae	made parseThingy, temp class for refactoring all thingies related to parsing	2011-12-28 01:36:58 +00:00
Neil Kandalgaonkar	5ff2b4d475	make peg src path outside of peg tokenizer	2011-12-28 01:36:50 +00:00
Neil Kandalgaonkar	962d1262fc	create tokenizer without need to modify namespace with PEG source	2011-12-28 01:36:36 +00:00
Gabriel Wicke	1c7fe0eb34	Refactor table productions to support table fragments in templates (table start / row / table end). The old productions are not deleted yet to make it easy to compare the output on more complex articles. 181 tests passing after adding two table tests with whitespace-only differences to the whitelist.	2011-12-22 11:43:55 +00:00
Gabriel Wicke	574abd9774	A collection of small bug fixes to the grammar, Cite, the Token format converter and the HTML DOM -> WikiDom converter. The tokenizer now digests all parserTests.	2011-12-14 23:38:46 +00:00
Gabriel Wicke	dc77d73ad5	Add ability to pass through JSON data to WikiDom in data-json-* attributes, and fix parser to actually parse the Barack Obama article except for one table with nested templates at the start-of-line.	2011-12-14 17:25:09 +00:00
Gabriel Wicke	a09aa4d599	Add rough HTML DOM to WikiDom conversion. You can see serialized WikiDom of parser tests using 'node parserTests.js --wikidom'.	2011-12-14 15:15:41 +00:00
Gabriel Wicke	5f80d30428	Clean up access to document and body after building the tree.	2011-12-14 09:40:49 +00:00
Gabriel Wicke	feee9ded9f	Convert the Cite extension to a token stream transformer. This required a few further additions to the TokenTransformDispatcher. In particular, there is now an 'any' token match whose callbacks are executed before more specific callbacks. This is used by the Cite extension to eat all tokens between ref and /ref tags. This need is very common, so should be broken out to an intermediate layer in the future. In general, the requirements for the TokenTransformDispatcher API are now clearer, and the API should likely be cleaned up / simplified.	2011-12-13 14:48:47 +00:00
Gabriel Wicke	c33f74d227	Follow-up to r106001: Fix typo spotted by Nikerabbit. Good catch!	2011-12-13 13:00:57 +00:00
Gabriel Wicke	8e55e79b67	Rename TokenTransformer to TokenTransformDispatcher.	2011-12-13 11:45:12 +00:00
Gabriel Wicke	815c63ba6c	Disabled es* inclusion for now as the serializers are not currently used, and the recently added references to window are not compatible with node.js.	2011-12-13 11:17:33 +00:00
Gabriel Wicke	dc70687ed0	Update README	2011-12-13 10:03:01 +00:00
Gabriel Wicke	a8fa9433c4	Convert quote handling (italic/bold) to a core extension operating on the token stream. This is the first token transformation exercising the TokenTransformer class as its dispatcher. Template expansions, wiki link formatting, tag sanitation and extensions should be able to use the same dispatcher by registering for specific token types. The parser performance is very slightly improved as the token stream is only traversed once.	2011-12-12 20:53:14 +00:00
Gabriel Wicke	752b0990b2	Refactor parserTests somewhat into a class-like structure, and wire up the TokenTransformer.	2011-12-12 14:03:54 +00:00
Gabriel Wicke	d616f07a79	Don't re-build the wiki tokenizer for each test. This speeds up the full parserTests.js run slightly from 7-8 minutes to about 14 seconds ;) A few very minor tweaks to the grammar are also thrown into this commit.	2011-12-12 10:47:42 +00:00
Gabriel Wicke	abc2254110	A bit of comment clean-up and wrapping of tree building into try/catch block to actually count failures.	2011-12-08 11:40:59 +00:00
Gabriel Wicke	92fdf99384	Further renaming, this time from pegParser to pegTokenizer.	2011-12-08 10:59:44 +00:00
Gabriel Wicke	76bc477038	Rename html5TokenEmitter to HTML5TreeBuilder, and the contained Tokenizer to TreeBuilder.	2011-12-08 10:37:18 +00:00
Gabriel Wicke	1d299f6aa9	Also print out options for failing tests.	2011-12-07 11:45:05 +00:00
Gabriel Wicke	0734fb24c5	Add a few more items to the whitelist	2011-12-07 11:44:38 +00:00
Gabriel Wicke	7e1585d360	Add empty tables to the whitelist (legal in HTML5). Also add one more functionally identical italic/bold/link permutation to the whitelist.	2011-12-06 22:05:43 +00:00
Trevor Parscal	e61e66856c	Fixed issue in transaction processor's insert method - no need for a special case for structural offsets anymore	2011-12-06 22:04:18 +00:00
Trevor Parscal	88f22ec10f	Added test which currently fails because Transaction processor is broken	2011-12-06 21:36:36 +00:00
Gabriel Wicke	1a5ffacc5c	Add slightly different but functionally identical italic/bold/link nesting to whitelist.	2011-12-06 16:45:19 +00:00
Gabriel Wicke	a922d595cf	Really minor: Add a newline after whitelist printout.	2011-12-06 13:16:43 +00:00
Gabriel Wicke	1bd3f8321e	Minor beautification of whitelist entry print-out header.	2011-12-06 12:35:32 +00:00
Gabriel Wicke	228fccd0c1	Strip toc and edit sections from expected html for now.	2011-12-06 11:39:53 +00:00
Antoine Musso	350d1e8978	Use util.inspect to dump tokens. It gives better output than JSON.stringify since inspect nicely indents the object/array dump. Makes it easier to read for humans.	2011-12-06 10:23:58 +00:00
Gabriel Wicke	33e19f7275	Recognize block-level elements independent of case; Ignore toc and section edit links in tests. 148 parser tests passing.	2011-12-05 20:03:24 +00:00
Trevor Parscal	07af0cab63	* Moved getContent and getText from leaf nodes to document model nodes * Renamed getContent to getContentData * Renamed getText to getContentText * Added getElementData	2011-12-05 19:41:04 +00:00
Gabriel Wicke	a6867d76c5	Ignore missing redlink for now, we are concerned with the parser and not a complete wiki at this stage.	2011-12-05 17:07:06 +00:00
Gabriel Wicke	1760210d13	Fixes to tables, headings and misc smaller stuff. Tracked down an issue caused by improper caching of production results, which interfered with the flag-dependent inline_break production.	2011-12-04 19:23:24 +00:00
Antoine Musso	7ead617a2e	--cache to save the parsed test cases. This is optional but speeds up launch time when other files are not modified.	2011-12-01 17:51:07 +00:00
Antoine Musso	c21a81ee45	warn on invalid regex passed to --filter	2011-12-01 15:45:40 +00:00
Gabriel Wicke	63c728924b	Use pegjs from npm	2011-12-01 15:23:23 +00:00
Gabriel Wicke	d00743ad79	Improve external links and definition lists, now 133 tests passing ;) Also add printwhitelist option to test runner, provides js code copy/pastable to whitelist.	2011-12-01 14:25:59 +00:00
Antoine Musso	cb682c5ade	option to disable color output (use --no-color )	2011-12-01 12:30:15 +00:00
Gabriel Wicke	5d50c6bbf3	Follow-up to r104845: s/args/argv	2011-12-01 12:10:43 +00:00
Gabriel Wicke	edf40c616c	Make whitelist usage an option; tweak comment a bit	2011-12-01 11:47:22 +00:00
Gabriel Wicke	5f72acec8f	Add option to disable whitelist	2011-12-01 11:08:05 +00:00

132 commits