wikimedia/mediawiki-extensions-VisualEditor

mirror of https://gerrit.wikimedia.org/r/mediawiki/extensions/VisualEditor synced 2024-11-15 18:39:52 +00:00

Author	SHA1	Message	Date
Gabriel Wicke	bd98eb4c5a	Land big TokenTransformDispatcher and eventization refactoring. The TokenTransformDispatcher now actually implements an asynchronous, phased token transformation framework as described in https://www.mediawiki.org/wiki/Future/Parser_development/Token_stream_transformations. Additionally, the parser pipeline is now mostly held together using events. The tokenizer still emits a single event containing all tokens, as block-level emission failed with scoping issues specific to the PEGJS parser generator. All stages clean up when receiving the end tokens, so that the full pipeline can be used for repeated parsing. The QuoteTransformer is not yet 100% fixed to work with the new interface, and the Cite extension is disabled for now pending adaptation. Bold-italic related tests are currently failing.	2012-01-03 18:44:31 +00:00
Gabriel Wicke	8e00a72d0a	Improvements to link trail handling, and two tweaks to the whitelist. 182 tests now passing. Link trails depend on language-dependent positive character classes in the PHP parser. These classes all seem to disallow punctuation implicitly and list differing plain text characters instead, so it might be possible to get away with identifying a common class of non-trail punctuation instead. This would help to keep the tokenizer independent of configurations, which is very desirable for caching and simplified external parsing.	2011-12-30 12:47:06 +00:00
Gabriel Wicke	11ece76b7b	Fix suffix handling for wiki links.	2011-12-30 09:35:57 +00:00
Gabriel Wicke	33e60dd4d9	Update comments a bit.	2011-12-22 12:37:24 +00:00
Gabriel Wicke	9ee0e660ec	Fix regression introduced by r107060 for regular table cells. Good to have a test suite ;)	2011-12-22 12:09:25 +00:00
Gabriel Wicke	a94d0ec10c	Re-add support for row-only tables.	2011-12-22 11:58:32 +00:00
Gabriel Wicke	1c7fe0eb34	Refactor table productions to support table fragments in templates (table start / row / table end). The old productions are not deleted yet to make it easy to compare the output on more complex articles. 181 tests passing after adding two table tests with whitespace-only differences to the whitelist.	2011-12-22 11:43:55 +00:00
Gabriel Wicke	2845ba9552	Handle noinclude and includeonly at start of line, so that syntax after them still matches as if it were actually preceded by a newline.	2011-12-21 11:38:50 +00:00
Gabriel Wicke	cc06551f2e	Rename table_header production to table_heading. Those non-natives strike again.	2011-12-16 19:24:59 +00:00
Gabriel Wicke	605ed23fd2	Fix attributes in table headings.	2011-12-16 19:22:13 +00:00
Gabriel Wicke	a04744b2ec	Add some more attribute remapping capabilities to the DOMConverter, and clean up some grammar formatting.	2011-12-15 17:33:07 +00:00
Gabriel Wicke	3585bd9c8e	Accept row-only tables. The parser now eats [[en:Barack Obama]] as-is. Hooray!	2011-12-15 00:39:28 +00:00
Gabriel Wicke	6df94a34a1	Less lust for urls	2011-12-15 00:26:22 +00:00
Gabriel Wicke	ce2ee067f7	Minor tweak to wiki link production	2011-12-15 00:12:58 +00:00
Gabriel Wicke	574abd9774	A collection of small bug fixes to the grammar, Cite, the Token format converter and the HTML DOM -> WikiDom converter. The tokenizer now digests all parserTests.	2011-12-14 23:38:46 +00:00
Gabriel Wicke	dc77d73ad5	Add ability to pass through JSON data to WikiDom in data-json-* attributes, and fix parser to actually parse the Barack Obama article except for one table with nested templates at start of line.	2011-12-14 17:25:09 +00:00
Gabriel Wicke	feee9ded9f	Convert the Cite extension to a token stream transformer. This required a few further additions to the TokenTransformDispatcher. In particular, there is now an 'any' token match whose callbacks are executed before more specific callbacks. This is used by the Cite extension to eat all tokens between ref and /ref tags. This need is very common, so should be broken out to an intermediate layer in the future. In general, the requirements for the TokenTransformDispatcher API are now clearer, and the API should likely be cleaned up / simplified.	2011-12-13 14:48:47 +00:00
Gabriel Wicke	a8fa9433c4	Convert quote handling (italic/bold) to a core extension operating on the token stream. This is the first token transformation exercising the TokenTransformer class as its dispatcher. Template expansions, wiki link formatting, tag sanitation and extensions should be able to use the same dispatcher by registering for specific token types. The parser performance is very slightly improved as the token stream is only traversed once.	2011-12-12 20:53:14 +00:00
Gabriel Wicke	d616f07a79	Don't re-build the wiki tokenizer for each test. This speeds up the full parserTests.js run slightly from 7-8 minutes to about 14 seconds ;) A few very minor tweaks to the grammar are also thrown into this commit.	2011-12-12 10:47:42 +00:00
Gabriel Wicke	c2b69e2486	Clean up newline handling. Emit a NEWLINE token for each non-{comment,pre,nowiki} newline.	2011-12-08 14:34:18 +00:00
Gabriel Wicke	abc2254110	A bit of comment clean-up and wrapping of tree building into try/catch block to actually count failures.	2011-12-08 11:40:59 +00:00
Gabriel Wicke	92fdf99384	Further renaming, this time from pegParser to pegTokenizer.	2011-12-08 10:59:44 +00:00

22 commits
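
Several of the messages above (feee9ded9f, a8fa9433c4) describe a dispatcher where transformers register for specific token types, plus an 'any' match whose callbacks run before the more specific ones, which Cite uses to eat all tokens between ref and /ref tags. The following is only a rough, synchronous sketch of that idea, not the code in this repository: the class name `MiniDispatcher`, the `on`/`transform` methods, the token shapes, and `makeRefCollector` are all hypothetical, and the real TokenTransformDispatcher is described as asynchronous and phased.

```javascript
// Hypothetical sketch only: names and token shapes are assumptions,
// not the actual TokenTransformDispatcher API.
class MiniDispatcher {
  constructor() {
    this.anyHandlers = [];         // run for every token, before specific ones
    this.typeHandlers = new Map(); // token type -> array of handlers
  }

  // Register a handler for one token type, or for 'any' token.
  on(type, handler) {
    if (type === 'any') {
      this.anyHandlers.push(handler);
      return;
    }
    if (!this.typeHandlers.has(type)) {
      this.typeHandlers.set(type, []);
    }
    this.typeHandlers.get(type).push(handler);
  }

  // Run handlers in order; a handler returns a token, or null to drop it.
  static run(handlers, token) {
    let current = token;
    for (const handler of handlers) {
      if (current === null) break;
      current = handler(current);
    }
    return current;
  }

  // Single pass over the token stream: 'any' handlers first, then the
  // handlers registered for the (possibly rewritten) token's type.
  transform(tokens) {
    const out = [];
    for (const token of tokens) {
      let current = MiniDispatcher.run(this.anyHandlers, token);
      if (current !== null) {
        const specific = this.typeHandlers.get(current.type) || [];
        current = MiniDispatcher.run(specific, current);
      }
      if (current !== null) out.push(current);
    }
    return out;
  }
}

// An 'any' handler in the spirit of the Cite description: it swallows
// everything between a ref start tag and its end tag, then emits a marker.
function makeRefCollector(refs) {
  let buffer = null; // non-null while inside ref ... /ref
  return (token) => {
    if (token.type === 'tag' && token.name === 'ref') {
      buffer = [];
      return null;
    }
    if (token.type === 'endtag' && token.name === 'ref' && buffer !== null) {
      refs.push(buffer);
      buffer = null;
      return { type: 'text', value: '[' + refs.length + ']' };
    }
    if (buffer !== null) {
      buffer.push(token);
      return null;
    }
    return token;
  };
}

const refs = [];
const dispatcher = new MiniDispatcher();
dispatcher.on('any', makeRefCollector(refs));
dispatcher.on('text', (t) => ({ type: 'text', value: t.value.toUpperCase() }));

const result = dispatcher.transform([
  { type: 'text', value: 'a' },
  { type: 'tag', name: 'ref' },
  { type: 'text', value: 'note' },
  { type: 'endtag', name: 'ref' },
  { type: 'text', value: 'b' },
]);
console.log(JSON.stringify(result), JSON.stringify(refs));
```

Because the 'any' handler runs first, the text token inside the ref never reaches the 'text' handler in this sketch. That ordering is the only property it is meant to illustrate.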