serialized into a single data-mw-rt attribute if present. Update parserTests
to ignore this attribute for comparisons with expected parser output.
A few more tweaks and notes are thrown into this commit too. 233 tests are
passing now.
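A minimal sketch of the comparison-side stripping, assuming the attribute
appears verbatim in the serialized HTML (the helper name is made up):

    // Strip the round-trip attribute from serialized HTML before
    // comparing against the expected parser output. A regex is crude
    // but sufficient for test comparisons; a DOM-based removal would
    // be more robust.
    function stripRtAttributes(html) {
        return html.replace(/ data-mw-rt="[^"]*"/g, '');
    }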
are now merged with specific registrations by rank. Not yet clear if that is a
good idea overall; need to check use cases when implementing template expansion
and other functionality.
183 parser tests now passing.
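A rough sketch of the merge, assuming per-type registration lists and numeric
ranks (the data layout is illustrative only):

    // Merge generic 'any' transforms with those registered for a
    // specific token type, ordered by ascending rank so lower-ranked
    // transforms run first.
    function getTransforms(registry, tokenType) {
        var matches = (registry.any || []).concat(registry[tokenType] || []);
        matches.sort(function (a, b) { return a.rank - b.rank; });
        return matches;
    }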
The TokenTransformDispatcher now actually implements an asynchronous, phased
token transformation framework as described in
https://www.mediawiki.org/wiki/Future/Parser_development/Token_stream_transformations.
Additionally, the parser pipeline is now mostly held together using events.
The tokenizer still emits a single lame event with all tokens, as block-level
emission failed with scoping issues specific to the PEG.js parser generator.
All stages clean up when receiving the end tokens, so that the full pipeline
can be used for repeated parsing.
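A sketch of the event wiring between stages, using node's EventEmitter; the
stage class and event names here are illustrative, not the actual
TokenTransformDispatcher interface:

    var util = require('util'),
        events = require('events');

    // Each stage forwards transformed chunks and propagates 'end' so
    // that per-parse state can be reset and the pipeline reused.
    function TransformStage(transform) {
        events.EventEmitter.call(this);
        this.transform = transform;
    }
    util.inherits(TransformStage, events.EventEmitter);

    TransformStage.prototype.listenOn = function (emitter) {
        var self = this;
        emitter.on('chunk', function (tokens) {
            self.emit('chunk', self.transform(tokens));
        });
        emitter.on('end', function () {
            self.emit('end');   // clean up here, then notify downstream
        });
    };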
The QuoteTransformer is not yet 100% fixed to work with the new interface, and
the Cite extension is disabled for now pending adaptation. Bold/italic-related
tests are currently failing.
tests now passing.
Link trails depend on language-dependent positive character classes in the PHP
parser. These classes all seem to disallow punctuation implicitly and list
differing plain text characters instead, so it might be possible to get away
with identifying a common class of non-trail punctuation instead. This would
help to keep the tokenizer independent of configurations, which is very
desirable for caching and simplified external parsing.
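A sketch of what such a common class could look like; the exact punctuation
set below is a guess and would need checking against each language's
linktrail setting:

    // Match a link trail as a run of characters up to the first
    // punctuation or whitespace character, instead of maintaining a
    // positive character class per language.
    var trailRe = /^[^\s.,;:!?'"()\[\]{}<>\/|\\=\-]+/;
    function matchLinkTrail(text) {
        var m = text.match(trailRe);
        return m ? m[0] : '';
    }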
start / row / table end). The old productions are not deleted yet to make it
easy to compare the output on more complex articles. 181 tests passing after
adding two table tests with whitespace-only differences to the whitelist.
This required a few further additions to the TokenTransformDispatcher. In
particular, there is now an 'any' token match whose callbacks are executed
before more specific callbacks. This is used by the Cite extension to eat all
tokens between ref and /ref tags. This need is very common, so it should be
broken out into an intermediate layer in the future.
In general, the requirements for the TokenTransformDispatcher API are now
clearer, and the API should likely be cleaned up / simplified.
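An illustrative sketch of that pattern; the addTransform/removeTransform
signatures and the return convention are guesses at the evolving dispatcher
API, not the actual interface:

    function CiteHandler(dispatcher, rank) {
        this.dispatcher = dispatcher;
        this.rank = rank;
        this.tokens = [];
        this.onAnyBound = this.onAny.bind(this);
        dispatcher.addTransform(this.onRef.bind(this), rank, 'tag', 'ref');
    }

    CiteHandler.prototype.onRef = function (token) {
        // The 'any' callback runs before more specific callbacks, so it
        // can eat every token up to the closing /ref tag.
        this.dispatcher.addTransform(this.onAnyBound, this.rank, 'any');
        return {};   // swallow the ref start tag itself
    };

    CiteHandler.prototype.onAny = function (token) {
        if (token.type === 'endtag' && token.name === 'ref') {
            this.dispatcher.removeTransform(this.onAnyBound, this.rank, 'any');
            var buffered = this.tokens;
            this.tokens = [];
            return { tokens: buffered };   // hand back the collected body
        }
        this.tokens.push(token);
        return {};   // eat the token
    };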
token stream. This is the first token transformation exercising the
TokenTransformer class as its dispatcher. Template expansions, wiki link
formatting, tag sanitization and extensions should be able to use the same
dispatcher by registering for specific token types.
Parser performance is very slightly improved, as the token stream is now
traversed only once.
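A minimal sketch of type-specific registration with a single stream walk (a
simplification, not the actual TokenTransformer internals):

    function TokenTransformer() {
        this.registry = {};
    }
    TokenTransformer.prototype.register = function (type, transform) {
        (this.registry[type] = this.registry[type] || []).push(transform);
    };
    TokenTransformer.prototype.transform = function (tokens) {
        // One walk over the stream; each token is only handed to the
        // transforms registered for its type.
        var out = [];
        for (var i = 0; i < tokens.length; i++) {
            var token = tokens[i],
                transforms = this.registry[token.type] || [];
            for (var j = 0; j < transforms.length; j++) {
                token = transforms[j](token);
            }
            out.push(token);
        }
        return out;
    };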
Handle arguments and options properly by using the 'optimist' node module.
Please note that word wrapping in the usage output does not seem to work on my
setup :(
Only --help is implemented so far.
Example:
$ node parserTests.js --help
Starting up JS parser tests
Usage: node ./parserTests.js
Options:
--filter, --regex Only run tests whose descriptions match the given regex (option not implemented)
--help, -h Show this help message
--disabled Run disabled tests (default false) (option not implemented) [boolean]
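For reference, the options above can be declared with optimist roughly like
this (the usage/describe/alias/boolean chain is standard optimist usage):

    var optimist = require('optimist');

    var argv = optimist
        .usage('Usage: $0')
        .describe('filter', 'Only run tests whose descriptions match the given regex')
        .alias('filter', 'regex')
        .describe('help', 'Show this help message')
        .alias('help', 'h')
        .describe('disabled', 'Run disabled tests (default false)')
        .boolean('disabled')
        .argv;

    if (argv.help) {
        optimist.showHelp();
        process.exit(0);
    }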
html markup handling.
* Remove global 'use strict' declarations from html5 parser.
* Add trailing whitespace handling in dt
Overall, 55 parser tests are now passing.
HTML is parsed using an HTML parser and re-serialized, and the output compared
to the serialization of the new parser's DOM. Newline normalization is a cheap
hack for now; this needs improvement later.
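The cheap normalization amounts to something like this (a stand-in, as noted):

    // Collapse whitespace around newlines and trim leading/trailing
    // newlines so that insignificant layout differences between the two
    // serializations are ignored.
    function normalizeNewlines(html) {
        return html
            .replace(/[ \t]*\n+[ \t]*/g, '\n')
            .replace(/^\n+|\n+$/g, '');
    }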
Builds a DOM tree (jsdom) from the tokens and then serializes that using
document.innerHTML. This is all very experimental, so don't be surprised by
rough edges.
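The round trip looks roughly like this, assuming the old jsdom.jsdom() entry
point (token-to-node conversion elided):

    var jsdom = require('jsdom');

    // Build an empty document, append nodes created from the token
    // stream, then read the serialized markup back out.
    var document = jsdom.jsdom('<html><body></body></html>');
    var p = document.createElement('p');
    p.appendChild(document.createTextNode('Hello world'));
    document.body.appendChild(p);
    console.log(document.body.innerHTML);   // "<p>Hello world</p>"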
tokens, which for now is still completely built before parsing can proceed.
For each top-level block, the source start/end positions are added as
attributes to the top-most tokens. No tracking of wiki vs. html syntax yet.
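Illustratively, a top-most token might then look like this (the attribute name
and offset format are made up):

    // First token of a top-level block, annotated with the source
    // range it was parsed from.
    var token = {
        type: 'tag',
        name: 'p',
        attribs: { 'data-sourcePos': '120:184' }   // start:end offsets
    };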
tests/parser/parserTests.js.
* Removed var from es in es.js to allow node.js code to access it as a global
  (see the sketch after this list). The only alternative solution appears to
  be a node-specific 'exports' construct:
  http://nodejs.org/docs/v0.3.1/api/modules.html
* Added es.Document.js and es.Document.Serializer.js in es/bases. Not sure if
this is the desired location.
* Changed es.extend to es.extendClass in the serializers
* Modified the first parser test to include the WikiDom modules and call the
new HTML serializer
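A quick sketch of the scoping difference behind the first item: in a node
module, a 'var' declaration stays module-local, while a bare assignment lands
on the shared global object (the extendClass body is just a stand-in):

    // es.js -- with 'var es = ...' this binding would be private to the
    // module; the bare assignment makes it visible as a global to other
    // files loaded in the same process.
    es = {
        extendClass: function (dst, src) {
            for (var k in src) { dst[k] = src[k]; }
            return dst;
        }
    };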