wikimedia/mediawiki-extensions-VisualEditor

mirror of https://gerrit.wikimedia.org/r/mediawiki/extensions/VisualEditor synced 2024-11-29 00:30:44 +00:00

Author	SHA1	Message	Date
Gabriel Wicke	f662690d02	Shorten data-mw-rt to data-mw and clean up whitelist Instead of a proliferation of data-mw-* attributes, it should be easier to stash all private / non-semantic round-trip information in a JSON object stored in data-mw. Change-Id: Id200a6a8789fa152f29ea530e5a24b6ee7b4b285	2012-04-02 18:12:49 +02:00
Gabriel Wicke	5ef3438ee5	Change path to parserTests from phase3 to core after switch to git. Change-Id: Ie13f678eaa81447e98db5c8c394ab103caad8454	2012-04-02 17:10:06 +02:00
Audrey Tang	d3602bb459	* Get parser tests from GitWeb, not Subversion. Change-Id: I39f933b9e0320dc62736da07ce097ec1badec9aa	2012-03-28 23:39:01 +08:00
Antoine Musso	f637756319	node modules required: request & jshashes	2012-03-13 15:14:18 +00:00
Gabriel Wicke	af03eb4f29	Improve generic attribute expansion before external link processing, and make wgUploadPath configurable. Also change the hard-coded fall-back image sizes to sensible defaults. This breaks three parser tests until image size retrieval from the wiki is implemented.	2012-03-06 18:02:35 +00:00
Gabriel Wicke	227103e12c	Accept empty table cell attribute sections, and consider percent-encoded %2525 valid. 270 tests passing.	2012-03-06 14:32:45 +00:00
Gabriel Wicke	2efcd3cd57	Reworked percent encoding handling for URIs to get closer to the 'url construction' part of the HTML5 spec: http://www.whatwg.org/specs/web-apps/current-work/multipage/urls.html#url-manipulation-and-creation Removed a few whitelisted test cases that are now passing directly. The encoding canonicalization could also be moved to the Sanitizer. Doing this early in token stream processing however has the advantage of providing further transformations uniform data to work with. We could even consider moving this further into the tokenizer.	2012-03-06 13:49:37 +00:00
Gabriel Wicke	19fe9726a2	Fix invalid external link representation. 268 tests passing.	2012-03-05 18:06:29 +00:00
Gabriel Wicke	7b0c807710	Change wikilink tokenization strategy to split on pipes. This makes it possible to support template / template argument expansion in image options, and causes little trouble for wikilinks. Non-image wikilinks with multiple text pipes are quite rare in the dumps, and concatenating description tokens with a plain '\|' is quite easy. 261 parser tests passing.	2012-03-05 12:00:38 +00:00
Gabriel Wicke	009d7a4dea	Namespaces to the rescue.	2012-03-02 15:49:05 +00:00
Gabriel Wicke	fe681042c0	Collect some statistics while grepping.	2012-03-01 16:42:28 +00:00
Gabriel Wicke	e0838db315	Capturing the regexp is no longer necessary, and speeds up the grepper. Also tweaked the multi-line ISBN regexp slightly.	2012-02-29 13:02:46 +00:00
Gabriel Wicke	e3deb304db	Add a misc regexp file for dump grepping.	2012-02-29 11:07:17 +00:00
Gabriel Wicke	14f40aa7d5	Support capturing regexps in dumpGrepper.	2012-02-29 10:49:00 +00:00
Gabriel Wicke	ebcfc2c7a1	Improve grepper documentation.	2012-02-28 14:24:37 +00:00
Gabriel Wicke	b767e03449	Tweak martian regexp and grepper output format.	2012-02-28 14:11:44 +00:00
Gabriel Wicke	4806505ce4	Finish color highlighting for dump grepper / fix broken commit r112592.	2012-02-28 13:48:47 +00:00
Gabriel Wicke	7daeb34d4d	Implement onlyinclude transformer. 254 tests passing.	2012-02-28 13:21:01 +00:00
Gabriel Wicke	32012c00cd	Add martian-endtags regexp wrapper around dumpGrepper.	2012-02-27 16:51:20 +00:00
Gabriel Wicke	19c67c28a2	Add a simple dump grepper using DumpReader. Useful to inform parser design decisions, and as a way to exercise the dump reader in preparation for tests over full dumps.	2012-02-27 16:40:01 +00:00
Gabriel Wicke	21855c99cd	Tweak dumpReader to work with current libxmljs and stdin 'data' events.	2012-02-27 15:46:08 +00:00
Gabriel Wicke	2e41b19af8	Green two more parser tests by implementing some parser functions.	2012-02-22 16:39:50 +00:00
Gabriel Wicke	3568dfee14	Add some support for functionhooks in test parser and parserTests.js, and tweak a few parser functions.	2012-02-22 15:59:11 +00:00
au	f1fb937b4a	* Instead of sorting attributes, whitelist the one parserTest where it matters.	2012-02-20 22:26:24 +00:00
au	0ca9b00100	* Convert __patched-html-parser to .coffee. Note that the compiled .js file (generated by "make"/"make test") is still under version control so folks can work on the project even without a running "coffee" command in PATH. Also updated README to mention coffee-script and "make test".	2012-02-18 18:54:12 +00:00
au	4d1c6c7d6e	* Add a "make test" target that auto-fetches parserTests.txt.	2012-02-18 17:28:46 +00:00
au	0360e62da7	* Locally apply the HTML5.Marker.type patch. This is needed until https://github.com/aredridel/html5/issues/44 is merged into the upstream "html5" module.	2012-02-18 17:28:35 +00:00
Gabriel Wicke	025f9cddb3	Prefix all internal data- attributes with data-mw- and adjust the whitelist and test output normalization accordingly. 235 tests passing.	2012-02-13 13:54:07 +00:00
Gabriel Wicke	a122e51eec	Move data-* annotations into separate object on tokens, that is then serialized into a single data-mw-rt attribute if present. Update parserTests to ignore this attribute for comparisons with expected parser output. A few more tweaks and notes are thrown into this commit too. 233 tests are passing now.	2012-02-11 16:43:25 +00:00
Gabriel Wicke	1f6db903e9	Pluck a few low-hanging fruit in external link tokenization, and add a simple localurl parser function implementation. 230 parser tests now passing.	2012-02-07 10:28:23 +00:00
Gabriel Wicke	d321d96bab	Fix parserTests summary with filtering enabled	2012-02-07 09:27:47 +00:00
Gabriel Wicke	a5b7ea7bcd	Add --debug and --trace options to parserTests as well.	2012-02-01 17:02:37 +00:00
Gabriel Wicke	7cd94df47d	A few minor tweaks to reduce memory usage	2012-01-27 13:32:44 +00:00
Gabriel Wicke	348cac6cf0	Fix a bug in TokenCollector, and misc tweaks for template expansions.	2012-01-20 18:47:17 +00:00
Gabriel Wicke	2233d0a488	Eventify parser tests and parse.js commandline wrapper to actually allow async template fetching. Async expansion is not yet fully debugged, but at least the preconditions for that are now there.	2012-01-18 23:46:01 +00:00
Gabriel Wicke	34025251a3	Clean up 'END' token handling a bit.	2012-01-17 20:01:21 +00:00
Gabriel Wicke	f4081bef08	First template expansion tests start working, and a bug fix in DOMPostProcessor paragraph wrapper. 187 parser tests now passing.	2012-01-14 00:58:20 +00:00
Gabriel Wicke	5ec30252f1	More token transform and pipeline setup refactoring to support template expansion better.	2012-01-10 01:09:50 +00:00
Gabriel Wicke	2e35171fd1	Fix quote handling and tweak the whitelist a bit. 'any' token registrations are now merged with specific registrations by rank. Not yet clear if that is a good idea overall, need to check use cases when implementing template expansion and other functionality. 183 parser tests now passing.	2012-01-04 14:09:05 +00:00
Gabriel Wicke	29362cc53c	Rename ParseThingy to ParserPipeline and fix up broken WikiDom generation and commandline runner.	2012-01-04 08:39:45 +00:00
Gabriel Wicke	bd98eb4c5a	Land big TokenTransformDispatcher and eventization refactoring. The TokenTransformDispatcher now actually implements an asynchronous, phased token transformation framework as described in https://www.mediawiki.org/wiki/Future/Parser_development/Token_stream_transformations. Additionally, the parser pipeline is now mostly held together using events. The tokenizer still emits a lame single event with all tokens, as block-level emission failed with scoping issues specific to the PEGJS parser generator. All stages clean up when receiving the end tokens, so that the full pipeline can be used for repeated parsing. The QuoteTransformer is not yet 100% fixed to work with the new interface, and the Cite extension is disabled for now pending adaptation. Bold-italic related tests are failing currently.	2012-01-03 18:44:31 +00:00
Gabriel Wicke	8e00a72d0a	Improvements to link trail handling, and two tweaks to the whitelist. 182 tests now passing. Link trails depend on language-dependent positive character classes in the PHP parser. These classes all seem to disallow punctuation implicitly and list differing plain text characters instead, so it might be possible to get away with identifying a common class of non-trail punctuation instead. This would help to keep the tokenizer independent of configurations, which is very desirable for caching and simplified external parsing.	2011-12-30 12:47:06 +00:00
Gabriel Wicke	b3a0270d69	Remove env and load grammar in tokenizer constructor. Re-add property hack to keep parserTests running for now. Really need a different pipeline for html serialization or a reference to the HTML DOM.	2011-12-28 17:04:16 +00:00
Neil Kandalgaonkar	8fbf36e63e	put add terminal token inside tokenize method (will pull it out again for streaming interface)	2011-12-28 01:37:15 +00:00
Neil Kandalgaonkar	6103646ec8	remove need to add newline at end of input	2011-12-28 01:37:11 +00:00
Neil Kandalgaonkar	4158f82d7e	refactor parser to ParseThingy in different module, can be invoked with command line utility parse.js	2011-12-28 01:37:06 +00:00
Neil Kandalgaonkar	aedc6751ae	made parseThingy, temp class for refactoring all thingies related to parsing	2011-12-28 01:36:58 +00:00
Neil Kandalgaonkar	5ff2b4d475	make peg src path outside of peg tokenizer	2011-12-28 01:36:50 +00:00
Neil Kandalgaonkar	962d1262fc	create tokenizer without need to modify namespace with PEG source	2011-12-28 01:36:36 +00:00
Gabriel Wicke	1c7fe0eb34	Refactor table productions to support table fragments in templates (table start / row / table end). The old productions are not deleted yet to make it easy to compare the output on more complex articles. 181 tests passing after adding two table tests with whitespace-only differences to the whitelist.	2011-12-22 11:43:55 +00:00

114 commits