wikimedia/mediawiki-extensions-VisualEditor

mirror of https://gerrit.wikimedia.org/r/mediawiki/extensions/VisualEditor synced 2024-11-15 18:39:52 +00:00

Author	SHA1	Message	Date
Gabriel Wicke	9945175416	Reformat Date.replaceChars	2012-02-13 14:23:48 +00:00
Gabriel Wicke	0b40741e1c	Strip trailing newlines from included templates	2012-02-13 14:17:03 +00:00
Gabriel Wicke	025f9cddb3	Prefix all internal data- attributes with data-mw- and adjust the whitelist and test output normalization accordingly. 235 tests passing.	2012-02-13 13:54:07 +00:00
Gabriel Wicke	b1617b1d71	Add some support for ideographic spaces in external links, support the int: namespace alias and perform some normalization on the MediaWiki namespace prefix.	2012-02-13 13:35:46 +00:00
Gabriel Wicke	55ddb4fd66	Remove WikiDom default serialization and --html argument from parse.js wrapper. HTML is now the only supported format. The DOMConverter is now no longer used. Roan, feel free to remove / butcher it for direct HTML to linear model conversion.	2012-02-11 17:59:17 +00:00
Gabriel Wicke	a122e51eec	Move data-* annotations into separate object on tokens, that is then serialized into a single data-mw-rt attribute if present. Update parserTests to ignore this attribute for comparisons with expected parser output. A few more tweaks and notes are thrown into this commit too. 233 tests are passing now.	2012-02-11 16:43:25 +00:00
Gabriel Wicke	aff30be131	Some comments and reshuffling in the grammar, and a typo in the AttributeExpander.	2012-02-09 22:27:45 +00:00
Gabriel Wicke	6e33255503	Improve support for preprocessor functionality in attributes; Support multi-line xmlish tags with preprocessor stuff in attributes.	2012-02-09 16:36:29 +00:00
Gabriel Wicke	16ded7d955	Fix a bug in wikilink with trail tokenization.	2012-02-09 14:06:35 +00:00
Gabriel Wicke	6983481561	Move attribute expansion back to separate handler, as this makes it easier to only expand used branches selected by parser functions. Template (and -argument) expansion is simply registered before general expansion. Additionally, a few more simple time-based magic words are added in ParserFunctions.	2012-02-09 13:44:20 +00:00
Gabriel Wicke	3f7c1499cd	Enable support for general preprocessor functionality in attribute keys and values. This includes comments, templates and template arguments. This also replaces the specialized expansion logic in the TemplateHandler. The removal of link validation lets one more parser test fail for now. External link target validation will need to be implemented in the token stream handler for links. This is noted as TODO in https://www.mediawiki.org/wiki/Future/Parser_development#Token_stream_transforms.	2012-02-08 15:10:30 +00:00
Gabriel Wicke	157c495a9e	Normalize the title in localurl. 232 tests passing.	2012-02-07 12:26:00 +00:00
Gabriel Wicke	b4892102a4	Clean up transform callback interface	2012-02-07 11:53:29 +00:00
Gabriel Wicke	1f6db903e9	Pluck a few low-hanging fruit in external link tokenization, and add a simple localurl parser function implementation. 230 parser tests now passing.	2012-02-07 10:28:23 +00:00
Gabriel Wicke	cf8b7bf45d	External links don't nest.	2012-02-07 09:38:28 +00:00
Gabriel Wicke	53bf4f2bd0	Temporarily disable the sanitizer and start to support preprocessor functionality (comments, templates, template arguments) in arbitrary attributes. The grammar for this is still quite rough, will need to consolidate that area.	2012-02-06 19:15:44 +00:00
Gabriel Wicke	c26243989e	Improve toJSON handlers to include all properties	2012-02-06 19:12:29 +00:00
Gabriel Wicke	0bea9fdfbb	Fix nowiki tokenization regression introduced r110495	2012-02-03 13:10:04 +00:00
Gabriel Wicke	26f2026cff	Add custom JSON serializers for tokens that include a type attribute	2012-02-03 13:09:01 +00:00
Gabriel Wicke	8c75aa1a7a	Remove type attribute for tag tokens.	2012-02-01 18:37:48 +00:00
Gabriel Wicke	689f697a93	Push token format conversion a bit further along, and add defines that were missing in last commit.	2012-02-01 17:03:08 +00:00
Gabriel Wicke	a5cc10a06b	Change token format to plain strings for text tokens, and specific objects for other tokens. This is only the first half of the conversion. The next step is to drop the type attribute on most tokens and match on the constructor in the token transform machinery.	2012-02-01 16:30:43 +00:00
Gabriel Wicke	dd3707ded5	Remove some modules normally bundled with node.js from dependencies, and remove some older ones that are only used in currently-dead code.	2012-02-01 10:32:33 +00:00
Gabriel Wicke	e65c6502c0	Add source for #time implementation in comment	2012-02-01 10:14:01 +00:00
Gabriel Wicke	14a8a13678	A few more debug helpers including a --trace mode for light debugging. Some improvements to parser functions on the way to support the cite extensions. Preparation for generic template and template arg in attribute support. 222 parser tests now passing.	2012-01-31 16:50:16 +00:00
Neil Kandalgaonkar	2688f823ef	added dependencies to README	2012-01-31 00:56:07 +00:00
Neil Kandalgaonkar	f0b934ef2e	first pass at an API method that returns wikidom. Shells out to node. Some issues with XML API result formatting but works fine in JSON	2012-01-31 00:02:48 +00:00
Gabriel Wicke	7cd94df47d	A few minor tweaks to reduce memory usage	2012-01-27 13:32:44 +00:00
Gabriel Wicke	4e6a54560a	* Emit token chunks for top-level block elements by patching the source of the tokenizer * Fix a bug uncovered by this * Increase the number of outstanding listeners on a single download to 10000	2012-01-22 23:21:53 +00:00
Gabriel Wicke	7ea4d7d3db	A few parser function fixes and maximum template expansion in environment config.	2012-01-22 19:32:28 +00:00
Gabriel Wicke	561cf3c237	Bug fixes and a first stab at a #time parser function. You can expand the main page like this: cd extensions/VisualEditor/modules/parser echo '{{:Main Page}}' | node parse.js echo '{{:Main Page}}' | node parse.js --html echo '{{:Main Page}}' | node parse.js --debug Even the date-based includes work somewhat, although they don't yet accept passed-in dates.	2012-01-22 07:07:16 +00:00
Gabriel Wicke	60e45bb739	A bit of template expansion bug fixing and parser function documentation	2012-01-22 01:27:22 +00:00
Gabriel Wicke	e8a7034acf	Add some commandline switches to parse.js. Supports switching on/off debug mode and a selection of html/WikiDom serialization.	2012-01-21 22:42:54 +00:00
Gabriel Wicke	785a4af76f	Implement a few parser functions. 220 parser tests now passing.	2012-01-21 20:38:13 +00:00
Gabriel Wicke	1a6546fbca	Support empty template arguments and default values in arg expansion	2012-01-21 03:03:33 +00:00
Gabriel Wicke	fdd048b3b2	Remove a few stray debug prints and disable debugging in parse.js	2012-01-20 22:21:33 +00:00
Gabriel Wicke	145df2655c	* NoInclude and IncludeOnly improvements * Tokenizer support for templates and template args in template arguments and titles * Async attribute expansion fixes	2012-01-20 22:02:23 +00:00
Gabriel Wicke	348cac6cf0	Fix a bug in TokenCollector, and misc tweaks for template expansions.	2012-01-20 18:47:17 +00:00
Gabriel Wicke	7cc8e69147	Collapse all requests per template into a single outstanding request using an event-emitting TemplateRequest object and a request queue.	2012-01-20 02:36:18 +00:00
Gabriel Wicke	fc2088bb21	Add some rudimentary noinclude / includeonly support and fix up TokenCollector.	2012-01-20 01:46:16 +00:00
Gabriel Wicke	c15e0d4167	Minor cleanup in TemplateHandler	2012-01-20 00:49:27 +00:00
Gabriel Wicke	d0ece16c86	Fix async template expansion, so we can now render simple pages with templates directly to WikiDom from enwiki using a commandline like this: echo '{{User:GWicke/Test}}' | node parse.js Wohoo! Complex pages with templates won't render properly yet, as noinclude / includeonly and parser functions are not yet implemented. As a result, the parser will run out of memory or hit the currently low expansion depth limit as it tries to expand documentation for all templates.	2012-01-19 23:43:39 +00:00
Gabriel Wicke	2233d0a488	Eventify parser tests and parse.js commandline wrapper to actually allow async template fetching. Async expansion is not yet fully debugged, but at least the preconditions for that are now there.	2012-01-18 23:46:01 +00:00
Gabriel Wicke	5b8054636e	Make template fetching somewhat functional on node with Inez' help, but disable it by default in parserTests as it tries to fetch all sorts of parser functions and is not yet fully supported in parserTests. The next step will be to build a list of parser functions (to avoid fetching them as templates) and pushing the event interface into parserTests.	2012-01-18 19:38:32 +00:00
Gabriel Wicke	4bd4307924	Fix comment to reflect the actual regexp/spec in the JS version as well.	2012-01-18 19:35:13 +00:00
Gabriel Wicke	14e6728cc4	Add the start of a minimal sanitizer stage, that only strips IDN ignored characters from host portions of links hrefs for now. This module needs to be filled up with pretty much everything Sanitizer.php does, including tag and attribute whitelists and attribute value sanitation (especially for style attributes). We'll also need to think about round-tripping of sanitized tokens.	2012-01-18 01:42:56 +00:00
Gabriel Wicke	336be4f617	Eat '[[[' as plain text token, makes it 212 passing.	2012-01-18 00:23:17 +00:00
Gabriel Wicke	178adbc342	Accept IPv6 (and IPv4) addresses in the tokenizer, so another test passes.	2012-01-18 00:00:47 +00:00
Gabriel Wicke	e7381da5b8	Trim whitespace off template titles and argument names. 209 parser tests now passing.	2012-01-17 23:18:33 +00:00
Gabriel Wicke	f50fecf1e3	Fix template argument expansion. 200 parser tests now passing.	2012-01-17 22:29:26 +00:00
Gabriel Wicke	34025251a3	Clean up 'END' token handling a bit.	2012-01-17 20:01:21 +00:00
Gabriel Wicke	7f579398c7	Use isBlockTag in DOMPostProcessor	2012-01-17 18:30:22 +00:00
Gabriel Wicke	6bd7ca1e75	Misc improvements, now 196 parser tests passing. * Add handler for post-expand paragraph wrapping on token stream, to handle things like comments on its own line post-expand * Add general Util module * Fix self-closing tag handling in HTML5 tree builder	2012-01-17 18:22:10 +00:00
Gabriel Wicke	f4081bef08	First template expansion tests start working, and a bug fix in DOMPostProcessor paragraph wrapper. 187 parser tests now passing.	2012-01-14 00:58:20 +00:00
Gabriel Wicke	196d704e8e	Template expansion now enabled and somewhat working, but template fetching still fails all the time.	2012-01-13 18:48:25 +00:00
Gabriel Wicke	32c9bccd7c	Results of early template expansion debugging. Still disabled by default, but getting closer.	2012-01-11 19:48:49 +00:00
Gabriel Wicke	6b6ec2933d	More work towards template expansion. * Created AttributeTokenTransformManager for generic attribute conversion, and removed { title, template argument {key, value} } expansion from TemplateHandler. * Added caching for attribute and input sub-pipelines. Especially attribute pipelines would otherwise be recreated for each attribute value and key.	2012-01-11 00:05:51 +00:00
Gabriel Wicke	5ec30252f1	More token transform and pipeline setup refactoring to support template expansion better.	2012-01-10 01:09:50 +00:00
Gabriel Wicke	287604c422	A bit of cleanup in ParserPipeline, with better and more consistent support for multiple input types.	2012-01-09 19:33:49 +00:00
Gabriel Wicke	becf3cb7ea	Add generic 'collect all tokens between delimiter tokens and call a transform function on it' util for synchronous transformation phases. This can be used to implement parser hooks (aka extension tags) besides other things.	2012-01-09 18:13:45 +00:00
Gabriel Wicke	e99d7a2a55	Two batteries worth of token transform manager refactoring. * TokenTransformDispatcher is now renamed to TokenTransformManager, and is also turned into a base class * SyncTokenTransformManager and AsyncTokenTransformManager subclass TokenTransformManager and implement synchronous (phase 1,3) and asynchronous (phase 2) transformation stages. * Communication between stages uses the same chunk / end events as all the other token stages. * The AsyncTokenTransformManager now supports the creation of nested AsyncTokenTransformManagers for template expansion. The AsyncTokenTransformManager object takes on the responsibilities of a preprocessor frame. Transforms are newly created (or potentially resurrected from a cache), so that transforms do not have to worry about concurrency. * The environment is pushed through to all transform managers and the individual transforms.	2012-01-09 17:49:16 +00:00
Gabriel Wicke	6601c544e6	Handle default for template arg expansion, add template fetch functionality and tweak a few minor things in the grammar and QuoteTransformer.	2012-01-06 17:19:14 +00:00
Gabriel Wicke	f0c844f28f	Add template expansion handler skeleton, not yet functional. Also note improvements needed in the tokenizer template handling.	2012-01-06 14:30:55 +00:00
Gabriel Wicke	2e35171fd1	Fix quote handling and tweak the whitelist a bit. 'any' token registrations are now merged with specific registrations by rank. Not yet clear if that is a good idea overall, need to check use cases when implementing template expansion and other functionality. 183 parser test now passing.	2012-01-04 14:09:05 +00:00
Gabriel Wicke	6cd95fea37	Fix up constructors in EventEmitter inheritance and tweak a few more comments.	2012-01-04 12:28:41 +00:00
Gabriel Wicke	e3ae9a702b	Fix JSHint warnings (mostly about comment indentation) from r108012.	2012-01-04 11:06:24 +00:00
Gabriel Wicke	4c4a24f0a0	Hook up the DOMPostProcessor using events as well, and rename the subscription methods to tell a story. Also document idea on how to dynamically configure the pipeline depending on event registrations in comment.	2012-01-04 11:00:54 +00:00
Gabriel Wicke	f0399d2ec5	Clean up comments in TokenTransformDispatcher and mark private methods with underscore.	2012-01-04 09:48:24 +00:00
Gabriel Wicke	ee79158e53	Add trailing newline in commandline parser wrapper	2012-01-04 08:42:53 +00:00
Gabriel Wicke	29362cc53c	Rename ParseThingy to ParserPipeline and fix up broken WikiDom generation and commandline runner.	2012-01-04 08:39:45 +00:00
Gabriel Wicke	bd98eb4c5a	Land big TokenTransformDispatcher and eventization refactoring. The TokenTransformDispatcher now actually implements an asynchronous, phased token transformation framework as described in https://www.mediawiki.org/wiki/Future/Parser_development/Token_stream_transformations. Additionally, the parser pipeline is now mostly held together using events. The tokenizer still emits a lame single events with all tokens, as block-level emission failed with scoping issues specific to the PEGJS parser generator. All stages clean up when receiving the end tokens, so that the full pipeline can be used for repeated parsing. The QuoteTransformer is not yet 100% fixed to work with the new interface, and the Cite extension is disabled for now pending adaptation. Bold-italic related tests are failing currently.	2012-01-03 18:44:31 +00:00
Neil Kandalgaonkar	20374b5911	fix substr for IE, followup r107464	2011-12-30 21:51:03 +00:00
Gabriel Wicke	8e00a72d0a	Improvements to link trail handling, and two tweaks to the whitelist. 182 tests now passing. Link trails depend on language-dependent positive character classes in the PHP parser. These classes all seem to disallow punctuation implicitly and list differing plain text characters instead, so it might be possible to get away with identifying a common class of non-trail punctuation instead. This would help to keep the tokenizer independent of configurations, which is very desirable for caching and simplified external parsing.	2011-12-30 12:47:06 +00:00
Gabriel Wicke	11ece76b7b	Fix suffix handling for wiki links.	2011-12-30 09:35:57 +00:00
Gabriel Wicke	b3a0270d69	Remove env and load grammar in tokenizer constructor. Re-add property hack to keep parserTests running for now. Really need a different pipeline for html serialization or a reference to the HTML DOM.	2011-12-28 17:04:16 +00:00
Gabriel Wicke	3a63fb118e	Add a few comments inline, and remove unneeded html serialization as we are only interested in WikiDom output in this parser wrapper.	2011-12-28 13:46:52 +00:00
Neil Kandalgaonkar	8fbf36e63e	put add terminal token inside tokenize method (will pull it out again for streaming interface)	2011-12-28 01:37:15 +00:00
Neil Kandalgaonkar	6103646ec8	remove need to add newline at end of input	2011-12-28 01:37:11 +00:00
Neil Kandalgaonkar	4158f82d7e	refactor parser to ParseThingy in different module, can be invoked with command line utility parse.js	2011-12-28 01:37:06 +00:00
Neil Kandalgaonkar	d91a67ba99	nodeName not defined	2011-12-28 01:36:54 +00:00
Neil Kandalgaonkar	962d1262fc	create tokenizer without need to modify namespace with PEG source	2011-12-28 01:36:36 +00:00
Gabriel Wicke	33e60dd4d9	Update comments a bit.	2011-12-22 12:37:24 +00:00
Gabriel Wicke	9ee0e660ec	Fix regression introduced by r107060 for regular table cells. Good to have a test suite ;)	2011-12-22 12:09:25 +00:00
Gabriel Wicke	a94d0ec10c	Re-add support for row-only tables.	2011-12-22 11:58:32 +00:00
Gabriel Wicke	1c7fe0eb34	Refactor table productions to support table fragments in templates (table start / row / table end). The old productions are not deleted yet to make it easy to compare the output on more complex articles. 181 tests passing after adding two table tests with whitespace-only differences to the whitelist.	2011-12-22 11:43:55 +00:00
Gabriel Wicke	2845ba9552	Handle noinclude and includeonly at start of line, so that syntax after it still matches as if it actually was preceded by a newline.	2011-12-21 11:38:50 +00:00
Gabriel Wicke	3a631db6d9	Fix ranges for annotations in implicit paragraphs within branch nodes.	2011-12-16 19:36:04 +00:00
Gabriel Wicke	cc06551f2e	Rename table_header production to table_heading. Those non-natives strike again.	2011-12-16 19:24:59 +00:00
Gabriel Wicke	605ed23fd2	Fix attributes in table headings.	2011-12-16 19:22:13 +00:00
Gabriel Wicke	08255ff3e6	Small bug fix to heading level, spotted by Mike from localwiki- thanks!	2011-12-15 23:59:35 +00:00
Gabriel Wicke	a04744b2ec	Add some more attribute remapping capabilities to the DOMConverter, and clean up some grammar formatting.	2011-12-15 17:33:07 +00:00
Gabriel Wicke	e98dd9e722	Implement 1-char-minimum width for annotations, and some additional minor cleanup.	2011-12-15 11:05:52 +00:00
Gabriel Wicke	22ba27295b	Clean up the DOMConverter a bit.	2011-12-15 10:55:30 +00:00
Gabriel Wicke	e72dee76e4	Follow-up to r106208 and r106207. Both good catches, thanks Yair! As this code is in its early stages and nowhere near deployment, please Be Bold and just commit things like this directly! IMHO it makes more sense to fully review this once it settles down a bit.	2011-12-15 10:13:50 +00:00
Gabriel Wicke	3585bd9c8e	Accept row-only tables. The parser now eats [[en:Barack Obama]] as-is. Hooray!	2011-12-15 00:39:28 +00:00
Gabriel Wicke	6df94a34a1	Less lust for urls	2011-12-15 00:26:22 +00:00
Gabriel Wicke	ce2ee067f7	Minor tweak to wiki link production	2011-12-15 00:12:58 +00:00
Gabriel Wicke	377226a120	Comment out a stray console.log	2011-12-14 23:44:58 +00:00
Gabriel Wicke	574abd9774	A collection of small bug fixes to the grammar, Cite, the Token format converter and the HTML DOM -> WikiDom converter. The tokenizer now digests all parserTests.	2011-12-14 23:38:46 +00:00
Gabriel Wicke	dc77d73ad5	Add ability to pass through JSON data to WikiDom in data-json-* attributes, and fix parser to actually parse the Barack Obama article except for one table with nested templates at the start-of-line.	2011-12-14 17:25:09 +00:00
Gabriel Wicke	f6e4267fca	Handle a few more element types, and reset offset for each leaf node. Not sure if the latter is correct, as the documentation at https://www.mediawiki.org/wiki/Visual_editor/Software_design#Data_Structures and the actual sample WikiDom in the editor sandbox seem to disagree on this point.	2011-12-14 16:22:27 +00:00
Gabriel Wicke	6676a47008	Add implicit level attribute to WikiDom headings.	2011-12-14 15:55:58 +00:00
Gabriel Wicke	3018ca690b	Improve WikiDom conversion: Handle text and annotations in branch nodes as paragraphs and treat list items as branches.	2011-12-14 15:40:40 +00:00
Gabriel Wicke	a09aa4d599	Add rough HTML DOM to WikiDom conversion. You can see serialized WikiDom of parser tests using 'node parserTests.js --wikidom'.	2011-12-14 15:15:41 +00:00
Gabriel Wicke	5f80d30428	Clean up access to document and body after building the tree.	2011-12-14 09:40:49 +00:00
Gabriel Wicke	30749b8d8d	Update comments a bit and add a note on things to improve in API.	2011-12-14 09:33:25 +00:00
Gabriel Wicke	55ff272847	Comment TokenTransformDispatcher.	2011-12-13 20:13:09 +00:00
Gabriel Wicke	44deefe303	Minor tweak to comment.	2011-12-13 18:55:44 +00:00
Gabriel Wicke	c61b32eaa7	Clean up and comment the Cite extension a bit.	2011-12-13 18:45:09 +00:00
Gabriel Wicke	feee9ded9f	Convert the Cite extension to a token stream transformer. This required a few further additions to the TokenTransformDispatcher. In particular, there is now an 'any' token match whose callbacks are executed before more specific callbacks. This is used by the Cite extension to eat all tokens between ref and /ref tags. This need is very common, so should be broken out to an intermediate layer in the future. In general, the requirements for the TokenTransformDispatcher API are now clearer, and the API should likely be cleaned up / simplified.	2011-12-13 14:48:47 +00:00
Gabriel Wicke	8e55e79b67	Rename TokenTransformer to TokenTransformDispatcher.	2011-12-13 11:45:12 +00:00
Gabriel Wicke	8231511217	Replace custom object copy with $.extend.	2011-12-13 11:18:15 +00:00
Gabriel Wicke	39aedd4378	Improve comments in QuoteTransformer.	2011-12-13 10:25:18 +00:00
Gabriel Wicke	0ad08b9ae3	Add a README file pointing to the wiki documentation.	2011-12-12 22:30:11 +00:00
Gabriel Wicke	a8fa9433c4	Convert quote handling (italic/bold) to a core extension operating on the token stream. This is the first token transformation exercising the TokenTransformer class as its dispatcher. Template expansions, wiki link formatting, tag sanitation and extensions should be able to use the same dispatcher by registering for specific token types. The parser performance is very slightly improved as the token stream is only traversed once.	2011-12-12 20:53:14 +00:00
Gabriel Wicke	752b0990b2	Refactor parserTests somewhat into a class-like structure, and wire up the TokenTransformer.	2011-12-12 14:03:54 +00:00
Gabriel Wicke	d616f07a79	Don't re-build the wiki tokenizer for each test. This speeds up the full parserTests.js run slightly from 7-8 minutes to about 14 seconds ;) A few very minor tweaks to the grammar are also thrown into this commit.	2011-12-12 10:47:42 +00:00
Gabriel Wicke	89c5e0cafb	Follow-up to r105859: Add missing new.	2011-12-12 10:09:13 +00:00
Gabriel Wicke	9ebce5839a	Further development of the TokenTransformer framework.	2011-12-12 10:01:47 +00:00
Gabriel Wicke	80d5067813	Add a TokenTransformer dispatcher class. This class provides subscriptions by token type, and supports asynchronous token expansion (for example for async template expansion). This code is not yet tested or used. The interface for token insertion from transformation functions will be expanded as needed.	2011-12-08 14:37:31 +00:00
Gabriel Wicke	c2b69e2486	Clean up newline handling. Emit a NEWLINE token for each non-{comment,pre,nowiki} newline.	2011-12-08 14:34:18 +00:00
Gabriel Wicke	abc2254110	A bit of comment clean-up and wrapping of tree building into try/catch block to actually count failures.	2011-12-08 11:40:59 +00:00
Gabriel Wicke	92fdf99384	Further renaming, this time from pegParser to pegTokenizer.	2011-12-08 10:59:44 +00:00
Gabriel Wicke	76bc477038	Rename html5TokenEmitter to HTML5TreeBuilder, and the contained Tokenizer to TreeBuilder.	2011-12-08 10:37:18 +00:00
Gabriel Wicke	19a1f0850f	Tidy up the grammar a bit.	2011-12-08 10:33:23 +00:00
Gabriel Wicke	3742d70abd	Add some documentation to syntax flags	2011-12-07 15:54:55 +00:00
Gabriel Wicke	545ca1809f	Convert template argument production to generic inline with syntactic stop. Fix a bug in generic inline production. Nested multi-line templates are now parsed okayish.	2011-12-07 15:39:39 +00:00
Gabriel Wicke	902db40a1f	Process template arguments into an object.	2011-12-07 14:46:07 +00:00
Gabriel Wicke	51a40e4dbc	Follow-up to r105423: Fix off-by-one bug.	2011-12-07 11:56:12 +00:00
Gabriel Wicke	49c286a67b	Fix a bug in doQuotes (bitten by surprising JS sort() behavior), and improve tag-only-line handling. 180 parser tests now passing.	2011-12-07 11:51:24 +00:00
Gabriel Wicke	418a5067c6	Parse attributes in tables using generic attribute production. Some table tests still do not pass as the MW table output reorders attributes ;)	2011-12-06 22:03:21 +00:00
Gabriel Wicke	3d06707152	Slightly speed up inline tag productions using guards and grouping; Fix list processing function.	2011-12-06 18:35:05 +00:00
Gabriel Wicke	ea8f226fd5	Remove ext and references special cases, now subsumed by generic XML tag productions. Document issue around special tokenizer mode for other extension tags.	2011-12-06 16:44:27 +00:00
Gabriel Wicke	e7de089d5b	Decode urls and html entities, 163 tests now passing.	2011-12-06 13:17:14 +00:00
Gabriel Wicke	a72a9e55a3	Don't match internal links with url as target. 161 passing.	2011-12-06 12:26:57 +00:00
Gabriel Wicke	2b5cc67bf5	Further tweaks to headings. 157 tests now passing.	2011-12-06 11:59:41 +00:00
Gabriel Wicke	f4d123886e	Convert heading rules to single rule that figures out the level. This saves a lot of backtracking and inline break complexity.	2011-12-06 11:06:05 +00:00
Gabriel Wicke	33e19f7275	Recognize block-level elements independent of case; Ignore toc and section edit links in tests. 148 parser tests passing.	2011-12-05 20:03:24 +00:00
Gabriel Wicke	9ed9cb31bd	Fix template argument handling somewhat.	2011-12-05 17:58:11 +00:00
Gabriel Wicke	1760210d13	Fixes to tables, headings and misc smaller stuff. Tracked down an issue caused by improperly caching of production results, which interfered with the flag-dependent inline_break production.	2011-12-04 19:23:24 +00:00
Gabriel Wicke	63c728924b	Use pegjs from npm	2011-12-01 15:23:23 +00:00
Antoine Musso	5ab379f479	fix vim modeline	2011-12-01 15:19:37 +00:00
Gabriel Wicke	0ce1e9fcf3	Add a quick html entity decoding hack, and document need for general decoder.	2011-12-01 14:39:55 +00:00
Gabriel Wicke	d00743ad79	Improve external links and definition lists, now 133 tests passing ;) Also add printwhitelist option to test runner, provides js code copy/pastable to whitelist.	2011-12-01 14:25:59 +00:00
Gabriel Wicke	82e31ffd42	Do not allow newlines in various attributes	2011-11-30 15:12:53 +00:00
Gabriel Wicke	821162484e	Allow inlines in the term part of ; term : definition	2011-11-30 14:53:28 +00:00
Gabriel Wicke	f758894de7	Let another test pass by swapping the default order of italic/bold for '''''. Minor test output cosmetics.	2011-11-30 13:54:57 +00:00
Gabriel Wicke	e0fca805a6	Expand tabs in grammar.	2011-11-30 13:42:26 +00:00
Gabriel Wicke	2bb512a4de	A bit of tokenizer grammar clean-up and additional expected-html normalization. 99 parser tests now passing.	2011-11-30 13:40:17 +00:00
Gabriel Wicke	127d8c8621	Simplify DOM paragraph wrapping postprocessor	2011-11-30 12:28:45 +00:00
Gabriel Wicke	f0edc5cb9a	Fix a few more tests by allowing inline content inside links. 76 now passing.	2011-11-29 18:43:27 +00:00
Gabriel Wicke	ae0b5f9af4	* Split paragraph handling between tokenizer and DOM postprocessor for better html markup handling. * Remove global 'use strict' declarations from html5 parser. * Add trailing whitespace handling in dt Overall, 55 parser tests are now passing.	2011-11-29 15:11:51 +00:00
Gabriel Wicke	b16c295b98	Consider dl as a block-level element.	2011-11-28 16:54:58 +00:00
Gabriel Wicke	d3f0196df7	Add primitive HTML comparison to detect passing parser tests. The expected HTML is parsed using a HTML parser and re-serialized, and the output compared to the serialization of the new parser's dom. Newline normalization is a cheap hack for now, need to improve that later.	2011-11-28 11:10:39 +00:00
Gabriel Wicke	6b8c109cf0	Separate block-level tags in tokenizer to delimit inlines and avoid wrapping block-level in paragraphs.	2011-11-25 17:41:26 +00:00
Gabriel Wicke	859379a635	Improvements to nowiki/pre interaction. Will need to distinguish block-level tags from inline HTML tags next.	2011-11-25 15:02:44 +00:00
Gabriel Wicke	dd5cd59ac6	Better HTML, pre and blocklevel handling. Hackish source formatting for easier comparison with parserTest results.	2011-11-25 12:47:03 +00:00
Gabriel Wicke	5b3a4497aa	Add generic HTML tokenization and nowiki handling.	2011-11-25 10:59:43 +00:00
Gabriel Wicke	6c36ddcbce	Follow-up to r104164: Clean-up comments, remove old italic/bold productions.	2011-11-24 14:20:56 +00:00
Gabriel Wicke	dee262658f	Add MediaWiki-compatible quote handling including quirks and overlapped structures like ''[[Link|Link text'']]. This is another transform on the token stream.	2011-11-24 13:56:30 +00:00
Gabriel Wicke	baf55875b9	Re-add modified wiki list handling to tokenizer.	2011-11-23 14:27:51 +00:00
Gabriel Wicke	694b998f24	Minor improvement to italic/bold, documentation on failed modularization of static parser functions.	2011-11-22 16:51:05 +00:00
Gabriel Wicke	d1b0293569	Fix comment token conversion and serialization	2011-11-21 09:22:30 +00:00
Gabriel Wicke	65afd9b610	Improve internal link handling	2011-11-18 14:48:32 +00:00
Gabriel Wicke	d744e65c48	Add missing token adapter.	2011-11-18 14:00:14 +00:00
Gabriel Wicke	b750ce38b8	Add node.js-compatible HTML5 parser and hook it up to the PEG tokenizer. Builds a DOM tree (jsdom) from the tokens and then serializes that using document.innerHTML. This is all very experimental, so don't be surprised by rough edges.	2011-11-18 13:57:07 +00:00
Gabriel Wicke	11e487d8c0	Flatten inline token lists before merging text into text tokens.	2011-11-17 15:43:31 +00:00
Gabriel Wicke	ea87e7aaee	Convert PEG parser to tokenizer for back-end HTML parser. Now emits a list of tokens, which for now is still completely built before parsing can proceed. For each top-level block, the source start/end positions are added as attributes to the top-most tokens. No tracking of wiki vs. html syntax yet.	2011-11-17 15:26:02 +00:00
Gabriel Wicke	ef3c84bd2e	Extract text from inline elements for better testing. Slightly improved handling of comment-only lines. Change pre to leaf content model.	2011-11-08 16:08:05 +00:00
Gabriel Wicke	18ead89b37	Improved paragraph, br, comment parsing and switched headings to generic inlineline with syntactic flags.	2011-11-07 23:09:30 +00:00
Gabriel Wicke	944d010eb2	Indentation cleanup in PEG parser and Html serializer	2011-11-07 21:05:37 +00:00
Gabriel Wicke	c3a0c56e56	rename definition{term,description} to just {term,description}	2011-11-07 20:36:34 +00:00
Gabriel Wicke	71891131c3	Grammar improvements * replaced regexp stack with a set of break rules for inline content within specialized parse contexts, switched more rules to generic inlineline/inline/block rules. * don't consume end-of-line for proper start-of-line matching * added some pre support * still no conversion of inline elements to annotations	2011-11-07 14:39:12 +00:00
Gabriel Wicke	06ca9f12fe	Rename definitiondata to definitiondescription, minor fixes	2011-11-04 12:25:01 +00:00
Gabriel Wicke	7e5c196732	Some more progress for tables and definition lists	2011-11-04 12:06:49 +00:00
Gabriel Wicke	83a80bad49	Fixes for definition lists	2011-11-04 11:08:11 +00:00
Gabriel Wicke	85def70a8a	Add basic list serialization to HtmlSerializer * Added 'definitionterm' and 'definitiondata' styles to support definition lists, and special-case handling in the serializer to wrap both in dls.	2011-11-04 10:02:59 +00:00
Gabriel Wicke	63398b5749	Update parserTests to latest serializers	2011-11-04 07:45:05 +00:00
Gabriel Wicke	a8838dab18	Start by handling paragraphs, at least a bit.	2011-11-03 15:16:05 +00:00
Gabriel Wicke	0d30a5528e	First combination of WikiDom serializers with existing parser in tests/parser/parserTests.js. * Removed var from es in es.js to allow node.js to access it as global. Only alternative solution appears to be a node-specific 'exports' construct: http://nodejs.org/docs/v0.3.1/api/modules.html * Added es.Document.js and es.Document.Serializer.js in es/bases. Not sure if this is the desired location. * Changed es.extend to es.extendClass in the serializers * Modified the first parser test to include the WikiDom modules and call the new HTML serializer	2011-11-03 13:55:48 +00:00
Trevor Parscal	5bae153214	Moving parser stuff back into the modules folder (oops)	2011-11-02 21:45:57 +00:00
Trevor Parscal	2b499d5990	Reorganized modules by javascript namespace	2011-11-02 21:31:45 +00:00
Brion Vibber	213ee7d4a8	followup r101685: the peg definition	2011-11-02 21:09:19 +00:00
Brion Vibber	56a75ccca7	Copy several of the experimental JS parser bits from ParserPlayground to VisualEditor. They'll need retooling to hook up with the wikidom stuff.	2011-11-02 21:07:51 +00:00


384 commits