Gabriel Wicke
e72dee76e4
Follow-up to r106208 and r106207. Both good catches, thanks Yair! As this code
...
is in its early stages and nowhere near deployment, please Be Bold and just
commit things like this directly! IMHO it makes more sense to fully review this
once it settles down a bit.
2011-12-15 10:13:50 +00:00
Gabriel Wicke
3585bd9c8e
Accept row-only tables. The parser now eats [[en:Barack Obama]] as-is. Hooray!
2011-12-15 00:39:28 +00:00
Gabriel Wicke
6df94a34a1
Less lust for urls
2011-12-15 00:26:22 +00:00
Gabriel Wicke
ce2ee067f7
Minor tweak to wiki link production
2011-12-15 00:12:58 +00:00
Gabriel Wicke
377226a120
Comment out a stray console.log
2011-12-14 23:44:58 +00:00
Gabriel Wicke
574abd9774
A collection of small bug fixes to the grammar, Cite, the Token format
...
converter and the HTML DOM -> WikiDom converter. The tokenizer now digests all
parserTests.
2011-12-14 23:38:46 +00:00
Gabriel Wicke
dc77d73ad5
Add ability to pass through JSON data to WikiDom in data-json-* attributes,
...
and fix parser to actually parse the Barack Obama article except for one table
with nested templates at the start-of-line.
2011-12-14 17:25:09 +00:00
Gabriel Wicke
f6e4267fca
Handle a few more element types, and reset offset for each leaf node. Not sure
...
if the latter is correct, as the documentation at
https://www.mediawiki.org/wiki/Visual_editor/Software_design#Data_Structures
and the actual sample WikiDom in the editor sandbox seem to disagree on this
point.
2011-12-14 16:22:27 +00:00
Gabriel Wicke
6676a47008
Add implicit level attribute to WikiDom headings.
2011-12-14 15:55:58 +00:00
Gabriel Wicke
3018ca690b
Improve WikiDom conversion: Handle text and annotations in branch nodes as
...
paragraphs and treat list items as branches.
2011-12-14 15:40:40 +00:00
Gabriel Wicke
a09aa4d599
Add rough HTML DOM to WikiDom conversion. You can see serialized WikiDom of
...
parser tests using 'node parserTests.js --wikidom'.
2011-12-14 15:15:41 +00:00
Gabriel Wicke
5f80d30428
Clean up access to document and body after building the tree.
2011-12-14 09:40:49 +00:00
Gabriel Wicke
30749b8d8d
Update comments a bit and add a note on things to improve in API.
2011-12-14 09:33:25 +00:00
Gabriel Wicke
55ff272847
Comment TokenTransformDispatcher.
2011-12-13 20:13:09 +00:00
Gabriel Wicke
44deefe303
Minor tweak to comment.
2011-12-13 18:55:44 +00:00
Gabriel Wicke
c61b32eaa7
Clean up and comment the Cite extension a bit.
2011-12-13 18:45:09 +00:00
Gabriel Wicke
feee9ded9f
Convert the Cite extension to a token stream transformer.
...
This required a few further additions to the TokenTransformDispatcher. In
particular, there is now an 'any' token match whose callbacks are executed
before more specific callbacks. This is used by the Cite extension to eat all
tokens between ref and /ref tags. This need is very common, so should be
broken out to an intermediate layer in the future.
In general, the requirements for the TokenTransformDispatcher API are now
clearer, and the API should likely be cleaned up / simplified.
2011-12-13 14:48:47 +00:00
Gabriel Wicke
8e55e79b67
Rename TokenTransformer to TokenTransformDispatcher.
2011-12-13 11:45:12 +00:00
Gabriel Wicke
8231511217
Replace custom object copy with $.extend.
2011-12-13 11:18:15 +00:00
Gabriel Wicke
39aedd4378
Improve comments in QuoteTransformer.
2011-12-13 10:25:18 +00:00
Gabriel Wicke
0ad08b9ae3
Add a README file pointing to the wiki documentation.
2011-12-12 22:30:11 +00:00
Gabriel Wicke
a8fa9433c4
Convert quote handling (italic/bold) to a core extension operating on the
...
token stream. This is the first token transformation exercising the
TokenTransformer class as its dispatcher. Template expansions, wiki link
formatting, tag sanitation and extensions should be able to use the same
dispatcher by registering for specific token types.
The parser performance is very slightly improved as the token stream is only
traversed once.
2011-12-12 20:53:14 +00:00
Gabriel Wicke
752b0990b2
Refactor parserTests somewhat into a class-like structure, and wire up the
...
TokenTransformer.
2011-12-12 14:03:54 +00:00
Gabriel Wicke
d616f07a79
Don't re-build the wiki tokenizer for each test. This speeds up the full
...
parserTests.js run slightly from 7-8 minutes to about 14 seconds ;)
A few very minor tweaks to the grammar are also thrown into this commit.
2011-12-12 10:47:42 +00:00
Gabriel Wicke
89c5e0cafb
Follow-up to r105859: Add missing new.
2011-12-12 10:09:13 +00:00
Gabriel Wicke
9ebce5839a
Further development of the TokenTransformer framework.
2011-12-12 10:01:47 +00:00
Gabriel Wicke
80d5067813
Add a TokenTransformer dispatcher class. This class provides subscriptions by
...
token type, and supports asynchronous token expansion (for example for async
template expansion). This code is not yet tested or used. The interface for
token insertion from transformation functions will be expanded as needed.
2011-12-08 14:37:31 +00:00
Gabriel Wicke
c2b69e2486
Clean up newline handling. Emit a NEWLINE token for each
...
non-{comment,pre,nowiki} newline.
2011-12-08 14:34:18 +00:00
Gabriel Wicke
abc2254110
A bit of comment clean-up and wrapping of tree building into try/catch block
...
to actually count failures.
2011-12-08 11:40:59 +00:00
Gabriel Wicke
92fdf99384
Further renaming, this time from pegParser to pegTokenizer.
2011-12-08 10:59:44 +00:00
Gabriel Wicke
76bc477038
Rename html5TokenEmitter to HTML5TreeBuilder, and the contained Tokenizer to
...
TreeBuilder.
2011-12-08 10:37:18 +00:00
Gabriel Wicke
19a1f0850f
Tidy up the grammar a bit.
2011-12-08 10:33:23 +00:00
Gabriel Wicke
3742d70abd
Add some documentation to syntax flags
2011-12-07 15:54:55 +00:00
Gabriel Wicke
545ca1809f
Convert template argument production to generic inline with syntactic stop.
...
Fix a bug in generic inline production. Nested multi-line templates are now
parsed okayish.
2011-12-07 15:39:39 +00:00
Gabriel Wicke
902db40a1f
Process template arguments into an object.
2011-12-07 14:46:07 +00:00
Gabriel Wicke
51a40e4dbc
Follow-up to r105423: Fix off-by-one bug.
2011-12-07 11:56:12 +00:00
Gabriel Wicke
49c286a67b
Fix a bug in doQuotes (bitten by surprising JS sort() behavior), and improve
...
tag-only-line handling. 180 parser tests now passing.
2011-12-07 11:51:24 +00:00
Gabriel Wicke
418a5067c6
Parse attributes in tables using generic attribute production. Some table
...
tests still do not pass as the MW table output reorders attributes ;)
2011-12-06 22:03:21 +00:00
Gabriel Wicke
3d06707152
Slightly speed up inline tag productions using guards and grouping; Fix list
...
processing function.
2011-12-06 18:35:05 +00:00
Gabriel Wicke
ea8f226fd5
Remove ext and references special cases, now subsumed by generic XML tag
...
productions. Document issue around special tokenizer mode for other extension
tags.
2011-12-06 16:44:27 +00:00
Gabriel Wicke
e7de089d5b
Decode urls and html entities, 163 tests now passing.
2011-12-06 13:17:14 +00:00
Gabriel Wicke
a72a9e55a3
Don't match internal links with url as target. 161 passing.
2011-12-06 12:26:57 +00:00
Gabriel Wicke
2b5cc67bf5
Further tweaks to headings. 157 tests now passing.
2011-12-06 11:59:41 +00:00
Gabriel Wicke
f4d123886e
Convert heading rules to single rule that figures out the level. This saves a
...
lot of backtracking and inline break complexity.
2011-12-06 11:06:05 +00:00
Gabriel Wicke
33e19f7275
Recognize block-level elements independent of case; Ignore toc and section
...
edit links in tests. 148 parser tests passing.
2011-12-05 20:03:24 +00:00
Gabriel Wicke
9ed9cb31bd
Fix template argument handling somewhat.
2011-12-05 17:58:11 +00:00
Gabriel Wicke
1760210d13
Fixes to tables, headings and misc smaller stuff. Tracked down an issue caused
...
by improperly caching of production results, which interfered with the
flag-dependent inline_break production.
2011-12-04 19:23:24 +00:00
Gabriel Wicke
63c728924b
Use pegjs from npm
2011-12-01 15:23:23 +00:00
Antoine Musso
5ab379f479
fix vim modeline
2011-12-01 15:19:37 +00:00
Gabriel Wicke
0ce1e9fcf3
Add a quick html entity decoding hack, and document need for general decoder.
2011-12-01 14:39:55 +00:00