Gabriel Wicke
7e22020398
Convert syntactical break flags for templates from counters to the stack
variant to fix the precedence for {{!}} (break on these inside table content,
but not in template options within tables).
2012-03-14 16:30:59 +00:00
Gabriel Wicke
77a61dd687
Improve support for {{!}}, and don't produce a pre for indented tables.
2012-03-14 10:58:11 +00:00
Gabriel Wicke
835914b2de
Support {{=}}.
2012-03-14 09:07:01 +00:00
Gabriel Wicke
2195c31abf
Move link types to data-mw-rt, and support some more template tokenization
edge cases. For example, the PHP parser treats | foo | = bar | as | foo = bar |,
believe it or not ;)
2012-03-13 12:32:31 +00:00
Gabriel Wicke
4cd8b302ac
Improved template tokenization. The parser can now template-expand
[[:en:Barack Obama]] without exceeding 1.7GB of memory (which is the node
limit).
2012-03-12 17:31:45 +00:00
Gabriel Wicke
3c5fe2523c
Tolerate more newlines and spaces in templates, and support templates and
comments in urls.
2012-03-12 14:31:06 +00:00
Gabriel Wicke
ae4ab7a39c
Refactor syntactic stops into an object and add a stack variant for option
values.
2012-03-12 13:08:43 +00:00
Gabriel Wicke
ffc9383096
Temporary fix for template tokenization, especially needed for
[[Template:Cite core]].
2012-03-08 14:24:04 +00:00
Gabriel Wicke
f02ff95aa3
Token representation clean-up. Now all tokens are differentiated using
constructors instead of type attributes.
2012-03-07 20:06:54 +00:00
Gabriel Wicke
227103e12c
Accept empty table cell attribute sections, and consider percent-encoded %2525
valid. 270 tests passing.
2012-03-06 14:32:45 +00:00
Gabriel Wicke
2efcd3cd57
Reworked percent encoding handling for URIs to get closer to the 'url
construction' part of the HTML5 spec:
http://www.whatwg.org/specs/web-apps/current-work/multipage/urls.html#url-manipulation-and-creation
Removed a few whitelisted test cases that are now passing directly.
The encoding canonicalization could also be moved to the Sanitizer. Doing this
early in token stream processing however has the advantage of providing further
transformations uniform data to work with. We could also consider moving this
even further into the tokenizer.
2012-03-06 13:49:37 +00:00
Gabriel Wicke
a9ebc1d986
Support external images wrapped in a clickable link using bracketed external
link syntax. 265 tests passing.
2012-03-05 16:23:00 +00:00
Gabriel Wicke
7f7202e89c
A few improvements to external link and image handling. 264 tests passing.
2012-03-05 15:34:27 +00:00
Gabriel Wicke
7b0c807710
Change wikilink tokenization strategy to split on pipes. This makes it
possible to support template / template argument expansion in image options,
and causes little trouble for wikilinks. Non-image wikilinks with multiple
text pipes are quite rare in the dumps, and concatenating description tokens
with a plain '|' is quite easy. 261 parser tests passing.
2012-03-05 12:00:38 +00:00
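The pipe-splitting strategy described in the commit above can be illustrated roughly as follows. This is a minimal sketch on plain strings, whereas the real tokenizer works on token streams; `splitWikilink` and `linkDescription` are hypothetical names, not functions from the commit.

```javascript
// Hypothetical sketch: split wikilink content on pipes. For images the
// trailing parts would be options; for plain links they are description text.
function splitWikilink(content) {
    var parts = content.split('|');
    return {
        target: parts[0],
        rest: parts.slice(1)
    };
}

// For a non-image link, rebuild the description by joining the trailing
// parts with a plain '|'. With no description, fall back to the target.
function linkDescription(link) {
    return link.rest.length ? link.rest.join('|') : link.target;
}
```

Splitting first keeps each part available for separate expansion, and rejoining is a single `join` for the rare multi-pipe link.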
Gabriel Wicke
167dbdb0fa
Parse image options.
2012-03-02 13:36:37 +00:00
Gabriel Wicke
8b7ba9051b
Add productions for image option tokenization, and prepare to call those from
the LinkHandler token stream transformer.
2012-03-01 18:07:20 +00:00
Gabriel Wicke
4b9bd45b82
Start to move wikilink expansion to a separate async token transformer.
2012-02-29 13:56:29 +00:00
Gabriel Wicke
b8bb503199
Actually commit onlyinclude, as already announced in r112592.
2012-02-28 13:24:35 +00:00
Gabriel Wicke
491ad5ffef
Cleanup and commenting.
2012-02-22 13:13:18 +00:00
Gabriel Wicke
9b3313d923
Speed up flatten slightly by avoiding garbage for already flat arrays. Also,
use simple string concatenation instead of arrays as the strings tend to be
few and short.
2012-02-22 11:25:44 +00:00
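A minimal sketch of the idea behind the flatten speed-up above, assuming `flatten` receives an array that may contain one level of nested arrays. The function body is illustrative and not the code from the commit.

```javascript
// Hypothetical sketch: flatten one level of nesting, returning the input
// array itself (no new allocation) when it contains no nested arrays.
function flatten(items) {
    for (var i = 0; i < items.length; i++) {
        if (Array.isArray(items[i])) {
            // First nested array found: only now allocate a result array,
            // starting with a copy of the flat prefix.
            var out = items.slice(0, i);
            for (var j = i; j < items.length; j++) {
                if (Array.isArray(items[j])) {
                    out.push.apply(out, items[j]);
                } else {
                    out.push(items[j]);
                }
            }
            return out;
        }
    }
    // Already flat: hand back the same array and create no garbage.
    return items;
}
```

Callers must treat the result as possibly aliasing the input, which is the trade-off for skipping the copy.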
Gabriel Wicke
8dde1f77b4
Reduce debug print overhead, roughly a 10% speed-up on parserTests.
2012-02-21 18:49:43 +00:00
Gabriel Wicke
058c4213a4
Remove some more unused code and tidy up some more.
2012-02-21 18:26:40 +00:00
Gabriel Wicke
416126c041
Fix the bug in the inline_breaks replacement, and write another switch-based
version, which is slightly faster and shorter. Performance is improved by
about 5% for parserTests.
2012-02-21 17:57:30 +00:00
Gabriel Wicke
18a04f7581
Tidy up and comment the tokenizer a bit more. Start to move code into
mediawiki.tokenizer.js module, and pass a reference to parse(). Faster
inline_breaks production using a JS function which seems to be generally
correct, but still breaks five tests when enabled. Seems to be some weird
interaction with peg.js, possibly something to do with caching.
2012-02-21 17:21:42 +00:00
Gabriel Wicke
8718bd65bc
Add list of HTML5 and deprecated HTML3/4 elements in preparation for
end-of-potential-extension rules; Support indented tag-wrapped pre blocks.
2012-02-21 14:44:56 +00:00
Gabriel Wicke
059ff94bc4
Reject match for invalid urlencoded code points.
2012-02-16 13:57:56 +00:00
Gabriel Wicke
dc1d30fcb5
Tweaked template parameters a bit further, and made the self-closing tag
protection a bit less trigger-happy.
2012-02-15 15:56:11 +00:00
Gabriel Wicke
089413298c
Protect self-closing tags in generic attribute production.
2012-02-15 13:23:50 +00:00
Gabriel Wicke
5e94a238fc
Prepare for the support of tables (and later generally block-level elements)
in template parameters. 244 tests passing.
2012-02-15 11:51:29 +00:00
Gabriel Wicke
774a3189c8
Improve support for generic attribute names coming from
templates/templateargs.
2012-02-15 10:19:39 +00:00
Gabriel Wicke
1ce6f5a3c4
Improve support for single-line attributes with preprocessor support. 243
tests passing.
2012-02-14 21:25:52 +00:00
Gabriel Wicke
f02b3d91c6
Port urlencoded char support to preprocessor-supporting link target
production, and remove old link_target production.
2012-02-14 21:08:25 +00:00
Gabriel Wicke
001194b140
Replace console.log with console.warn in all debug statements
2012-02-14 20:56:14 +00:00
Gabriel Wicke
f42b379e52
Fix named wikilink options (image options really) in template arguments, and
speed up template parameter parsing by eliminating some backtracking. 238
tests passing (unchanged).
2012-02-14 15:45:18 +00:00
Gabriel Wicke
0b8d1b0387
* Add custom toString methods for tokens to aid debugging
* Convert all attributes into strings in Sanitizer
* Use strict comparison against empty string in tokenizer
* Add very simple sitename parserfunction
* 238 tests passing
2012-02-13 17:02:23 +00:00
Gabriel Wicke
025f9cddb3
Prefix all internal data- attributes with data-mw- and adjust the whitelist
and test output normalization accordingly. 235 tests passing.
2012-02-13 13:54:07 +00:00
Gabriel Wicke
b1617b1d71
Add some support for ideographic spaces in external links, support the
int: namespace alias and perform some normalization on the MediaWiki namespace
prefix.
2012-02-13 13:35:46 +00:00
Gabriel Wicke
a122e51eec
Move data-* annotations into separate object on tokens, that is then
serialized into a single data-mw-rt attribute if present. Update parserTests
to ignore this attribute for comparisons with expected parser output.
A few more tweaks and notes are thrown into this commit too. 233 tests are
passing now.
2012-02-11 16:43:25 +00:00
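The annotation handling described in the commit above could look roughly like this. It is a sketch under stated assumptions: the `dataAttribs` property name, the `[key, value]` attribute pairs, and the use of JSON as the serialization format are all illustrative guesses, not details confirmed by the commit message.

```javascript
// Hypothetical sketch: round-trip annotations are kept in a separate object
// on the token (assumed here to be token.dataAttribs) and serialized into a
// single data-mw-rt attribute only if any annotations are present.
function serializeAttribs(token) {
    var attribs = token.attribs.slice(); // do not mutate the token
    var data = token.dataAttribs;
    if (data && Object.keys(data).length > 0) {
        // JSON is an assumed encoding for this illustration.
        attribs.push(['data-mw-rt', JSON.stringify(data)]);
    }
    return attribs;
}
```

With all annotations in one attribute, a test harness only has to ignore a single attribute name when comparing against expected parser output.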
Gabriel Wicke
aff30be131
Some comments and reshuffling in the grammar, and a typo in the
AttributeExpander.
2012-02-09 22:27:45 +00:00
Gabriel Wicke
6e33255503
Improve support for preprocessor functionality in attributes; Support
multi-line xmlish tags with preprocessor stuff in attributes.
2012-02-09 16:36:29 +00:00
Gabriel Wicke
16ded7d955
Fix a bug in wikilink with trail tokenization.
2012-02-09 14:06:35 +00:00
Gabriel Wicke
3f7c1499cd
Enable support for general preprocessor functionality in attribute keys and
values. This includes comments, templates and template arguments.
This also replaces the specialized expansion logic in the TemplateHandler. The
removal of link validation causes one more parser test to fail for now. External
link target validation will need to be implemented in the token stream handler
for links. This is noted as TODO in
https://www.mediawiki.org/wiki/Future/Parser_development#Token_stream_transforms .
2012-02-08 15:10:30 +00:00
Gabriel Wicke
1f6db903e9
Pluck a few low-hanging fruit in external link tokenization, and add a simple
localurl parser function implementation. 230 parser tests now passing.
2012-02-07 10:28:23 +00:00
Gabriel Wicke
cf8b7bf45d
External links don't nest.
2012-02-07 09:38:28 +00:00
Gabriel Wicke
53bf4f2bd0
Temporarily disable the sanitizer and start to support preprocessor
functionality (comments, templates, template arguments) in arbitrary
attributes. The grammar for this is still quite rough, will need to
consolidate that area.
2012-02-06 19:15:44 +00:00
Gabriel Wicke
0bea9fdfbb
Fix nowiki tokenization regression introduced in r110495.
2012-02-03 13:10:04 +00:00
Gabriel Wicke
8c75aa1a7a
Remove type attribute for tag tokens.
2012-02-01 18:37:48 +00:00
Gabriel Wicke
a5cc10a06b
Change token format to plain strings for text tokens, and specific objects for
other tokens. This is only the first half of the conversion. The next step is
to drop the type attribute on most tokens and match on the constructor in the
token transform machinery.
2012-02-01 16:30:43 +00:00
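The token format described in the commit above (plain strings for text, specific objects for everything else, matched on the constructor) can be sketched as follows. The constructor names `TagTk` and `EndTagTk` and the `describeToken` helper are hypothetical and chosen only for illustration.

```javascript
// Hypothetical token constructors; names and fields are illustrative only.
function TagTk(name, attribs) {
    this.name = name;
    this.attribs = attribs || [];
}
function EndTagTk(name) {
    this.name = name;
}

// Dispatch on the constructor rather than on a `type` attribute.
// Text tokens are plain strings and are handled first.
function describeToken(token) {
    if (typeof token === 'string') {
        return 'text:' + token;
    }
    switch (token.constructor) {
        case TagTk:
            return 'open:' + token.name;
        case EndTagTk:
            return 'close:' + token.name;
        default:
            return 'unknown';
    }
}
```

For example, `[new TagTk('b'), 'bold', new EndTagTk('b')].map(describeToken)` yields one description per token without any token carrying a `type` field.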
Gabriel Wicke
14a8a13678
A few more debug helpers including a --trace mode for light debugging. Some
improvements to parser functions on the way to support the cite extensions.
Preparation for generic template and template arg in attribute support. 222
parser tests now passing.
2012-01-31 16:50:16 +00:00
Gabriel Wicke
7cd94df47d
A few minor tweaks to reduce memory usage
2012-01-27 13:32:44 +00:00