Gabriel Wicke
5248fd31e8
Magic links and behavior switch tokenization by Ori Livneh
Commit first patch by Ori, lets 288 parser tests pass. Yay!
Change-Id: Iac8c3d1ad1984900350b20f7e725c40618a1e8ba
2012-04-02 17:31:34 +02:00
Gabriel Wicke
5ef2074251
Enable support for block-level wiki constructs in template arguments. This
gets a bit closer to supporting table fragments passed through template
arguments. Next, we'll need a way to indicate start-of-line position to
enable sol block-levels in template parameters.
Example:
{|
{{#if: true|{{!}}Table cell|}}
|}
2012-03-15 11:43:49 +00:00
Gabriel Wicke
7e22020398
Convert syntactical break flags for templates from counters to the stack
variant to fix the precedence for {{!}} (break on these inside table content,
but not in template options within tables).
2012-03-14 16:30:59 +00:00
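To illustrate the counter-versus-stack distinction described in commit 7e22020398, here is a minimal sketch. The `BreakFlags` class and its method names are hypothetical and not taken from the tokenizer; the only assumption is that the innermost context should decide whether {{!}} ends the current content.

```javascript
// Hypothetical sketch: names are illustrative, not the tokenizer's API.
class BreakFlags {
    constructor() {
        this.stacks = {};
    }

    // Enter a context that sets the named flag to the given value.
    push(name, value) {
        if (!this.stacks[name]) {
            this.stacks[name] = [];
        }
        this.stacks[name].push(value);
    }

    // Leave the innermost context for the named flag.
    pop(name) {
        const stack = this.stacks[name];
        return stack ? stack.pop() : undefined;
    }

    // Only the innermost context decides. A plain counter could not
    // express "switched off again inside an enabled region".
    isSet(name) {
        const stack = this.stacks[name];
        return Boolean(stack && stack.length && stack[stack.length - 1]);
    }
}

const flags = new BreakFlags();
flags.push('table', true);    // inside table content: break on {{!}}
const inTable = flags.isSet('table');
flags.push('table', false);   // inside template options within the table
const inOptions = flags.isSet('table');
flags.pop('table');           // back in table content
const backInTable = flags.isSet('table');
```

With a counter, the nested template options would still see a non-zero value and break on {{!}}; with a stack, the inner `false` takes precedence until it is popped.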
Gabriel Wicke
77a61dd687
Improve support for {{!}}, and don't produce a pre for indented tables.
2012-03-14 10:58:11 +00:00
Gabriel Wicke
835914b2de
Support {{=}}.
2012-03-14 09:07:01 +00:00
Gabriel Wicke
2195c31abf
Move link types to data-mw-rt, and support some more template tokenization
edge cases. For example, the PHP parser treats | foo | = bar | as | foo = bar |,
believe it or not ;)
2012-03-13 12:32:31 +00:00
Gabriel Wicke
4cd8b302ac
Improved template tokenization. The parser can now template-expand
[[:en:Barack Obama]] without exceeding 1.7GB of memory (which is the node
limit).
2012-03-12 17:31:45 +00:00
Gabriel Wicke
3c5fe2523c
Tolerate more newlines and spaces in templates, and support templates and
comments in urls.
2012-03-12 14:31:06 +00:00
Gabriel Wicke
ae4ab7a39c
Refactor syntactic stops into an object and add a stack variant for option
values.
2012-03-12 13:08:43 +00:00
Roan Kattouw
29f416937e
Fix some usages of splice.apply in the data model to use
ve.batchedSplice(). Added FIXME comments for occurrences outside of DM
2012-03-10 00:31:28 +00:00
Gabriel Wicke
ffc9383096
Temporary fix for template tokenization, especially needed for
[[Template:Cite core]].
2012-03-08 14:24:04 +00:00
Gabriel Wicke
39017dd769
Percent-encode spaces in URLs, so that they are recognized as valid URLs later
on.
2012-03-08 11:53:15 +00:00
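A minimal sketch of the space encoding mentioned in commit 39017dd769. The function name `encodeSpaces` is hypothetical, and the sketch assumes that only literal space characters need handling at this step.

```javascript
// Hypothetical helper: replaces each literal space with %20 and leaves
// every other character untouched.
function encodeSpaces(url) {
    return url.replace(/ /g, '%20');
}
```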
Gabriel Wicke
7518db8197
A few fixes to parser functions and template expansion. Trim whitespace off
template arguments, let the last duplicate key win and fake pagenamee slightly
better.
2012-03-08 11:44:37 +00:00
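The argument handling described in commit 7518db8197 can be sketched as follows. `buildTemplateArgs` and its input shape (an array of `[key, value]` string pairs) are assumptions made for illustration; the real expander works on tokens rather than plain strings.

```javascript
// Hypothetical sketch: builds an argument map from [key, value] pairs.
function buildTemplateArgs(pairs) {
    const args = {};
    for (const [key, value] of pairs) {
        // Trim whitespace off both key and value. A later assignment
        // to the same key overwrites the earlier one, so the last
        // duplicate wins.
        args[String(key).trim()] = String(value).trim();
    }
    return args;
}
```

For example, the pairs `[' a ', ' 1 ']` and `['a', '2']` collapse into a single key `a` with the value `'2'`.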
Gabriel Wicke
51023feaa4
Improvements for image option handling.
2012-03-08 10:03:22 +00:00
Gabriel Wicke
b1e131d568
A bit more documentation and naming cleanup in the tokenizer wrapper.
2012-03-08 09:00:45 +00:00
Gabriel Wicke
f02ff95aa3
Token representation clean-up. Now all tokens are differentiated using
constructors instead of type attributes.
2012-03-07 20:06:54 +00:00
Gabriel Wicke
f157093a41
Delegate responsibility for resetting the token rank to transforms, if full
re-processing in a phase is wanted. By default, after a token type change or
the return of multiple tokens only the remaining transforms with higher ranks
are applied.
Updated a few comments as well.
2012-03-07 19:29:53 +00:00
Gabriel Wicke
1f8c43b9e2
A few minor documentation updates.
2012-03-07 18:42:26 +00:00
Gabriel Wicke
5f618103d7
Set allTokensProcessed flag for async callbacks from the template expander.
2012-03-07 17:36:33 +00:00
Gabriel Wicke
e5a1116817
Start re-transformation as soon as possible in TokenAccumulator._returnTokens
to maximize IO concurrency. Signal that all tokens are fully transformed to
callbacks called from TokenAccumulator._returnTokens. The result should be a
single re-transformation when entering the callback chain, and only if the
transform does not signal that it took care of full transformation itself.
Template expansion would set this flag, as the nested transform pipeline
processes all tokens to the end of phase async12.
2012-03-07 16:29:06 +00:00
Gabriel Wicke
656524dbbc
Fixes for multi-transformer expansion in AsyncTransformManager. Added argument
to callback which lets transforms indicate if their returned tokens are fully
processed for their phase. If not, the callback re-processes them so that any
remaining transforms are applied.
2012-03-07 15:39:18 +00:00
Gabriel Wicke
af03eb4f29
Improve generic attribute expansion before external link processing, and make
wgUploadPath configurable. Also change the hard-coded fall-back image sizes to
sensible defaults. This breaks three parser tests until image size retrieval
from the wiki is implemented.
2012-03-06 18:02:35 +00:00
Gabriel Wicke
227103e12c
Accept empty table cell attribute sections, and consider percent-encoded %2525
valid. 270 tests passing.
2012-03-06 14:32:45 +00:00
Gabriel Wicke
2efcd3cd57
Reworked percent encoding handling for URIs to get closer to the 'url
construction' part of the HTML5 spec:
http://www.whatwg.org/specs/web-apps/current-work/multipage/urls.html#url-manipulation-and-creation
Removed a few whitelisted test cases that are now passing directly.
The encoding canonicalization could also be moved to the Sanitizer. Doing this
early in token stream processing, however, has the advantage of giving later
transformations uniform data to work with. We could even consider moving it
further up, into the tokenizer.
2012-03-06 13:49:37 +00:00
Gabriel Wicke
19fe9726a2
Fix invalid external link representation. 268 tests passing.
2012-03-05 18:06:29 +00:00
Gabriel Wicke
a9ebc1d986
Support external images wrapped in a clickable link using bracketed external
link syntax. 265 tests passing.
2012-03-05 16:23:00 +00:00
Gabriel Wicke
7f7202e89c
A few improvements to external link and image handling. 264 tests passing.
2012-03-05 15:34:27 +00:00
Gabriel Wicke
7b0c807710
Change wikilink tokenization strategy to split on pipes. This makes it
possible to support template / template argument expansion in image options,
and causes little trouble for wikilinks. Non-image wikilinks with multiple
text pipes are quite rare in the dumps, and concatenating description tokens
with a plain '|' is quite easy. 261 parser tests passing.
2012-03-05 12:00:38 +00:00
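A rough sketch of the pipe-splitting strategy from commit 7b0c807710. It operates on plain strings for brevity, whereas the tokenizer works on tokens; `splitWikiLink` and the shape of its return value are hypothetical.

```javascript
// Hypothetical sketch: splits the content between [[ and ]] on pipes.
function splitWikiLink(content) {
    const parts = content.split('|');
    const target = parts[0];
    const rest = parts.slice(1);
    return {
        target: target,
        // An image link would treat each remaining part as an option.
        parts: rest,
        // A non-image link re-joins the parts with a plain '|' to
        // recover its description text.
        description: rest.length > 0 ? rest.join('|') : null
    };
}
```

Keeping the parts separate is what allows each image option to be expanded on its own, while ordinary links pay only the cost of a join.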
Gabriel Wicke
3e6f1b6bea
Use some options primitively.
2012-03-02 14:19:33 +00:00
Gabriel Wicke
167dbdb0fa
Parse image options.
2012-03-02 13:36:37 +00:00
Gabriel Wicke
8b7ba9051b
Add productions for image option tokenization, and prepare to call those from
the LinkHandler token stream transformer.
2012-03-01 18:07:20 +00:00
Gabriel Wicke
b1a7119a46
Hack up some rudimentary image rendering. Using jshashes for the md5, and
a few hard-coded image sizes ;) 262 tests passing.
2012-03-01 13:51:53 +00:00
Gabriel Wicke
d4faf9eaf4
More work on wiki link rendering and general wiki title / namespace
functionality.
2012-03-01 12:47:05 +00:00
Gabriel Wicke
4b9bd45b82
Start to move wikilink expansion to a separate async token transformer.
2012-02-29 13:56:29 +00:00
Gabriel Wicke
b8bb503199
Actually commit onlyinclude, as already announced in r112592.
2012-02-28 13:24:35 +00:00
Gabriel Wicke
3227903d48
Follow-up to r112116, accidentally committed from subdirectory.
2012-02-22 16:41:01 +00:00
Gabriel Wicke
3568dfee14
Add some support for functionhooks in test parser and parserTests.js, and
tweak a few parser functions.
2012-02-22 15:59:11 +00:00
Gabriel Wicke
d7da324272
Basic fall-through support for #switch parser function
2012-02-22 14:57:50 +00:00
Gabriel Wicke
491ad5ffef
Cleanup and commenting.
2012-02-22 13:13:18 +00:00
Gabriel Wicke
9b3313d923
Speed up flatten slightly by avoiding garbage for already flat arrays. Also,
use simple string concatenation instead of arrays as the strings tend to be
few and short.
2012-02-22 11:25:44 +00:00
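The fast path described in commit 9b3313d923 might look roughly like this. The sketch is an assumption about the general technique (return the input untouched when nothing is nested), not the repository's actual `flatten` implementation.

```javascript
// Hypothetical sketch of a flatten that allocates nothing for
// arrays that are already flat.
function flatten(arr) {
    // Scan for the first nested array.
    let i = 0;
    while (i < arr.length && !Array.isArray(arr[i])) {
        i++;
    }
    if (i === arr.length) {
        // Already flat: return the same array, creating no garbage.
        return arr;
    }
    // Copy the flat prefix once, then flatten the remainder.
    const out = arr.slice(0, i);
    for (; i < arr.length; i++) {
        const item = arr[i];
        if (Array.isArray(item)) {
            const inner = flatten(item);
            for (let j = 0; j < inner.length; j++) {
                out.push(inner[j]);
            }
        } else {
            out.push(item);
        }
    }
    return out;
}
```

Callers must not assume the result is a fresh copy, since a flat input is returned as-is.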
Gabriel Wicke
8dde1f77b4
Reduce debug print overhead, roughly a 10% speed-up on parserTests.
2012-02-21 18:49:43 +00:00
Gabriel Wicke
058c4213a4
Remove some more unused code and tidy up some more.
2012-02-21 18:26:40 +00:00
Gabriel Wicke
416126c041
Fix the bug in the inline_breaks replacement, and write another switch-based
version, which is slightly faster and shorter. Performance is improved by
about 5% for parserTests.
2012-02-21 17:57:30 +00:00
Gabriel Wicke
18a04f7581
Tidy up and comment the tokenizer a bit more. Start to move code into
mediawiki.tokenizer.js module, and pass a reference to parse(). Faster
inline_breaks production using a JS function which seems to be generally
correct, but still breaks five tests when enabled. Seems to be some weird
interaction with peg.js, possibly something to do with caching.
2012-02-21 17:21:42 +00:00
Gabriel Wicke
8718bd65bc
Add list of HTML5 and deprecated HTML3/4 elements in preparation for
end-of-potential-extension rules; Support indented tag-wrapped pre blocks.
2012-02-21 14:44:56 +00:00
Gabriel Wicke
ffec77273a
Comment and minor code tweaks.
2012-02-21 11:24:20 +00:00
au
ea15bffb27
Revert "* Always sort attributes (+1 test pass)."
This reverts commit 45ca281da8eef8030bdd1986418cb914fc9a717c.
2012-02-20 22:26:12 +00:00
Gabriel Wicke
5806705733
Push transformer setup a bit further into the attribute pipeline.
2012-02-20 12:56:00 +00:00
Gabriel Wicke
8eddb4ec6b
Add some comments to the Sanitizer
2012-02-20 11:14:53 +00:00
Gabriel Wicke
71e95bd54b
Set up token stream transformers from a map of phases per input content type.
Not yet applied to attribute pipeline creation. 249 tests passing.
2012-02-20 11:07:21 +00:00