209 commits

Author SHA1 Message Date
Gabriel Wicke 8dde1f77b4 Reduce debug print overhead, roughly a 10% speed-up on parserTests. 2012-02-21 18:49:43 +00:00
Gabriel Wicke 058c4213a4 Remove some more unused code and tidy up some more. 2012-02-21 18:26:40 +00:00
Gabriel Wicke 416126c041 Fix the bug in the inline_breaks replacement, and write another switch-based
version, which is slightly faster and shorter. Performance is improved by
about 5% for parserTests.
2012-02-21 17:57:30 +00:00
Gabriel Wicke 18a04f7581 Tidy up and comment the tokenizer a bit more. Start to move code into
mediawiki.tokenizer.js module, and pass a reference to parse(). Faster
inline_breaks production using a JS function which seems to be generally
correct, but still breaks five tests when enabled. Seems to be some weird
interaction with peg.js, possibly something to do with caching.
2012-02-21 17:21:42 +00:00
Gabriel Wicke 8718bd65bc Add list of HTML5 and deprecated HTML3/4 elements in preparation for
end-of-potential-extension rules; Support indented tag-wrapped pre blocks.
2012-02-21 14:44:56 +00:00
Gabriel Wicke ffec77273a Comment and minor code tweaks. 2012-02-21 11:24:20 +00:00
au ea15bffb27 Revert "* Always sort attributes (+1 test pass)."
This reverts commit 45ca281da8eef8030bdd1986418cb914fc9a717c.
2012-02-20 22:26:12 +00:00
Gabriel Wicke 5806705733 Push transformer setup a bit further into the attribute pipeline. 2012-02-20 12:56:00 +00:00
Gabriel Wicke 8eddb4ec6b Add some comments to the Sanitizer 2012-02-20 11:14:53 +00:00
Gabriel Wicke 71e95bd54b Set up token stream transformers from a map of phases per input content type.
Not yet applied to attribute pipeline creation. 249 tests passing.
2012-02-20 11:07:21 +00:00
au 9c55f5e8b7 * Always sort attributes (+1 test pass).
The performance impact for .sort is quite small (12.079s => 12.158s)
  and Sanitizer is probably one of the more accessible places to do this.
2012-02-18 21:01:07 +00:00
au aa589d989b * Rudimentary CSS validation; +4 tests pass. (Bug 2304, 3244). 2012-02-18 20:16:23 +00:00
Gabriel Wicke 4d80b8daa8 Detail comments about next steps and divide parser functions into those that
need more information from the wiki and readily implementable items.
2012-02-17 10:23:14 +00:00
Gabriel Wicke 059ff94bc4 Reject match for invalid urlencoded code points. 2012-02-16 13:57:56 +00:00
Gabriel Wicke dc1d30fcb5 Tweaked template parameters a bit further, and made the self-closing tag
protection a bit less trigger-happy.
2012-02-15 15:56:11 +00:00
Gabriel Wicke 089413298c Protect self-closing tags in generic attribute production. 2012-02-15 13:23:50 +00:00
Gabriel Wicke 5e94a238fc Prepare for the support of tables (and later generally block-level elements)
in template parameters. 244 tests passing.
2012-02-15 11:51:29 +00:00
Gabriel Wicke 774a3189c8 Improve support for generic attribute names coming from
templates/templateargs.
2012-02-15 10:19:39 +00:00
Gabriel Wicke 1ce6f5a3c4 Improve support for single-line attributes with preprocessor support. 243
tests passing.
2012-02-14 21:25:52 +00:00
Gabriel Wicke f02b3d91c6 Port urlencoded char support to preprocessor-supporting link target
production, and remove old link_target production.
2012-02-14 21:08:25 +00:00
Gabriel Wicke 001194b140 Replace console.log with console.warn in all debug statements 2012-02-14 20:56:14 +00:00
Gabriel Wicke f42b379e52 Fix named wikilink options (image options really) in template arguments, and
speed up template parameter parsing by eliminating some backtracking. 238
tests passing (unchanged).
2012-02-14 15:45:18 +00:00
Gabriel Wicke 64f63b3714 request is automatically installed by jsdom. Follow-up to r111459. Thanks
Hashar!
2012-02-14 14:15:50 +00:00
Gabriel Wicke 466e8e54ad Tweak comment about request module 2012-02-14 14:01:13 +00:00
Gabriel Wicke 0b8d1b0387 * Add custom toString methods for tokens to aid debugging
* Convert all attributes into strings in Sanitizer
* Use strict comparison against empty string in tokenizer
* Add very simple sitename parserfunction
* 138 tests passing
2012-02-13 17:02:23 +00:00
Gabriel Wicke 9945175416 Reformat Date.replaceChars 2012-02-13 14:23:48 +00:00
Gabriel Wicke 0b40741e1c Strip trailing newlines from included templates 2012-02-13 14:17:03 +00:00
Gabriel Wicke 025f9cddb3 Prefix all internal data- attributes with data-mw- and adjust the whitelist
and test output normalization accordingly. 235 tests passing.
2012-02-13 13:54:07 +00:00
Gabriel Wicke b1617b1d71 Add some support for ideographic spaces in external links, support the
int: namespace alias and perform some normalization on the MediaWiki namespace
prefix.
2012-02-13 13:35:46 +00:00
Gabriel Wicke 55ddb4fd66 Remove WikiDom default serialization and --html argument from parse.js
wrapper. HTML is now the only supported format. The DOMConverter is now no
longer used. Roan, feel free to remove / butcher it for direct HTML to linear
model conversion.
2012-02-11 17:59:17 +00:00
Gabriel Wicke a122e51eec Move data-* annotations into separate object on tokens, that is then
serialized into a single data-mw-rt attribute if present. Update parserTests
to ignore this attribute for comparisons with expected parser output.

A few more tweaks and notes are thrown into this commit too. 233 tests are
passing now.
2012-02-11 16:43:25 +00:00
Gabriel Wicke aff30be131 Some comments and reshuffling in the grammar, and a typo in the
AttributeExpander.
2012-02-09 22:27:45 +00:00
Gabriel Wicke 6e33255503 Improve support for preprocessor functionality in attributes; Support
multi-line xmlish tags with preprocessor stuff in attributes.
2012-02-09 16:36:29 +00:00
Gabriel Wicke 16ded7d955 Fix a bug in wikilink with trail tokenization. 2012-02-09 14:06:35 +00:00
Gabriel Wicke 6983481561 Move attribute expansion back to separate handler, as this makes it easier to
only expand used branches selected by parser functions. Template (and
-argument) expansion is simply registered before general expansion.

Additionally, a few more simple time-based magic words are added in
ParserFunctions.
2012-02-09 13:44:20 +00:00
Gabriel Wicke 3f7c1499cd Enable support for general preprocessor functionality in attribute keys and
values. This includes comments, templates and template arguments.

This also replaces the specialized expansion logic in the TemplateHandler. The
removal of link validation lets one more parser test fail for now. External
link target validation will need to be implemented in the token stream handler
for links. This is noted as TODO in
https://www.mediawiki.org/wiki/Future/Parser_development#Token_stream_transforms.
2012-02-08 15:10:30 +00:00
Gabriel Wicke 157c495a9e Normalize the title in localurl. 232 tests passing. 2012-02-07 12:26:00 +00:00
Gabriel Wicke b4892102a4 Clean up transform callback interface 2012-02-07 11:53:29 +00:00
Gabriel Wicke 1f6db903e9 Pluck a few low-hanging fruit in external link tokenization, and add a simple
localurl parser function implementation. 230 parser tests now passing.
2012-02-07 10:28:23 +00:00
Gabriel Wicke cf8b7bf45d External links don't nest. 2012-02-07 09:38:28 +00:00
Gabriel Wicke 53bf4f2bd0 Temporarily disable the sanitizer and start to support preprocessor
functionality (comments, templates, template arguments) in arbitrary
attributes. The grammar for this is still quite rough, will need to
consolidate that area.
2012-02-06 19:15:44 +00:00
Gabriel Wicke c26243989e Improve toJSON handlers to include all properties 2012-02-06 19:12:29 +00:00
Gabriel Wicke 0bea9fdfbb Fix nowiki tokenization regression introduced in r110495 2012-02-03 13:10:04 +00:00
Gabriel Wicke 26f2026cff Add custom JSON serializers for tokens that include a type attribute 2012-02-03 13:09:01 +00:00
Gabriel Wicke 8c75aa1a7a Remove type attribute for tag tokens. 2012-02-01 18:37:48 +00:00
Gabriel Wicke 689f697a93 Push token format conversion a bit further along, and add defines that were
missing in last commit.
2012-02-01 17:03:08 +00:00
Gabriel Wicke a5cc10a06b Change token format to plain strings for text tokens, and specific objects for
other tokens. This is only the first half of the conversion. The next step is
to drop the type attribute on most tokens and match on the constructor in the
token transform machinery.
2012-02-01 16:30:43 +00:00
Gabriel Wicke dd3707ded5 Remove some modules normally bundled with node.js from dependencies, and
remove some older ones that are only used in currently-dead code.
2012-02-01 10:32:33 +00:00
Gabriel Wicke e65c6502c0 Add source for #time implementation in comment 2012-02-01 10:14:01 +00:00
Gabriel Wicke 14a8a13678 A few more debug helpers including a --trace mode for light debugging. Some
improvements to parser functions on the way to support the cite extensions.
Preparation for generic template and template arg in attribute support. 222
parser tests now passing.
2012-01-31 16:50:16 +00:00