Commit graph

11 commits

Author SHA1 Message Date
Gabriel Wicke f662690d02 Shorten data-mw-rt to data-mw and clean up whitelist
Instead of a proliferation of data-mw-* attributes, it should be easier to
stash all private / non-semantic round-trip information in a JSON object
stored in data-mw.

Change-Id: Id200a6a8789fa152f29ea530e5a24b6ee7b4b285
2012-04-02 18:12:49 +02:00
au ea15bffb27 Revert "* Always sort attributes (+1 test pass)."
This reverts commit 45ca281da8eef8030bdd1986418cb914fc9a717c.
2012-02-20 22:26:12 +00:00
Gabriel Wicke 8eddb4ec6b Add some comments to the Sanitizer 2012-02-20 11:14:53 +00:00
au 9c55f5e8b7 * Always sort attributes (+1 test pass).
The performance impact for .sort is quite small (12.079s => 12.158s)
  and Sanitizer is probably one of the more accessible places to do this.
2012-02-18 21:01:07 +00:00
au aa589d989b * Rudimentary CSS validation; +4 tests pass. (Bug 2304, 3244). 2012-02-18 20:16:23 +00:00
Gabriel Wicke 0b8d1b0387 * Add custom toString methods for tokens to aid debugging
* Convert all attributes into strings in Sanitizer
* Use strict comparison against empty string in tokenizer
* Add very simple sitename parserfunction
* 138 tests passing
2012-02-13 17:02:23 +00:00
Gabriel Wicke 3f7c1499cd Enable support for general preprocessor functionality in attribute keys and
values. This includes comments, templates and template arguments.

This also replaces the specialized expansion logic in the TemplateHandler. The
removal of link validation lets one more parser test fail for now. External
link target validation will need to be implemented in the token stream handler
for links. This is noted as TODO in
https://www.mediawiki.org/wiki/Future/Parser_development#Token_stream_transforms.
2012-02-08 15:10:30 +00:00
Gabriel Wicke 8c75aa1a7a Remove type attribute for tag tokens. 2012-02-01 18:37:48 +00:00
Gabriel Wicke a5cc10a06b Change token format to plain strings for text tokens, and specific objects for
other tokens. This is only the first half of the conversion. The next step is
to drop the type attribute on most tokens and match on the constructor in the
token transform machinery.
2012-02-01 16:30:43 +00:00
Gabriel Wicke 4bd4307924 Fix comment to reflect the actual regexp/spec in the JS version as well. 2012-01-18 19:35:13 +00:00
Gabriel Wicke 14e6728cc4 Add the start of a minimal sanitizer stage, that only strips IDN ignored
characters from host portions of links hrefs for now. This module needs to be
filled up with pretty much everything Sanitizer.php does, including tag and
attribute whitelists and attribute value sanitation (especially for style
attributes).

We'll also need to think about round-tripping of sanitized tokens.
2012-01-18 01:42:56 +00:00