Commit graph

7 commits

Author SHA1 Message Date
Gabriel Wicke cc96ff4f5e Very basic interwiki support
Pages titles with a wikipedia interwiki prefix now load the page from
corresponding Wikipedia. Links in a page then stay within the given language.

Note that Parsoid currently makes no effort to recognize localized namespaces,
so it won't render media files, categories etc correctly.

Change-Id: I7bc4102e81a402772ea23231170734d580ea15b9
2012-06-05 11:19:58 +02:00
Gabriel Wicke 403be4af42 Add basic thumb rendering support
* DOM based on Wikia's thumb output: HTML5, clean caption without magnify
  icon.
* basic RDFa annotations, but most options additionally in data-mw object-
  might want to move more (or all?) of those into RDFa data using meta tags.
* no support yet for framed or other formats, image scaling etc
* also tweaked some config options in the environment

Change-Id: Ie461fcdce060cfc2dec65cc057709ae650ef3368
2012-04-09 23:04:26 +02:00
Gabriel Wicke af03eb4f29 Improve generic attribute expansion before external link processing, and make
wgUploadPath configurable. Also change the hard-coded fall-back image sizes to
sensible defaults. This breaks three parser tests until image size retrieval
from the wiki is implemented.
2012-03-06 18:02:35 +00:00
Gabriel Wicke 2efcd3cd57 Reworked percent encoding handling for URIs to get closer to the 'url
construction' part of the HTML5 spec:
http://www.whatwg.org/specs/web-apps/current-work/multipage/urls.html#url-manipulation-and-creation

Removed a few whitelisted test cases that are now passing directly.

The encoding canonicalization could also be moved to the Sanitizer. Doing this
early in token stream processing however has the advantage of providing further
transformations uniform data to work with. We could even consider to move this
even further into the tokenizer.
2012-03-06 13:49:37 +00:00
Gabriel Wicke 7b0c807710 Change wikilink tokenization strategy to split on pipes. This makes it
possible to support template / template argument expansion in image options,
and causes little trouble for wikilinks. Non-image wikilinks with multiple
text pipes are quite rare in the dumps, and concatenating description tokens
with a plain '|' is quite easy. 261 parser tests passing.
2012-03-05 12:00:38 +00:00
Gabriel Wicke b1a7119a46 Hack up some rudimentary image rendering. Using jshashes for the md5, and
a few hard-coded image image sizes ;) 262 tests passing.
2012-03-01 13:51:53 +00:00
Gabriel Wicke d4faf9eaf4 More work on wiki link rendering and general wiki title / namespace
functionality.
2012-03-01 12:47:05 +00:00