Commit graph

20 commits

Author SHA1 Message Date
Gabriel Wicke 2bb512a4de A bit of tokenizer grammar clean-up and additional expected-html
normalization. 99 parser tests now passing.
2011-11-30 13:40:17 +00:00
Gabriel Wicke ae0b5f9af4 * Split paragraph handling between tokenizer and DOM postprocessor for better
html markup handling. 
* Remove global 'use strict' declarations from html5 parser. 
* Add trailing whitespace handling in dt

Overall, 55 parser tests are now passing.
2011-11-29 15:11:51 +00:00
Gabriel Wicke d7537d9777 Improve comment and general data-* attribute normalization. 2011-11-28 16:55:50 +00:00
Gabriel Wicke 1c91daa7be Provide a summary of failures. 2011-11-28 14:53:07 +00:00
Gabriel Wicke a875597530 Keep going if the HTML parser fails to normalize the expected test outcome.
Minor code simplification, and recognition of tr, td and tbody as block-level
elements.
2011-11-28 14:00:14 +00:00
Antoine Musso 9e887cc34c Adds a path fallback to find test file.
I do not fetch mediawiki in ../../../../phase3. This patch uses another
path as a fallback.
2011-11-28 11:41:47 +00:00
Antoine Musso 901b0a8911 List some more needed node modules 2011-11-28 11:40:14 +00:00
Gabriel Wicke 901a089358 Shorten diff output and display comments before each failing test. 2011-11-28 11:38:48 +00:00
Gabriel Wicke 5c2a145bdf Add diff output as well. 2011-11-28 11:19:50 +00:00
Gabriel Wicke d3f0196df7 Add primitive HTML comparison to detect passing parser tests. The expected
HTML is parsed using a HTML parser and re-serialized, and the output compared
to the serialization of the new parser's dom. Newline normalization is a
cheap hack for now, need to improve that later.
2011-11-28 11:10:39 +00:00
Gabriel Wicke dd5cd59ac6 Better HTML, pre and blocklevel handling. Hackish source formatting for easier
comparison with parserTest results.
2011-11-25 12:47:03 +00:00
Gabriel Wicke dee262658f Add MediaWiki-compatible quote handling including quirks and overlapped
structures like ''[[Link|Link text'']]. This is another transform on the token
stream.
2011-11-24 13:56:30 +00:00
Gabriel Wicke 8def550629 Fix parserTests path for full svn checkout 2011-11-22 12:32:34 +00:00
Gabriel Wicke d1b0293569 Fix comment token conversion and serialization 2011-11-21 09:22:30 +00:00
Gabriel Wicke b750ce38b8 Add node.js-compatible HTML5 parser and hook it up to the PEG tokenizer.
Builds a DOM tree (jsdom) from the tokens and then serializes that using
document.innerHTML. This is all very experimental, so don't be surprised by
rough edges.
2011-11-18 13:57:07 +00:00
Gabriel Wicke ea87e7aaee Convert PEG parser to tokenizer for back-end HTML parser. Now emits a list of
tokens, which for now is still completely built before parsing can proceed.
For each top-level block, the source start/end positions are added as
attributes to the top-most tokens. No tracking of wiki vs. html syntax yet.
2011-11-17 15:26:02 +00:00
Gabriel Wicke 06ca9f12fe Rename definitiondata to definitiondescription, minor fixes 2011-11-04 12:25:01 +00:00
Gabriel Wicke 63398b5749 Update parserTests to latest serializers 2011-11-04 07:45:05 +00:00
Gabriel Wicke a8838dab18 Start by handling paragraphs, at least a bit. 2011-11-03 15:16:05 +00:00
Gabriel Wicke 0d30a5528e First combination of WikiDom serializers with existing parser in
tests/parser/parserTests.js.

* Removed var from es in es.js to allow node.js to access it as global. Only
  alternative solution appears to be a node-specific 'exports' construct:
  http://nodejs.org/docs/v0.3.1/api/modules.html
* Added es.Document.js and es.Document.Serializer.js in es/bases. Not sure if
  this is the desired location.
* Changed es.extend to es.extendClass in the serializers
* Modified the first parser test to include the WikiDom modules and call the
  new HTML serializer
2011-11-03 13:55:48 +00:00