Commit graph

278 commits

Author SHA1 Message Date
Gabriel Wicke dd5cd59ac6 Better HTML, pre and blocklevel handling. Hackish source formatting for easier
comparison with parserTest results.
2011-11-25 12:47:03 +00:00
Gabriel Wicke 5b3a4497aa Add generic HTML tokenization and nowiki handling. 2011-11-25 10:59:43 +00:00
Gabriel Wicke 6c36ddcbce Follow-up to r104164: Clean-up comments, remove old italic/bold productions. 2011-11-24 14:20:56 +00:00
Gabriel Wicke dee262658f Add MediaWiki-compatible quote handling including quirks and overlapped
structures like ''[[Link|Link text'']]. This is another transform on the token
stream.
2011-11-24 13:56:30 +00:00
Gabriel Wicke baf55875b9 Re-add modified wiki list handling to tokenizer. 2011-11-23 14:27:51 +00:00
Gabriel Wicke 694b998f24 Minor improvement to italic/bold, documentation on failed modularization of
static parser functions.
2011-11-22 16:51:05 +00:00
Gabriel Wicke d1b0293569 Fix comment token conversion and serialization 2011-11-21 09:22:30 +00:00
Gabriel Wicke 65afd9b610 Improve internal link handling 2011-11-18 14:48:32 +00:00
Gabriel Wicke d744e65c48 Add missing token adapter. 2011-11-18 14:00:14 +00:00
Gabriel Wicke b750ce38b8 Add node.js-compatible HTML5 parser and hook it up to the PEG tokenizer.
Builds a DOM tree (jsdom) from the tokens and then serializes that using
document.innerHTML. This is all very experimental, so don't be surprised by
rough edges.
2011-11-18 13:57:07 +00:00
Gabriel Wicke 11e487d8c0 Flatten inline token lists before merging text into text tokens. 2011-11-17 15:43:31 +00:00
Gabriel Wicke ea87e7aaee Convert PEG parser to tokenizer for back-end HTML parser. Now emits a list of
tokens, which for now is still completely built before parsing can proceed.
For each top-level block, the source start/end positions are added as
attributes to the top-most tokens. No tracking of wiki vs. html syntax yet.
2011-11-17 15:26:02 +00:00
Gabriel Wicke ef3c84bd2e Extract text from inline elements for better testing. Slightly improved
handling of comment-only lines. Change pre to leaf content model.
2011-11-08 16:08:05 +00:00
Gabriel Wicke 18ead89b37 Improved paragraph, br, comment parsing and switched headings to
generic inlineline with syntactic flags.
2011-11-07 23:09:30 +00:00
Gabriel Wicke 944d010eb2 Indentation cleanup in PEG parser and Html serializer 2011-11-07 21:05:37 +00:00
Gabriel Wicke c3a0c56e56 rename definition{term,description} to just {term,description} 2011-11-07 20:36:34 +00:00
Gabriel Wicke 71891131c3 Grammar improvements
* replaced regexp stack with a set of break rules for inline content within
  specialized parse contexts, switched more rules to generic
  inlineline/inline/block rules.
* don't consume end-of-line for proper start-of-line matching
* added some pre support
* still no conversion of inline elements to annotations
2011-11-07 14:39:12 +00:00
Gabriel Wicke 06ca9f12fe Rename definitiondata to definitiondescription, minor fixes 2011-11-04 12:25:01 +00:00
Gabriel Wicke 7e5c196732 Some more progress for tables and definition lists 2011-11-04 12:06:49 +00:00
Gabriel Wicke 83a80bad49 Fixes for definition lists 2011-11-04 11:08:11 +00:00
Gabriel Wicke 85def70a8a Add basic list serialization to HtmlSerializer
* Added 'definitionterm' and 'definitiondata' styles to support definition
  lists, and special-case handling in the serializer to wrap both in dls.
2011-11-04 10:02:59 +00:00
Gabriel Wicke 63398b5749 Update parserTests to latest serializers 2011-11-04 07:45:05 +00:00
Gabriel Wicke a8838dab18 Start by handling paragraphs, at least a bit. 2011-11-03 15:16:05 +00:00
Gabriel Wicke 0d30a5528e First combination of WikiDom serializers with existing parser in
tests/parser/parserTests.js.

* Removed var from es in es.js to allow node.js to access it as global. Only
  alternative solution appears to be a node-specific 'exports' construct:
  http://nodejs.org/docs/v0.3.1/api/modules.html
* Added es.Document.js and es.Document.Serializer.js in es/bases. Not sure if
  this is the desired location.
* Changed es.extend to es.extendClass in the serializers
* Modified the first parser test to include the WikiDom modules and call the
  new HTML serializer
2011-11-03 13:55:48 +00:00
Trevor Parscal 5bae153214 Moving parser stuff back into the modules folder (oops) 2011-11-02 21:45:57 +00:00
Trevor Parscal 2b499d5990 Reorganized modules by javascript namespace 2011-11-02 21:31:45 +00:00
Brion Vibber 213ee7d4a8 followup r101685: the peg definition 2011-11-02 21:09:19 +00:00
Brion Vibber 56a75ccca7 Copy several of the experimental JS parser bits from ParserPlayground to VisualEditor. They'll need retooling to hook up with the wikidom stuff. 2011-11-02 21:07:51 +00:00