360 commits

Author SHA1 Message Date
Trevor Parscal 4d03be0301 Added comments and tests for canHaveChildren
Change-Id: I0b9538a89cba4c36d1a8af7395476b9612d18637
2012-04-30 11:57:45 -07:00
Trevor Parscal 822e873775 Fixed numbering in comments in example data
Change-Id: I009f1f32294cce3f154ac3c88e1fa6b2a50a7d1e
2012-04-30 11:43:01 -07:00
Trevor Parscal 489794c89c Renamed rootNode to documentNode and added tests
Within ve.dm.DocumentFragment it makes more sense to call the root node (which is always a document node) a document node, especially since there may be a different node used as a root.

This commit also adds tests for getDocumentNode and getNodeFromOffset, which uses the offset map.

Change-Id: Ic4609233cedc41f7e5a5f8fdb0e6178652c95554
2012-04-30 11:37:48 -07:00
Trevor Parscal a774b3dbf6 Added test for getOffset map
And fixed ve.dm.DocumentFragment constructor to generate a correct offset map which creates references to branch nodes only

Change-Id: If9e515be0c63d272bfed9bf4da625a48edd36f48
2012-04-27 17:16:29 -07:00
Trevor Parscal fb9f6e0a3b Fixed test module name typo
Change-Id: Ie65c26ec27fc8d8a9d792d41b0dd7deffbd59171
2012-04-27 15:10:43 -07:00
Trevor Parscal 44fe109f14 Added test data and fixed test suite links
Change-Id: Idb5de70b58c525a67f16b21f7adc53214af9b486
2012-04-27 14:59:52 -07:00
Trevor Parscal d4a99f2d26 Removed "prototype." prefix from test names
Change-Id: I9d8993c95782b700d0b8b45b253a3c4c689381a5
2012-04-27 11:03:08 -07:00
Gabriel Wicke 3be4992782 'Obama finally expands' ;) Misc fixes and documentation updates
* [[:en:Barack Obama]] can now be expanded in 77 seconds using 330MB RAM,
  while it would previously run out of RAM after ~30 minutes. Wohoooo!
  The token transform framework rework really paid off.
* 303 parser tests are passing in the new record time of 5.5 seconds. Two more
  tests are passing since these tests expect the day of the week to be
  Thursday.  Won't be the case tomorrow.

Change-Id: I56e850838476b546df10c6a239c8c9e29a1a3136
2012-04-26 18:18:08 +02:00
Gabriel Wicke 8ff810659a Rename text/wiki and tokens/wiki to text/x-mediawiki and similar
Change-Id: I70113629f4633685cd6db3914303a15e4c79a50a
2012-04-25 20:19:43 +02:00
Gabriel Wicke 8368e17d6a Biggish token transform system refactoring
* All parser pipelines including tokenizer and DOM stuff are now constructed
  from a 'recipe' data structure in a ParserPipelineFactory.

* All sub-pipelines of these can now be cached

* Event registrations to a pipeline are directly forwarded to the last
  pipeline member to save relatively expensive event forwarding.

* Some APIs for on-demand expansion / format conversion of parameters from
  parser functions are added:

  param.to('tokens/expanded', cb)
  param.to('text/wiki', cb) (this does not work yet)

  All parameters are additionally wrapped into a Param object that provides
  methods for positional parameter naming (.named()) or conversion to a dict
  (.dict()).

* The async token transform manager is now separated from a frame object, with
  the frame holding arguments, an on-demand expansion method and loop checks.

* Only keys of template parameters are now expanded. Parser functions or
  template arguments trigger an expansion on-demand. This (unsurprisingly)
  makes a big performance difference with typical switch-heavy template
  systems.

* Return values from async transforms are no longer used in favor of plain
  callbacks. This saves the complication of having to maintain two code paths.
  A trick in transformTokens still avoids the construction of unneeded
  TokenAccumulators.

* The results of template expansions are no longer buffered.

* 301 parser tests are passing

Known issues:

* Cosmetic cleanup remains to do
* Some parser functions do not support async expansions yet, and need to be
  modified.

Change-Id: I1a7690baffbe8141cadf67270904a1b2e1df879a
2012-04-25 16:51:36 +02:00
Catrope 69df3eefbc Implement ve.NodeFactory and add tests
Change-Id: I34fdf24c0099072fe5f7178400abbc323be975d4
2012-04-23 11:46:30 -07:00
Trevor Parscal 3d6391419d Added more nodes and removed canHave[Grandc|C]hildren methods
Replacing them with static members on each node type

Change-Id: I455debf880bef4e280eea072364f5f57308ec2b1
2012-04-20 16:34:47 -07:00
Catrope f9fd9ea66b Add tests for ve.dm.Transaction
These are currently broken because pushReplace() doesn't update the
lengthDifference

Change-Id: If0b611b7228c54ed15551514e773865be343e63a
2012-04-20 11:29:30 -07:00
Trevor Parscal b004b22241 Added tests for all exceptions
We are now checking for the exception messages as well.

Change-Id: I3a306ce9fe82afe6fd1e46a2e4da4d0a70952688
2012-04-19 18:05:48 -07:00
Trevor Parscal 9a3784301b Improved test coverage for ve.dm.BranchNode.splice and ve.dm.TwigNode.splice
* Changed splice to check all elements about to be inserted are allowed before inserting any of them so that catching an exception leaves you in a sane state
* Fixed the order of execution of parent class constructors in ve.dm.LeafNode and ve.dm.TwigNode so that canHaveChildren and canHaveGrandchildren produce correct values and added tests to ensure these methods are correctly inherited in subclasses
* Added tests that check for exceptions when adding nodes that can have children to nodes that can not have grandchildren
* Added tests that check for events being emitted before and after splicing, including that beforeSplice should be emitted even in cases where a splice fails and throws an exception because the nodes are incompatible (but afterSplice is not called in this case), since beforeSplice might modify the nodes in some way before the compatibility tests are run

Change-Id: Id12aea995a42c26ff63a74ae3d31f2bf455759e3
2012-04-19 17:45:58 -07:00
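The validate-before-insert behaviour described in 9a3784301b can be sketched roughly as follows. This is only an illustration under assumptions, not the actual ve.dm.BranchNode code: the class name `SketchBranchNode`, its `allowedTypes` list and the `events` array are hypothetical and exist just to make the ordering visible.

```javascript
// Hypothetical sketch, not the real ve.dm.BranchNode implementation.
class SketchBranchNode {
  constructor(allowedTypes) {
    this.allowedTypes = allowedTypes; // child types this node accepts (assumed)
    this.children = [];
    this.events = []; // names of emitted events, recorded for illustration
  }

  splice(index, howMany, ...nodes) {
    // beforeSplice is emitted even if the splice fails afterwards.
    this.events.push('beforeSplice');
    // Check every node before inserting any of them, so that a thrown
    // exception leaves the list of children untouched.
    for (const node of nodes) {
      if (!this.allowedTypes.includes(node.type)) {
        throw new Error('Cannot add node of type ' + node.type);
      }
    }
    const removed = this.children.splice(index, howMany, ...nodes);
    // afterSplice is emitted only when the splice succeeded.
    this.events.push('afterSplice');
    return removed;
  }
}

const sketchList = new SketchBranchNode(['listItem']);
sketchList.splice(0, 0, { type: 'listItem' });
let spliceFailed = false;
try {
  // The second node is not allowed, so nothing from this call is inserted.
  sketchList.splice(1, 0, { type: 'listItem' }, { type: 'table' });
} catch (e) {
  spliceFailed = true;
}
console.log(spliceFailed, sketchList.children.length, sketchList.events);
```

Validating first costs one extra pass over the inserted nodes, but it means a caller that catches the exception never sees a half-applied splice.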
Trevor Parscal f081c0932a A few fix ups for fd49e8d
* Moved getParent and getRoot from ve.dm.Node back to ve.Node
* Fixed use of getElementLength that should have been changed to getOuterLength, but was changed to getLength (oops)

Change-Id: Ibe5b855aef533dcd493f762a8a02c6a11ce6e7de
2012-04-19 16:47:40 -07:00
Trevor Parscal fd49e8df32 Added more tests for ve.*Node and ve.dm.*Node classes
In this commit several methods (child node add/remove and parent/root modification) were also moved to ve.dm.BranchNode ve.dm.Node respectively. ve.Node and ve.BranchNode are immutable. ve.dm.Node and ve.dm.BranchNode are mutable. Other subclasses of ve.Node and ve.BranchNode should implement functionality to mimic changes made to a data model.

Change-Id: Ia9ff78764f8f50f99fc8f9f9593657c0a0bf287e
2012-04-19 16:03:59 -07:00
Trevor Parscal c9ce7dbffe Added some basic coverage for ve.*Node classes
* prototype.canHaveChildren
* prototype.canHaveGrandchildren
* prototype.getType
* prototype.getParent
* prototype.getRoot
* prototype.setRoot
* prototype.attach
* prototype.detach

Change-Id: I920f7c9504e467f4818df537608760165c28d432
2012-04-19 14:43:58 -07:00
Trevor Parscal b16ed2b12d Setup tests, which are still empty
Had to fix a few namespace typos too

Change-Id: I3ebdc418f374bd3e151c516e1c0cfe85398772f0
2012-04-19 14:17:59 -07:00
Gabriel Wicke aaca5eac7d More tweaks: safesubst and image options
* Ignore safesubst for now
* Remove an unneeded whitelist entry
* Make sure the caption is not lost for thumbs (fix to last commit) and remove
  debug print

Change-Id: I243584ed0838cf7c3b4110fe9cdf869272477312
2012-04-17 11:02:52 +02:00
Gabriel Wicke afa5b95bc1 Don't work around html5 library tokenizer attribute reordering
The HTML5 parser we are using to normalize expected HTML output in parserTests
reverses the order of attributes (see
https://github.com/aredridel/html5/pull/53 for the fix). Remove whitelist
entries concerned with this and use the proper order in external image
attributes.

Change-Id: If1868cae05396a150757c85a20473ab756cbcd97
2012-04-16 17:09:06 +02:00
Catrope 7465b670e1 Add and update an offset map in DocumentNode
This has some TODOs still but I want to land it now anyway, and fix the
TODOs later.

* Add this.offsetMap which maps each linear model offset to a model tree node
* Refactor createNodesFromData()
** Rename it to buildSubtreeFromData()
** Have it build an offset map as well as a node subtree
** Have it set the root on the fake root node so that when the subtree
   is attached to the main tree later, we don't get a rippling root
   update all the way down
** Normalize the way the loop processes content, that way adding offsets
   for content is easier
* Add rebuildNodes() which uses buildSubtreeFromData() to rebuild stuff
* Use rebuildNodes() in DocumentSynchronizer
* Use pushRebuild() in TransactionProcessor
* Optimize setRoot() for the case where the root is already set correctly

Change-Id: I8b827d0823c969e671615ddd06e5f1bd70e9d54c
2012-04-13 16:46:02 -07:00
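As a rough illustration of what an offset map in the sense of 7465b670e1 might look like, the sketch below builds a node subtree and a parallel array mapping each linear offset to a node in one pass. The data format (opening elements as `{ type: 'x' }`, closing elements as `{ type: '/x' }`, everything else as content) and the name `buildOffsetMap` are assumptions made for this example, not the real buildSubtreeFromData().

```javascript
// Hypothetical sketch: the linear data format and all names are assumed.
function buildOffsetMap(data) {
  const root = { type: 'document', children: [] };
  const stack = [root]; // nodes currently open, innermost last
  const offsetMap = []; // offsetMap[i] = node containing the position before data[i]

  for (const item of data) {
    const isElement = typeof item === 'object' && item !== null &&
      typeof item.type === 'string';
    if (isElement && item.type.charAt(0) === '/') {
      // The position just before a closing element is still inside that node.
      offsetMap.push(stack.pop());
    } else if (isElement) {
      // The position just before an opening element belongs to the parent.
      const parent = stack[stack.length - 1];
      const node = { type: item.type, children: [] };
      parent.children.push(node);
      offsetMap.push(parent);
      stack.push(node);
    } else {
      // Content: the position belongs to the innermost open node.
      offsetMap.push(stack[stack.length - 1]);
    }
  }
  // The final offset, after the last item, belongs to whatever is still open.
  offsetMap.push(stack[stack.length - 1]);
  return { root, offsetMap };
}

const sketch = buildOffsetMap([{ type: 'paragraph' }, 'a', 'b', { type: '/paragraph' }]);
console.log(sketch.offsetMap.map((node) => node.type));
```

With such a map, finding the node at an offset is a single array lookup instead of a tree walk, at the cost of rebuilding the affected slice of the map whenever the data changes.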
Gabriel Wicke 9913108b40 Fix fetch-parserTests (it is in path instead of fs)
Change-Id: I169502079ea2609a4f4af776b15767cf0c3ec8b5
2012-04-04 20:40:09 +02:00
Adam Wight b234edba88 As much as I have loved writing Makefiles... I've replaced its functionality with package.json, mostly so we can avoid non-node dependencies. This is one of the recommended practices. We should consider moving tests/parser into modules/parser/tests, other node projects keep all module code in one directory.
Explained in the README how to use npm to load the dependencies and run tests.  Too bad about NODE_PATH...

Don't try to find parserTests.txt in assorted places--if it isn't present, fetch from gerrit.  You can symlink from core if you're developing on both parsers, and the fetch script will not overwrite.

Use __dirname in parserTests.js to allow the script to run independent of current working directory.

Change-Id: I4c8b884e91f4fdeae385c7697aff768bdd199dd5
2012-04-04 11:02:58 -07:00
Gabriel Wicke f662690d02 Shorten data-mw-rt to data-mw and clean up whitelist
Instead of a proliferation of data-mw-* attributes, it should be easier to
stash all private / non-semantic round-trip information in a JSON object
stored in data-mw.

Change-Id: Id200a6a8789fa152f29ea530e5a24b6ee7b4b285
2012-04-02 18:12:49 +02:00
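The idea in f662690d02 of stashing round-trip information in one JSON-valued attribute could look something like this. It is a sketch under assumptions: only the attribute name data-mw comes from the commit message, while the function names and the example keys (`src`, `tail`) are hypothetical.

```javascript
// Hypothetical sketch: only the attribute name data-mw is taken from the log.
function serializeAttributes(attribs, roundTripInfo) {
  const out = Object.assign({}, attribs);
  // Emit a single data-mw attribute, and only if there is something to store.
  if (roundTripInfo && Object.keys(roundTripInfo).length > 0) {
    out['data-mw'] = JSON.stringify(roundTripInfo);
  }
  return out;
}

function readRoundTripInfo(attribs) {
  // A missing attribute simply means there is no private information.
  return attribs['data-mw'] ? JSON.parse(attribs['data-mw']) : {};
}

const linkAttribs = serializeAttributes(
  { href: '/wiki/Foo' },
  { src: '[[Foo|bar]]', tail: 's' } // illustrative keys, not a real schema
);
console.log(linkAttribs);
console.log(readRoundTripInfo(linkAttribs));
```

Keeping everything in one attribute means output normalization and whitelists only need to know about a single name rather than a growing family of data-mw-* attributes.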
Gabriel Wicke 5ef3438ee5 Change path to parserTests from phase3 to core after switch to git.
Change-Id: Ie13f678eaa81447e98db5c8c394ab103caad8454
2012-04-02 17:10:06 +02:00
Audrey Tang d3602bb459 * Get parser tests from GitWeb, not Subversion.
Change-Id: I39f933b9e0320dc62736da07ce097ec1badec9aa
2012-03-28 23:39:01 +08:00
Catrope 7a726b0278 Add tree synchronization for replace
To handle replace operations that are not themselves consistent (these
are common, for instance when replacing an opening element in one place,
then replacing the closing element somewhere else), we process
subsequent replace operations inside the first one until things are
balanced again, then issue a single rebuild for the whole thing.

Change-Id: Ide4613f046fabfeeef383138c39e350b1b710033
2012-03-26 02:51:30 -07:00
Roan Kattouw 662633dfb3 Add a test for unwrapping and rewrapping 2012-03-14 21:02:33 +00:00
Roan Kattouw 2c43a34f74 Rewrite the rebuild action to take two ranges rather than a node and some data. 2012-03-14 21:02:31 +00:00
Roan Kattouw 37a59016e8 Break out pushAction() into separate functions for each action. This will allow me to change the rebuild action to take totally different parameters. 2012-03-14 21:02:29 +00:00
Antoine Musso f637756319 node modules required: request & jshashes 2012-03-13 15:14:18 +00:00
Roan Kattouw 16a2356e43 Add tests for list split tree sync 2012-03-13 00:14:38 +00:00
Trevor Parscal c977591886 Added test for ve.dm.DocumentSynchronizer that exercises multi-action synchronizations 2012-03-09 19:38:54 +00:00
Roan Kattouw d70aa70707 Add test for replacing a table with a list. This only works because
nesting validity isn't checked yet (lists inside lists are illegal
IIRC), but for now it tests the reversal of the order of the closing
tags nicely
2012-03-09 02:19:50 +00:00
Roan Kattouw b13d0a849d Add a check for the length of unwrapOuter, and add a test for each
exception
2012-03-09 01:44:31 +00:00
Roan Kattouw bc600b34be Make prepareWrap() use the data from the model rather than the unwrap
parameters. This fixes the case where rolling back a list unwrap would
restore the list items without their attributes
2012-03-09 01:14:41 +00:00
Roan Kattouw 3bc6b3d8c7 Add tests for unwrapping a list
This also exercises unwrapEach. One of the tests is still subtly broken
in that the attributes on the listItems aren't preserved, I'll fix that
next.
2012-03-09 00:38:35 +00:00
Roan Kattouw 5054ed320e Implement prepareWrap and add tests for it 2012-03-08 23:21:26 +00:00
Roan Kattouw 10a6ee73f4 Add tests for content replacements 2012-03-08 23:21:23 +00:00
Trevor Parscal 3ec0c07843 Fixed name of test suite to match actual class name 2012-03-08 19:37:13 +00:00
Trevor Parscal becb1daa39 Added more tests for ve.dm.DocumentSynchronizer and fixed some bugs along the way 2012-03-08 19:35:51 +00:00
Trevor Parscal 459c4fa271 Added some basic tests for resize and insert. Fixed some bugs in both of those code paths along the way. 2012-03-08 00:52:30 +00:00
Gabriel Wicke af03eb4f29 Improve generic attribute expansion before external link processing, and make
wgUploadPath configurable. Also change the hard-coded fall-back image sizes to
sensible defaults. This breaks three parser tests until image size retrieval
from the wiki is implemented.
2012-03-06 18:02:35 +00:00
Gabriel Wicke 227103e12c Accept empty table cell attribute sections, and consider percent-encoded %2525
valid. 270 tests passing.
2012-03-06 14:32:45 +00:00
Gabriel Wicke 2efcd3cd57 Reworked percent encoding handling for URIs to get closer to the 'url
construction' part of the HTML5 spec:
http://www.whatwg.org/specs/web-apps/current-work/multipage/urls.html#url-manipulation-and-creation

Removed a few whitelisted test cases that are now passing directly.

The encoding canonicalization could also be moved to the Sanitizer. Doing this
early in token stream processing however has the advantage of providing further
transformations uniform data to work with. We could even consider to move this
even further into the tokenizer.
2012-03-06 13:49:37 +00:00
Gabriel Wicke 19fe9726a2 Fix invalid external link representation. 268 tests passing. 2012-03-05 18:06:29 +00:00
Gabriel Wicke 7b0c807710 Change wikilink tokenization strategy to split on pipes. This makes it
possible to support template / template argument expansion in image options,
and causes little trouble for wikilinks. Non-image wikilinks with multiple
text pipes are quite rare in the dumps, and concatenating description tokens
with a plain '|' is quite easy. 261 parser tests passing.
2012-03-05 12:00:38 +00:00
Trevor Parscal 0e41da3340 Fixed tests that were broken by r112150. 2012-03-02 23:12:38 +00:00
Gabriel Wicke 009d7a4dea Namespaces to the rescue. 2012-03-02 15:49:05 +00:00
Gabriel Wicke fe681042c0 Collect some statistics while grepping. 2012-03-01 16:42:28 +00:00
Gabriel Wicke e0838db315 Capturing the regexp is no longer necessary, and speeds up the grepper. Also
tweaked the multi-line ISBN regexp slightly.
2012-02-29 13:02:46 +00:00
Gabriel Wicke e3deb304db Add a misc regexp file for dump grepping. 2012-02-29 11:07:17 +00:00
Gabriel Wicke 14f40aa7d5 Support capturing regexps in dumpGrepper. 2012-02-29 10:49:00 +00:00
Gabriel Wicke ebcfc2c7a1 Improve grepper documentation. 2012-02-28 14:24:37 +00:00
Gabriel Wicke b767e03449 Tweak martian regexp and grepper output format. 2012-02-28 14:11:44 +00:00
Gabriel Wicke 4806505ce4 Finish color highlighting for dump grepper / fix broken commit r112592. 2012-02-28 13:48:47 +00:00
Gabriel Wicke 7daeb34d4d Implement onlyinclude transformer. 254 tests passing. 2012-02-28 13:21:01 +00:00
Gabriel Wicke 32012c00cd Add martian-endtags regexp wrapper around dumpGrepper. 2012-02-27 16:51:20 +00:00
Gabriel Wicke 19c67c28a2 Add a simple dump grepper using DumpReader. Useful to inform parser design
decisions, and as a way to exercise the dump reader in preparation for tests
over full dumps.
2012-02-27 16:40:01 +00:00
Gabriel Wicke 21855c99cd Tweak dumpReader to work with current libxmljs and stdin 'data' events. 2012-02-27 15:46:08 +00:00
Gabriel Wicke 2e41b19af8 Green two more parser tests by implementing some parser functions. 2012-02-22 16:39:50 +00:00
Gabriel Wicke 3568dfee14 Add some support for functionhooks in test parser and parserTests.js, and
tweak a few parser functions.
2012-02-22 15:59:11 +00:00
au f1fb937b4a * Instead of sorting attributes, whitelist the one parserTest where it matters. 2012-02-20 22:26:24 +00:00
au 0ca9b00100 * Convert __patched-html-parser to .coffee.
Note that the compiled .js file (generated by "make"/"make test")
  is still under version control so folks can work on the project
  even without a running "coffee" command in PATH.

  Also updated README to mention coffee-script and "make test".
2012-02-18 18:54:12 +00:00
au 4d1c6c7d6e * Add a "make test" target that auto-fetches parserTests.txt. 2012-02-18 17:28:46 +00:00
au 0360e62da7 * Locally apply the HTML5.Marker.type patch.
This is needed until https://github.com/aredridel/html5/issues/44
  is merged into the upstream "html5" module.
2012-02-18 17:28:35 +00:00
Gabriel Wicke 025f9cddb3 Prefix all internal data- attributes with data-mw- and adjust the whitelist
and test output normalization accordingly. 235 tests passing.
2012-02-13 13:54:07 +00:00
Gabriel Wicke a122e51eec Move data-* annotations into separate object on tokens, that is then
serialized into a single data-mw-rt attribute if present. Update parserTests
to ignore this attribute for comparisons with expected parser output.

A few more tweaks and notes are thrown into this commit too. 233 tests are
passing now.
2012-02-11 16:43:25 +00:00
Gabriel Wicke 1f6db903e9 Pluck a few low-hanging fruit in external link tokenization, and add a simple
localurl parser function implementation. 230 parser tests now passing.
2012-02-07 10:28:23 +00:00
Gabriel Wicke d321d96bab Fix parserTests summary with filtering enabled 2012-02-07 09:27:47 +00:00
Trevor Parscal 5d71c888f9 Updated unit tests in response to structural changes in r110805 2012-02-07 00:12:31 +00:00
Gabriel Wicke a5b7ea7bcd Add --debug and --trace options to parserTests as well. 2012-02-01 17:02:37 +00:00
Gabriel Wicke 7cd94df47d A few minor tweaks to reduce memory usage 2012-01-27 13:32:44 +00:00
Gabriel Wicke 348cac6cf0 Fix a bug in TokenCollector, and misc tweaks for template expansions. 2012-01-20 18:47:17 +00:00
Gabriel Wicke 2233d0a488 Eventify parser tests and parse.js commandline wrapper to actually allow
async template fetching. Async expansion is not yet fully debugged, but at
least the preconditions for that are now there.
2012-01-18 23:46:01 +00:00
Gabriel Wicke 34025251a3 Clean up 'END' token handling a bit. 2012-01-17 20:01:21 +00:00
Gabriel Wicke f4081bef08 First template expansion tests start working, and a bug fix in
DOMPostProcessor paragraph wrapper. 187 parser tests now passing.
2012-01-14 00:58:20 +00:00
Gabriel Wicke 5ec30252f1 More token transform and pipeline setup refactoring to support template
expansion better.
2012-01-10 01:09:50 +00:00
Gabriel Wicke 2e35171fd1 Fix quote handling and tweak the whitelist a bit. 'any' token registrations
are now merged with specific registrations by rank. Not yet clear if that is a
good idea overall, need to check use cases when implementing template expansion
and other functionality.

183 parser tests now passing.
2012-01-04 14:09:05 +00:00
Gabriel Wicke 29362cc53c Rename ParseThingy to ParserPipeline and fix up broken WikiDom generation and
commandline runner.
2012-01-04 08:39:45 +00:00
Gabriel Wicke bd98eb4c5a Land big TokenTransformDispatcher and eventization refactoring.
The TokenTransformDispatcher now actually implements an asynchronous, phased
token transformation framework as described in
https://www.mediawiki.org/wiki/Future/Parser_development/Token_stream_transformations.

Additionally, the parser pipeline is now mostly held together using events.
The tokenizer still emits a lame single event with all tokens, as block-level
emission failed with scoping issues specific to the PEGJS parser generator.
All stages clean up when receiving the end tokens, so that the full pipeline
can be used for repeated parsing.

The QuoteTransformer is not yet 100% fixed to work with the new interface, and
the Cite extension is disabled for now pending adaptation. Bold-italic related
tests are failing currently.
2012-01-03 18:44:31 +00:00
Gabriel Wicke 8e00a72d0a Improvements to link trail handling, and two tweaks to the whitelist. 182
tests now passing. 

Link trails depend on language-dependent positive character classes in the PHP
parser. These classes all seem to disallow punctuation implicitly and list
differing plain text characters instead, so it might be possible to get away
with identifying a common class of non-trail punctuation instead. This would
help to keep the tokenizer independent of configurations, which is very
desirable for caching and simplified external parsing.
2011-12-30 12:47:06 +00:00
Gabriel Wicke b3a0270d69 Remove env and load grammar in tokenizer constructor. Re-add property hack to
keep parserTests running for now. Really need a different pipeline for html
serialization or a reference to the HTML DOM.
2011-12-28 17:04:16 +00:00
Neil Kandalgaonkar 8fbf36e63e put add terminal token inside tokenize method (will pull it out again for streaming interface) 2011-12-28 01:37:15 +00:00
Neil Kandalgaonkar 6103646ec8 remove need to add newline at end of input 2011-12-28 01:37:11 +00:00
Neil Kandalgaonkar 4158f82d7e refactor parser to ParseThingy in different module, can be invoked with command line utility parse.js 2011-12-28 01:37:06 +00:00
Neil Kandalgaonkar aedc6751ae made parseThingy, temp class for refactoring all thingies related to parsing 2011-12-28 01:36:58 +00:00
Neil Kandalgaonkar 5ff2b4d475 make peg src path outside of peg tokenizer 2011-12-28 01:36:50 +00:00
Neil Kandalgaonkar 962d1262fc create tokenizer without need to modify namespace with PEG source 2011-12-28 01:36:36 +00:00
Gabriel Wicke 1c7fe0eb34 Refactor table productions to support table fragments in templates (table
start / row / table end). The old productions are not deleted yet to make it
easy to compare the output on more complex articles. 181 tests passing after
adding two table tests with whitespace-only differences to the whitelist.
2011-12-22 11:43:55 +00:00
Gabriel Wicke 574abd9774 A collection of small bug fixes to the grammar, Cite, the Token format
converter and the HTML DOM -> WikiDom converter. The tokenizer now digests all
parserTests.
2011-12-14 23:38:46 +00:00
Gabriel Wicke dc77d73ad5 Add ability to pass through JSON data to WikiDom in data-json-* attributes,
and fix parser to actually parse the Barack Obama article except for one table
with nested templates at the start-of-line.
2011-12-14 17:25:09 +00:00
Gabriel Wicke a09aa4d599 Add rough HTML DOM to WikiDom conversion. You can see serialized WikiDom of
parser tests using 'node parserTests.js --wikidom'.
2011-12-14 15:15:41 +00:00
Gabriel Wicke 5f80d30428 Clean up access to document and body after building the tree. 2011-12-14 09:40:49 +00:00
Gabriel Wicke feee9ded9f Convert the Cite extension to a token stream transformer.
This required a few further additions to the TokenTransformDispatcher. In
particular, there is now an 'any' token match whose callbacks are executed
before more specific callbacks. This is used by the Cite extension to eat all
tokens between ref and /ref tags. This need is very common, so should be
broken out to an intermediate layer in the future.

In general, the requirements for the TokenTransformDispatcher API are now
clearer, and the API should likely be cleaned up / simplified.
2011-12-13 14:48:47 +00:00
Gabriel Wicke c33f74d227 Follow-up to r106001: Fix typo spotted by Nikerabbit. Good catch! 2011-12-13 13:00:57 +00:00
Gabriel Wicke 8e55e79b67 Rename TokenTransformer to TokenTransformDispatcher. 2011-12-13 11:45:12 +00:00
Gabriel Wicke 815c63ba6c Disabled es* inclusion for now as the serializers are not currently used, and
the recent addition of references to window is not compatible with node.js.
2011-12-13 11:17:33 +00:00
Gabriel Wicke dc70687ed0 Update README 2011-12-13 10:03:01 +00:00
Gabriel Wicke a8fa9433c4 Convert quote handling (italic/bold) to a core extension operating on the
token stream. This is the first token transformation exercising the
TokenTransformer class as its dispatcher. Template expansions, wiki link
formatting, tag sanitation and extensions should be able to use the same
dispatcher by registering for specific token types.

The parser performance is very slightly improved as the token stream is only
traversed once.
2011-12-12 20:53:14 +00:00
Gabriel Wicke 752b0990b2 Refactor parserTests somewhat into a class-like structure, and wire up the
TokenTransformer.
2011-12-12 14:03:54 +00:00
Gabriel Wicke d616f07a79 Don't re-build the wiki tokenizer for each test. This speeds up the full
parserTests.js run slightly from 7-8 minutes to about 14 seconds ;)

A few very minor tweaks to the grammar are also thrown into this commit.
2011-12-12 10:47:42 +00:00
Gabriel Wicke abc2254110 A bit of comment clean-up and wrapping of tree building into try/catch block
to actually count failures.
2011-12-08 11:40:59 +00:00
Gabriel Wicke 92fdf99384 Further renaming, this time from pegParser to pegTokenizer. 2011-12-08 10:59:44 +00:00
Gabriel Wicke 76bc477038 Rename html5TokenEmitter to HTML5TreeBuilder, and the contained Tokenizer to
TreeBuilder.
2011-12-08 10:37:18 +00:00
Gabriel Wicke 1d299f6aa9 Also print out options for failing tests. 2011-12-07 11:45:05 +00:00
Gabriel Wicke 0734fb24c5 Add a few more items to the whitelist 2011-12-07 11:44:38 +00:00
Gabriel Wicke 7e1585d360 Add empty tables to the whitelist (legal in HTML5). Also add one more
functionally identical italic/bold/link permutation to the whitelist.
2011-12-06 22:05:43 +00:00
Trevor Parscal e61e66856c Fixed issue in transaction processor's insert method - no need for a special case for structural offsets anymore 2011-12-06 22:04:18 +00:00
Trevor Parscal 88f22ec10f Added test which currently fails because Transaction processor is broken 2011-12-06 21:36:36 +00:00
Gabriel Wicke 1a5ffacc5c Add slightly different but functionally identical italic/bold/link nesting to
whitelist.
2011-12-06 16:45:19 +00:00
Gabriel Wicke a922d595cf Really minor: Add a newline after whitelist printout. 2011-12-06 13:16:43 +00:00
Gabriel Wicke 1bd3f8321e Minor beautification of whitelist entry print-out header. 2011-12-06 12:35:32 +00:00
Gabriel Wicke 228fccd0c1 Strip toc and edit sections from expected html for now. 2011-12-06 11:39:53 +00:00
Antoine Musso 350d1e8978 util.inspect to dump tokens
It gives better output than JSON.stringify since inspect nicely indents
the object/array dump. Makes it easier to read for humans.
2011-12-06 10:23:58 +00:00
Gabriel Wicke 33e19f7275 Recognize block-level elements independent of case; Ignore toc and section
edit links in tests. 148 parser tests passing.
2011-12-05 20:03:24 +00:00
Trevor Parscal 07af0cab63 * Moved getContent and getText from leaf nodes to document model nodes
* Renamed getContent to getContentData
* Renamed getText to getContentText
* Added getElementData
2011-12-05 19:41:04 +00:00
Gabriel Wicke a6867d76c5 Ignore missing redlink for now, we are concerned with the parser and not a
complete wiki at this stage.
2011-12-05 17:07:06 +00:00
Gabriel Wicke 1760210d13 Fixes to tables, headings and misc smaller stuff. Tracked down an issue caused
by improper caching of production results, which interfered with the
flag-dependent inline_break production.
2011-12-04 19:23:24 +00:00
Antoine Musso 7ead617a2e --cache to save the test cases parsing
This is optional but speeds up launch time when other files are not
modified.
2011-12-01 17:51:07 +00:00
Antoine Musso c21a81ee45 warn on invalid regex passed to --filter 2011-12-01 15:45:40 +00:00
Gabriel Wicke 63c728924b Use pegjs from npm 2011-12-01 15:23:23 +00:00
Gabriel Wicke d00743ad79 Improve external links and definition lists, now 133 tests passing ;)
Also add printwhitelist option to test runner, provides js code copy/pastable
to whitelist.
2011-12-01 14:25:59 +00:00
Antoine Musso cb682c5ade option to disable color output (use --no-color ) 2011-12-01 12:30:15 +00:00
Gabriel Wicke 5d50c6bbf3 Follow-up to r104845: s/args/argv 2011-12-01 12:10:43 +00:00
Gabriel Wicke edf40c616c Make whitelist usage an option; tweak comment a bit 2011-12-01 11:47:22 +00:00
Gabriel Wicke 5f72acec8f Add option to disable whitelist 2011-12-01 11:08:05 +00:00
Gabriel Wicke 35efed6634 Add a parser test whitelist for manually-checked tests, and an option to print
JSON-serialized parser output for failing tests, which can then be added to
the whitelist if appropriate.
2011-12-01 10:58:12 +00:00
Gabriel Wicke e7f182d786 Strip the patch header lines, don't really need those 2011-11-30 18:21:53 +00:00
Antoine Musso 2b6d1896cb colorize numbers in test summary
Also added Brion's ALL TEST PASSED when it makes sense
2011-11-30 17:43:54 +00:00
Antoine Musso ed74636ab5 --quick Suppress diff output of failed tests
A long block of code was not reindented to make this patch easier
to read for people not ignoring white spaces changes :D
2011-11-30 17:18:24 +00:00
Antoine Musso ebfc6f08fd --quiet suppress notification of passed tests
--no-quiet will make sure you always see PASSING tests :)
2011-11-30 17:10:07 +00:00
Antoine Musso 3038df313f allow test filtering using a regexp on title test
use --filter :)
2011-11-30 17:03:29 +00:00
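A --filter option of this kind, combined with the invalid-regex warning from c21a81ee45, might be implemented along these lines. This is a sketch only: the function names, the `title` property and the choice to fall back to running all tests on a bad pattern are assumptions, not taken from parserTests.js.

```javascript
// Hypothetical sketch: names and fallback behaviour are assumed.
function compileFilter(pattern) {
  if (!pattern) {
    return null; // no --filter given
  }
  try {
    return new RegExp(pattern);
  } catch (e) {
    // new RegExp throws a SyntaxError for malformed patterns.
    console.warn('Invalid regex passed to --filter: ' + pattern);
    return null;
  }
}

function selectTests(tests, pattern) {
  const re = compileFilter(pattern);
  // Without a usable filter, every test is kept.
  return re ? tests.filter((test) => re.test(test.title)) : tests;
}

const sampleTests = [
  { title: 'Simple paragraph' },
  { title: 'Table with rows' }
];
console.log(selectTests(sampleTests, '^Table').map((test) => test.title));
```

Compiling the pattern once, before the test loop, also keeps the warning from being printed once per test.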
Antoine Musso 513b2e85b7 proper argument handling (requires 'optimist' module)
Handle arguments and options properly by using the 'optimist' node module.
Please note wordwrapping in usage does not seem to work on my setup :(

Only --help implemented yet.

Example:

$ node parserTests.js --help
Starting up JS parser tests
Usage: node ./parserTests.js

Options:
  --filter, --regex  Only run tests whose descriptions which match given regex (option not implemented)
  --help, -h         Show this help message                                                            
  --disabled         Run disabled tests (default false) (option not implemented)                         [boolean]
2011-11-30 16:33:26 +00:00
Antoine Musso 567ef896e7 add some console messages 2011-11-30 15:33:56 +00:00
Antoine Musso 302e1519b3 add colors to visual editor parser testing
TODO: add an option to switch color scheme for light/dark backgrounds
2011-11-30 15:20:46 +00:00
Gabriel Wicke f758894de7 Let another test pass by swapping the default order of italic/bold for '''''.
Minor test output cosmetics.
2011-11-30 13:54:57 +00:00
Gabriel Wicke 2bb512a4de A bit of tokenizer grammar clean-up and additional expected-html
normalization. 99 parser tests now passing.
2011-11-30 13:40:17 +00:00
Gabriel Wicke ae0b5f9af4 * Split paragraph handling between tokenizer and DOM postprocessor for better
html markup handling. 
* Remove global 'use strict' declarations from html5 parser. 
* Add trailing whitespace handling in dt

Overall, 55 parser tests are now passing.
2011-11-29 15:11:51 +00:00
Gabriel Wicke d7537d9777 Improve comment and general data-* attribute normalization. 2011-11-28 16:55:50 +00:00
Gabriel Wicke 1c91daa7be Provide a summary of failures. 2011-11-28 14:53:07 +00:00
Gabriel Wicke a875597530 Keep going if the HTML parser fails to normalize the expected test outcome.
Minor code simplification, and recognition of tr, td and tbody as block-level
elements.
2011-11-28 14:00:14 +00:00
Antoine Musso 9e887cc34c Adds a path fallback to find test file.
I do not fetch mediawiki in ../../../../phase3. This patch uses another
path as a fallback.
2011-11-28 11:41:47 +00:00
Antoine Musso 901b0a8911 list some more needed node module 2011-11-28 11:40:14 +00:00
Gabriel Wicke 901a089358 Shorten diff output and display comments before each failing test. 2011-11-28 11:38:48 +00:00
Gabriel Wicke 5c2a145bdf Add diff output as well. 2011-11-28 11:19:50 +00:00
Gabriel Wicke d3f0196df7 Add primitive HTML comparison to detect passing parser tests. The expected
HTML is parsed using a HTML parser and re-serialized, and the output compared
to the serialization of the new parser's dom. Newline normalization is a
cheap hack for now, need to improve that later.
2011-11-28 11:10:39 +00:00
Gabriel Wicke dd5cd59ac6 Better HTML, pre and blocklevel handling. Hackish source formatting for easier
comparison with parserTest results.
2011-11-25 12:47:03 +00:00
Roan Kattouw 5ac817a6f4 Fix bugs in prepareContentAnnotation() related to structural offsets, and add a test. Also add parentheses to the if statement mixing || and &&, for clarity 2011-11-24 16:27:40 +00:00
Roan Kattouw 6f3c407314 Introduce es.DocumentNode.getCommonAncestorPaths(), with tests 2011-11-24 15:34:12 +00:00
Gabriel Wicke dee262658f Add MediaWiki-compatible quote handling including quirks and overlapped
structures like ''[[Link|Link text'']]. This is another transform on the token
stream.
2011-11-24 13:56:30 +00:00
Trevor Parscal 3ed6544fe2 Added test that exposes bugs in prepareContentAnnotation 2011-11-23 23:24:05 +00:00
Trevor Parscal 8ed6ee5e3c Fixed a test which was poorly named and had incorrect data 2011-11-23 19:37:57 +00:00
Trevor Parscal 20da830a26 Rewrite of undo/redo - now completely implemented in es.SurfaceModel 2011-11-22 22:59:05 +00:00
Gabriel Wicke 8def550629 Fix parserTests path for full svn checkout 2011-11-22 12:32:34 +00:00
Trevor Parscal 631323b9bd * Refactored es.HistoryModel to always be working from a single array rather than a buffer and an array
* Added support for associating a selection with a state
2011-11-21 23:51:37 +00:00
Trevor Parscal 779a63f486 * Switched to using JSON for hashing, allowing us to use the native JSON.stringify where available, which is much faster
* Added a bunch of utility functions for working with character data and annotations
* Got toolbar button states to follow selection of more than one character
2011-11-21 22:32:22 +00:00
Gabriel Wicke d1b0293569 Fix comment token conversion and serialization 2011-11-21 09:22:30 +00:00
Gabriel Wicke b750ce38b8 Add node.js-compatible HTML5 parser and hook it up to the PEG tokenizer.
Builds a DOM tree (jsdom) from the tokens and then serializes that using
document.innerHTML. This is all very experimental, so don't be surprised by
rough edges.
2011-11-18 13:57:07 +00:00
Roan Kattouw 35a99b4be0 Make es.TransactionProcessor.remove() handle deep merges correctly, and add test cases. The code is still a bit rough and ugly and needs a bit more work, but I'll clean that up later; at least it works now. 2011-11-18 10:17:35 +00:00
Trevor Parscal 48e7f4c3c6 Initial checkin of new es.HistoryModel (needs tests) 2011-11-17 22:44:11 +00:00
Trevor Parscal 6fded56cec Renamed es.Transaction to es.TransactionModel 2011-11-17 22:42:18 +00:00
Roan Kattouw 117c785d85 Improve the merging logic in prepareRemoval() to also allow merging nested nodes, e.g. by deleting </p></li><li><p> 2011-11-17 19:23:15 +00:00
Gabriel Wicke ea87e7aaee Convert PEG parser to tokenizer for back-end HTML parser. Now emits a list of
tokens, which for now is still completely built before parsing can proceed.
For each top-level block, the source start/end positions are added as
attributes to the top-most tokens. No tracking of wiki vs. html syntax yet.
2011-11-17 15:26:02 +00:00
Roan Kattouw be994da373 Make selectNodes() also descend (recurse) into child nodes when only the start or only the end is in the middle of a child node. Without this, it was stuff like ranges with only openings and no closings. 2011-11-17 15:01:47 +00:00
Roan Kattouw 2c21250c70 Make selectNodes() not return an empty array when encountering a zero-length selection in a structural location (we don't do this for zero-length selections in content locations either, and the empty array is breaking an assumption I was making in my prepareRemoval rewrite) 2011-11-17 14:50:38 +00:00
Roan Kattouw ef478bfe7b Make the selectNodes tests log their failures to the console 2011-11-17 14:42:14 +00:00
Roan Kattouw e8899405e9 Fix bug in compare() which caused it to return true for arrays of unequal length (!!) 2011-11-17 14:21:39 +00:00
Trevor Parscal b89d7d7eeb Removed some accidental globals 2011-11-16 23:32:57 +00:00
Trevor Parscal cd033e02c4 Renamed es.DocumentNode.test to es.DocumentBranchNode.test to reflect its contents 2011-11-16 19:35:18 +00:00
Trevor Parscal 607e0fe3ad Fixed test data in response to r103354 2011-11-16 19:26:16 +00:00
Roan Kattouw cb8a14b954 Add test cases to illustrate the breakage in r103271 2011-11-16 19:07:17 +00:00
Trevor Parscal 8a2e8b4aab Rewrote prepareRemoval to support dropping nodes that are considered droppable (not tableCells) and are covered completely by the range - otherwise nodes are stripped of content 2011-11-16 00:03:17 +00:00
Trevor Parscal 79ef19da42 Fixed documentation and use of es.arrayIndexOf to match the actual API of $.inArray (value, array, fromIndex). Renamed function to inArray to reduce confusion about how the function works. 2011-11-15 18:17:26 +00:00
Roan Kattouw c9d2bd84d1 Rewrite traverseLeafNodes tests using a data provider pattern 2011-11-15 14:35:03 +00:00
Roan Kattouw 2a80194223 Add more tests 2011-11-15 13:50:24 +00:00
Roan Kattouw fee2d48b2b Add very basic implementation of traverseNodes(), with tests. This doesn't respect the from parameter (so tests 3-6 fail); I will rewrite it from recursive to iterative so it can support that. 2011-11-15 11:12:06 +00:00
Roan Kattouw 8563e7e451 Add FIXME comment for a failing test and fix a typo in its description 2011-11-15 10:15:52 +00:00
Trevor Parscal ff07930171 Added test for prepareRemoval which fails atm, because strip doesn't drop nodes that are covered completely. Also cleaned up some comments in prepareRemoval 2011-11-15 01:15:21 +00:00
Trevor Parscal 482d477449 Added test for prepareRemoval which fails atm, because strip doesn't drop nodes that are covered completely. 2011-11-15 01:04:37 +00:00
Trevor Parscal ba64cfaf46 Moved tests for es.TransactionProcessor to their own file 2011-11-14 23:10:00 +00:00
Trevor Parscal 2494c40297 Moved transaction processing code to new class, es.TransactionProcessor 2011-11-14 23:04:36 +00:00
Trevor Parscal 713a80596d Added es.DocumentLeafNode, which like es.DocumentBranchNode is a mixin-like class (we may want to switch to using a more natural composition mechanism than es.extendClass in the future) - now es.DocumentNode also has an abstract method called hasChildren which returns a boolean and can indicate if a node is a leaf or a branch. 2011-11-10 19:26:02 +00:00
Roan Kattouw a4f71ace69 Rewrite the remove() function in es.DocumentModel.operations such that the tests added in r102564 pass now 2011-11-10 15:50:59 +00:00
Roan Kattouw aa7a6e2605 Add globalRange property to the output of selectNodes(), which translates the range property to be relative to the root rather than to the node. Update tests for this, and fix the test case numbering for selectNodes 2011-11-10 13:15:55 +00:00
Trevor Parscal 4bf41fc3e8 Updated tests and test data to support listItem nodes being branches instead of leafs 2011-11-09 23:39:47 +00:00
Roan Kattouw 25a04133b0 Add test cases for inserting a paragraph break (</p><p>) in the middle of a paragraph. Interestingly, committing this insertion actually works, but rolling it back doesn't. 2011-11-09 20:24:13 +00:00
Trevor Parscal cd18698bbc Moved es.DocumentModelBranchNode tests to their own file 2011-11-04 21:16:20 +00:00
Trevor Parscal add7c23191 Added es.Transaction.optimize and added in a test that neilk sent a patch for 2011-11-04 20:38:47 +00:00
Roan Kattouw 124a36b942 Add a metric ton of (mostly generated) selectNodes tests, and change selectNodes a little bit to make them pass 2011-11-04 20:27:23 +00:00
Roan Kattouw 84c6b8925a Refactor the large data objects in es.DocumentModel.test.js out to es.testData.js so they can be shared with other tests 2011-11-04 20:11:51 +00:00
Trevor Parscal 04b7e80096 Prepare removal tests are working now that DocumentModelNode objects have a type property 2011-11-04 18:31:22 +00:00
Trevor Parscal 4963b05e14 Split tests up by method 2011-11-04 18:08:51 +00:00
Trevor Parscal 4d7cbded2c Minor cleanup 2011-11-04 17:54:02 +00:00
Trevor Parscal b6420fd327 Changed from using the Hype code-name to EditSurface 2011-11-04 17:47:54 +00:00
Trevor Parscal 36c6bee0a8 Moved es tests to their own folder 2011-11-04 17:47:09 +00:00
Trevor Parscal fcb3644f35 Reorganized a few methods to reduce duplication, improved documentation 2011-11-04 17:07:44 +00:00
Gabriel Wicke 06ca9f12fe Rename definitiondata to definitiondescription, minor fixes 2011-11-04 12:25:01 +00:00
Gabriel Wicke 63398b5749 Update parserTests to latest serializers 2011-11-04 07:45:05 +00:00