673 commits

Author SHA1 Message Date
Roan Kattouw 6832be68ed Fix test #30: was failing because getScope() was broken and insert() didn't account for the case of inserting something like </list><list> at a structural offset. All tests are now passing, yay!
* Fix getScope()
** Drop the -1 which caused the result to be off by one level
** Prevent JS errors from occurring if bad input causes the loop to try to traverse up above the root node
* insert()
** Detect the case where the input data tries to close the containing element; in that case, we'll get scope != node
** Move the getNodeFromOffset() and getScope() calls up and out of the conditionals
** Remove unnecessary parent==model conditional, no longer needed now that getScope() can safely handle things that try to traverse too far up
** Add some comments to explain what's going on. I'll restructure this function a bit more shortly
2012-01-31 16:43:21 +00:00
Neil Kandalgaonkar 2688f823ef added dependencies to README 2012-01-31 00:56:07 +00:00
Neil Kandalgaonkar f0b934ef2e first pass at an API method that returns wikidom. Shells out to node. Some issues with XML API result formatting but works fine in JSON 2012-01-31 00:02:48 +00:00
Gabriel Wicke 7cd94df47d A few minor tweaks to reduce memory usage 2012-01-27 13:32:44 +00:00
Trevor Parscal 94f7d79eb7 Skip traversal of leaf nodes if there aren't any children 2012-01-23 18:46:31 +00:00
Gabriel Wicke 4e6a54560a * Emit token chunks for top-level block elements by patching the source of the
tokenizer
* Fix a bug uncovered by this
* Increase the number of outstanding listeners on a single download to 10000
2012-01-22 23:21:53 +00:00
Gabriel Wicke 7ea4d7d3db A few parser function fixes and maximum template expansion in environment
config.
2012-01-22 19:32:28 +00:00
Gabriel Wicke 561cf3c237 Bug fixes and a first stab at a #time parser function. You can expand the main
page like this:

cd extensions/VisualEditor/modules/parser
echo '{{:Main Page}}' | node parse.js
echo '{{:Main Page}}' | node parse.js --html
echo '{{:Main Page}}' | node parse.js --debug

Even the date-based includes work somewhat, although they don't yet accept
passed-in dates.
2012-01-22 07:07:16 +00:00
Gabriel Wicke 60e45bb739 A bit of template expansion bug fixing and parser function documentation 2012-01-22 01:27:22 +00:00
Gabriel Wicke e8a7034acf Add some commandline switches to parse.js. Supports switching on/off debug
mode and a selection of html/WikiDom serialization.
2012-01-21 22:42:54 +00:00
Gabriel Wicke 785a4af76f Implement a few parser functions. 220 parser tests now passing. 2012-01-21 20:38:13 +00:00
Gabriel Wicke 1a6546fbca Support empty template arguments and default values in arg expansion 2012-01-21 03:03:33 +00:00
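
Note on the commit above: a minimal sketch of the wikitext rule for template arguments with defaults, `{{{name|default}}}`. The function name `expandArg` and the plain-object argument map are assumptions made for illustration, not the code from this commit. It shows why an empty argument has to be told apart from a missing one: an argument passed as empty expands to the empty string, and only a missing argument falls back to the default.

```javascript
// Hypothetical helper, not taken from the repository.
// args: map of argument name -> value as passed by the template call.
// defaultValue: the part after '|' in {{{name|default}}}, or undefined.
function expandArg(args, name, defaultValue) {
  // An argument that was passed wins, even if its value is ''.
  if (Object.prototype.hasOwnProperty.call(args, name)) {
    return args[name];
  }
  // Not passed: use the default if one was written.
  if (defaultValue !== undefined) {
    return defaultValue;
  }
  // Not passed and no default: the argument reference stays literal.
  return '{{{' + name + '}}}';
}
```

With `{{foo|a=}}`, `{{{a|fallback}}}` expands to the empty string, while `{{{b|fallback}}}` expands to `fallback`.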
Gabriel Wicke fdd048b3b2 Remove a few stray debug prints and disable debugging in parse.js 2012-01-20 22:21:33 +00:00
Gabriel Wicke 145df2655c * NoInclude and IncludeOnly improvements
* Tokenizer support for templates and template args in template arguments and titles
* Async attribute expansion fixes
2012-01-20 22:02:23 +00:00
Gabriel Wicke 348cac6cf0 Fix a bug in TokenCollector, and misc tweaks for template expansions. 2012-01-20 18:47:17 +00:00
Gabriel Wicke 7cc8e69147 Collapse all requests per template into a single outstanding request using an
event-emitting TemplateRequest object and a request queue.
2012-01-20 02:36:18 +00:00
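
Note on the commit above: a rough sketch of the idea it describes, collapsing concurrent fetches of one template into a single outstanding request. Only the name `TemplateRequest` comes from the commit message. Its interface here, the `RequestQueue` class, the `'src'` event name and the `fetchFn(title, callback)` signature are all assumptions for illustration.

```javascript
const { EventEmitter } = require('events');

// Hypothetical request object: one per template title, emits 'src' once.
class TemplateRequest extends EventEmitter {
  constructor(title) {
    super();
    this.title = title;
    // Many waiters may share one request; 0 disables the listener warning.
    this.setMaxListeners(0);
  }

  // fetchFn(title, callback) is an assumed signature for the real fetcher.
  start(fetchFn) {
    fetchFn(this.title, (err, src) => this.emit('src', err, src));
  }
}

// Hypothetical queue: at most one outstanding request per title.
class RequestQueue {
  constructor(fetchFn) {
    this.fetchFn = fetchFn;
    this.pending = new Map(); // title -> TemplateRequest
  }

  fetch(title, callback) {
    let request = this.pending.get(title);
    const isNew = !request;
    if (isNew) {
      request = new TemplateRequest(title);
      this.pending.set(title, request);
      // Forget the request once it completes, so later calls refetch.
      request.once('src', () => this.pending.delete(title));
    }
    request.once('src', callback);
    // Start only after the listeners are attached, so a fetchFn that
    // calls back synchronously cannot fire before anyone is listening.
    if (isNew) {
      request.start(this.fetchFn);
    }
  }
}
```

Every caller that asks for the same title while the request is outstanding attaches to the same emitter and receives the same `(err, src)` pair.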
Gabriel Wicke fc2088bb21 Add some rudimentary noinclude / includeonly support and fix up
TokenCollector.
2012-01-20 01:46:16 +00:00
Gabriel Wicke c15e0d4167 Minor cleanup in TemplateHandler 2012-01-20 00:49:27 +00:00
Gabriel Wicke d0ece16c86 Fix async template expansion, so we can now render simple pages with templates
directly to WikiDom from enwiki using a commandline like this:

  echo '{{User:GWicke/Test}}' | node parse.js

Wohoo!

Complex pages with templates won't render properly yet, as noinclude /
includeonly and parser functions are not yet implemented. As a result, the
parser will run out of memory or hit the currently low expansion depth limit
as it tries to expand documentation for all templates.
2012-01-19 23:43:39 +00:00
Gabriel Wicke 2233d0a488 Eventify parser tests and parse.js commandline wrapper to actually allow
async template fetching. Async expansion is not yet fully debugged, but at
least the preconditions for that are now there.
2012-01-18 23:46:01 +00:00
Gabriel Wicke 5b8054636e Make template fetching somewhat functional on node with Inez' help, but
disable it by default in parserTests as it tries to fetch all sorts of parser
functions and is not yet fully supported in parserTests. The next step will be
to build a list of parser functions (to avoid fetching them as templates) and
push the event interface into parserTests.
2012-01-18 19:38:32 +00:00
Gabriel Wicke 4bd4307924 Fix comment to reflect the actual regexp/spec in the JS version as well. 2012-01-18 19:35:13 +00:00
Gabriel Wicke 14e6728cc4 Add the start of a minimal sanitizer stage, that only strips IDN ignored
characters from host portions of link hrefs for now. This module needs to be
filled up with pretty much everything Sanitizer.php does, including tag and
attribute whitelists and attribute value sanitation (especially for style
attributes).

We'll also need to think about round-tripping of sanitized tokens.
2012-01-18 01:42:56 +00:00
Gabriel Wicke 336be4f617 Eat '[[[' as plain text token, makes it 212 passing. 2012-01-18 00:23:17 +00:00
Gabriel Wicke 178adbc342 Accept IPv6 (and IPv4) addresses in the tokenizer, so another test passes. 2012-01-18 00:00:47 +00:00
Gabriel Wicke e7381da5b8 Trim whitespace off template titles and argument names. 209 parser tests now
passing.
2012-01-17 23:18:33 +00:00
Gabriel Wicke f50fecf1e3 Fix template argument expansion. 200 parser tests now passing. 2012-01-17 22:29:26 +00:00
Gabriel Wicke 34025251a3 Clean up 'END' token handling a bit. 2012-01-17 20:01:21 +00:00
Gabriel Wicke 7f579398c7 Use isBlockTag in DOMPostProcessor 2012-01-17 18:30:22 +00:00
Gabriel Wicke 6bd7ca1e75 Misc improvements, now 196 parser tests passing.
* Add handler for post-expand paragraph wrapping on token stream, to handle
  things like comments on their own line post-expand
* Add general Util module
* Fix self-closing tag handling in HTML5 tree builder
2012-01-17 18:22:10 +00:00
Gabriel Wicke f4081bef08 First template expansion tests start working, and a bug fix in
DOMPostProcessor paragraph wrapper. 187 parser tests now passing.
2012-01-14 00:58:20 +00:00
Gabriel Wicke 196d704e8e Template expansion now enabled and somewhat working, but template fetching
still fails all the time.
2012-01-13 18:48:25 +00:00
Gabriel Wicke 32c9bccd7c Results of early template expansion debugging. Still disabled by default, but
getting closer.
2012-01-11 19:48:49 +00:00
Gabriel Wicke 6b6ec2933d More work towards template expansion.
* Created AttributeTokenTransformManager for generic attribute conversion, and
  removed { title, template argument {key, value} } expansion from
  TemplateHandler.
* Added caching for attribute and input sub-pipelines. Especially attribute
  pipelines would otherwise be recreated for each attribute value and key.
2012-01-11 00:05:51 +00:00
Gabriel Wicke 5ec30252f1 More token transform and pipeline setup refactoring to support template
expansion better.
2012-01-10 01:09:50 +00:00
Gabriel Wicke 287604c422 A bit of cleanup in ParserPipeline, with better and more consistent support
for multiple input types.
2012-01-09 19:33:49 +00:00
Gabriel Wicke becf3cb7ea Add generic 'collect all tokens between delimiter tokens and call a transform
function on it' util for synchronous transformation phases. This can be used
to implement parser hooks (aka extension tags) besides other things.
2012-01-09 18:13:45 +00:00
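
Note on the commit above: a small sketch of a synchronous "collect tokens between delimiters, then transform" utility of the kind the message describes. The token shape (`type`, `name`, `value`), the function name `collectBetween`, and the handling of an unclosed delimiter are assumptions for illustration. The real utility in the repository may differ.

```javascript
// Hypothetical token shape: { type: 'TAG' | 'ENDTAG' | 'TEXT', name, value }.
// Collects every run from <name> to </name> (delimiters included) and
// replaces it with whatever transform(run) returns.
function collectBetween(tokens, name, transform) {
  const output = [];
  let collected = null; // non-null while inside a delimited run

  for (const token of tokens) {
    if (collected === null) {
      if (token.type === 'TAG' && token.name === name) {
        collected = [token]; // opening delimiter starts a run
      } else {
        output.push(token); // outside a run: pass through
      }
    } else {
      collected.push(token);
      if (token.type === 'ENDTAG' && token.name === name) {
        output.push(...transform(collected)); // closing delimiter ends it
        collected = null;
      }
    }
  }

  // Assumption: an unclosed run is passed through unchanged.
  if (collected !== null) {
    output.push(...collected);
  }
  return output;
}
```

An extension tag handler such as Cite could then be written as a `transform` that receives the tokens from `ref` to `/ref` and returns replacement tokens.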
Gabriel Wicke e99d7a2a55 Two batteries worth of token transform manager refactoring.
* TokenTransformDispatcher is now renamed to TokenTransformManager, and is
  also turned into a base class
* SyncTokenTransformManager and AsyncTokenTransformManager subclass
  TokenTransformManager and implement synchronous (phase 1,3) and asynchronous
  (phase 2) transformation stages.
* Communication between stages uses the same chunk / end events as all the
  other token stages.
* The AsyncTokenTransformManager now supports the creation of nested
  AsyncTokenTransformManagers for template expansion.
  The AsyncTokenTransformManager object takes on the responsibilities of a
  preprocessor frame. Transforms are newly created (or potentially resurrected
  from a cache), so that transforms do not have to worry about concurrency.
* The environment is pushed through to all transform managers and the
  individual transforms.
2012-01-09 17:49:16 +00:00
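
Note on the commit above: a much simplified sketch of stages that talk to each other through `chunk` and `end` events, as the message describes. The `Stage` class, its `listenOn` method and the per-token `transform` callback are hypothetical names. The real TokenTransformManager classes carry far more (phases, nested managers, the environment).

```javascript
const { EventEmitter } = require('events');

// Hypothetical pipeline stage: consumes 'chunk' / 'end' events from an
// upstream emitter and re-emits transformed chunks downstream.
class Stage extends EventEmitter {
  // transform(token) returns an array of zero or more output tokens.
  constructor(transform) {
    super();
    this.transform = transform;
  }

  listenOn(upstream) {
    upstream.on('chunk', (tokens) => {
      const transformed = tokens.flatMap((token) => this.transform(token));
      this.emit('chunk', transformed);
    });
    // Forward 'end' so later stages can clean up and be reused.
    upstream.on('end', () => this.emit('end'));
    return this;
  }
}
```

Because every stage exposes the same two events, stages can be chained in any order, for example `new Stage(f).listenOn(new Stage(g).listenOn(tokenizer))`, where `tokenizer` is any emitter producing `chunk` and `end`.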
Gabriel Wicke 6601c544e6 Handle default for template arg expansion, add template fetch functionality
and tweak a few minor things in the grammar and QuoteTransformer.
2012-01-06 17:19:14 +00:00
Gabriel Wicke f0c844f28f Add template expansion handler skeleton, not yet functional. Also note
improvements needed in the tokenizer template handling.
2012-01-06 14:30:55 +00:00
Mark A. Hershberger 381551e039 w/s 2012-01-04 17:46:24 +00:00
Mark A. Hershberger 3a9a4cf322 re r106536 remove !transparent 2012-01-04 17:44:14 +00:00
Gabriel Wicke 2e35171fd1 Fix quote handling and tweak the whitelist a bit. 'any' token registrations
are now merged with specific registrations by rank. Not yet clear if that is a
good idea overall, need to check use cases when implementing template expansion
and other functionality.

183 parser tests now passing.
2012-01-04 14:09:05 +00:00
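
Note on the commit above: a sketch of what merging 'any' registrations with specific registrations by rank could look like. The class name `TinyDispatcher`, the method names, the `type:name` key, and the convention that a lower rank runs first are assumptions for illustration, not the dispatcher's actual API.

```javascript
// Hypothetical dispatcher: handlers register either for 'any' token or
// for a specific (type, name) pair, each with a numeric rank.
class TinyDispatcher {
  constructor() {
    this.anyHandlers = [];
    this.specificHandlers = new Map(); // 'type:name' -> entries
  }

  add(handler, rank, type, name) {
    const entry = { handler, rank };
    if (type === 'any') {
      this.anyHandlers.push(entry);
      return;
    }
    const key = type + ':' + name;
    if (!this.specificHandlers.has(key)) {
      this.specificHandlers.set(key, []);
    }
    this.specificHandlers.get(key).push(entry);
  }

  // Merge both lists and order them by rank (assumed: lower runs first).
  // Array.prototype.sort is stable in current Node.js, so on equal rank
  // the 'any' handlers stay ahead of the specific ones.
  handlersFor(token) {
    const key = token.type + ':' + token.name;
    const specific = this.specificHandlers.get(key) || [];
    return this.anyHandlers
      .concat(specific)
      .sort((a, b) => a.rank - b.rank)
      .map((entry) => entry.handler);
  }
}
```

With a merge like this, an 'any' handler no longer always runs before specific handlers; its position depends only on its rank.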
Gabriel Wicke 6cd95fea37 Fix up constructors in EventEmitter inheritance and tweak a few more comments. 2012-01-04 12:28:41 +00:00
Gabriel Wicke e3ae9a702b Fix JSHint warnings (mostly about comment indentation) from r108012. 2012-01-04 11:06:24 +00:00
Gabriel Wicke 4c4a24f0a0 Hook up the DOMPostProcessor using events as well, and rename the subscription
methods to tell a story. Also document idea on how to dynamically configure
the pipeline depending on event registrations in comment.
2012-01-04 11:00:54 +00:00
Gabriel Wicke f0399d2ec5 Clean up comments in TokenTransformDispatcher and mark private methods with
underscore.
2012-01-04 09:48:24 +00:00
Gabriel Wicke ee79158e53 Add trailing newline in commandline parser wrapper 2012-01-04 08:42:53 +00:00
Gabriel Wicke 29362cc53c Rename ParseThingy to ParserPipeline and fix up broken WikiDom generation and
commandline runner.
2012-01-04 08:39:45 +00:00
Gabriel Wicke bd98eb4c5a Land big TokenTransformDispatcher and eventization refactoring.
The TokenTransformDispatcher now actually implements an asynchronous, phased
token transformation framework as described in
https://www.mediawiki.org/wiki/Future/Parser_development/Token_stream_transformations.

Additionally, the parser pipeline is now mostly held together using events.
The tokenizer still emits a lame single event with all tokens, as block-level
emission failed with scoping issues specific to the PEGJS parser generator.
All stages clean up when receiving the end tokens, so that the full pipeline
can be used for repeated parsing.

The QuoteTransformer is not yet 100% fixed to work with the new interface, and
the Cite extension is disabled for now pending adaptation. Bold-italic related
tests are failing currently.
2012-01-03 18:44:31 +00:00
Neil Kandalgaonkar 9d198ecad6 when nothing to undo or redo, grey out appropriate buttons - fix bug #33112, based on patch from ashish.dubey91@gmail.com 2011-12-31 01:44:34 +00:00
Neil Kandalgaonkar 20374b5911 fix substr for IE, followup r107464 2011-12-30 21:51:03 +00:00
Gabriel Wicke 8e00a72d0a Improvements to link trail handling, and two tweaks to the whitelist. 182
tests now passing. 

Link trails depend on language-dependent positive character classes in the PHP
parser. These classes all seem to disallow punctuation implicitly and list
differing plain text characters instead, so it might be possible to get away
with identifying a common class of non-trail punctuation instead. This would
help to keep the tokenizer independent of configurations, which is very
desirable for caching and simplified external parsing.
2011-12-30 12:47:06 +00:00
Gabriel Wicke 11ece76b7b Fix suffix handling for wiki links. 2011-12-30 09:35:57 +00:00
Gabriel Wicke b3a0270d69 Remove env and load grammar in tokenizer constructor. Re-add property hack to
keep parserTests running for now. Really need a different pipeline for html
serialization or a reference to the HTML DOM.
2011-12-28 17:04:16 +00:00
Gabriel Wicke 3a63fb118e Add a few comments inline, and remove unneeded html serialization as we are
only interested in WikiDom output in this parser wrapper.
2011-12-28 13:46:52 +00:00
Neil Kandalgaonkar 8fbf36e63e put adding the terminal token inside tokenize method (will pull it out again for streaming interface) 2011-12-28 01:37:15 +00:00
Neil Kandalgaonkar 6103646ec8 remove need to add newline at end of input 2011-12-28 01:37:11 +00:00
Neil Kandalgaonkar 4158f82d7e refactor parser to ParseThingy in different module, can be invoked with command line utility parse.js 2011-12-28 01:37:06 +00:00
Neil Kandalgaonkar d91a67ba99 nodeName not defined 2011-12-28 01:36:54 +00:00
Neil Kandalgaonkar 962d1262fc create tokenizer without need to modify namespace with PEG source 2011-12-28 01:36:36 +00:00
Gabriel Wicke 33e60dd4d9 Update comments a bit. 2011-12-22 12:37:24 +00:00
Gabriel Wicke 9ee0e660ec Fix regression introduced by r107060 for regular table cells. Good to have a
test suite ;)
2011-12-22 12:09:25 +00:00
Gabriel Wicke a94d0ec10c Re-add support for row-only tables. 2011-12-22 11:58:32 +00:00
Gabriel Wicke 1c7fe0eb34 Refactor table productions to support table fragments in templates (table
start / row / table end). The old productions are not deleted yet to make it
easy to compare the output on more complex articles. 181 tests passing after
adding two table tests with whitespace-only differences to the whitelist.
2011-12-22 11:43:55 +00:00
Gabriel Wicke 2845ba9552 Handle noinclude and includeonly at start of line, so that syntax after it
still matches as if it actually was preceded by a newline.
2011-12-21 11:38:50 +00:00
Mark A. Hershberger 752130ab74 Bug 33113 - Have buttons that are grayed out disabled completely
Author: joan.creus.c@gmail.com
2011-12-17 23:58:21 +00:00
Gabriel Wicke 3a631db6d9 Fix ranges for annotations in implicit paragraphs within branch nodes. 2011-12-16 19:36:04 +00:00
Gabriel Wicke cc06551f2e Rename table_header production to table_heading. Those non-natives strike again. 2011-12-16 19:24:59 +00:00
Gabriel Wicke 605ed23fd2 Fix attributes in table headings. 2011-12-16 19:22:13 +00:00
Gabriel Wicke 08255ff3e6 Small bug fix to heading level, spotted by Mike from localwiki, thanks! 2011-12-15 23:59:35 +00:00
Gabriel Wicke a04744b2ec Add some more attribute remapping capabilities to the DOMConverter, and clean
up some grammar formatting.
2011-12-15 17:33:07 +00:00
Gabriel Wicke e98dd9e722 Implement 1-char-minimum width for annotations, and some additional minor
cleanup.
2011-12-15 11:05:52 +00:00
Gabriel Wicke 22ba27295b Clean up the DOMConverter a bit. 2011-12-15 10:55:30 +00:00
Gabriel Wicke e72dee76e4 Follow-up to r106208 and r106207. Both good catches, thanks Yair! As this code
is in its early stages and nowhere near deployment, please Be Bold and just
commit things like this directly! IMHO it makes more sense to fully review this
once it settles down a bit.
2011-12-15 10:13:50 +00:00
Gabriel Wicke 3585bd9c8e Accept row-only tables. The parser now eats [[en:Barack Obama]] as-is. Hooray! 2011-12-15 00:39:28 +00:00
Gabriel Wicke 6df94a34a1 Less lust for urls 2011-12-15 00:26:22 +00:00
Gabriel Wicke ce2ee067f7 Minor tweak to wiki link production 2011-12-15 00:12:58 +00:00
Gabriel Wicke 377226a120 Comment out a stray console.log 2011-12-14 23:44:58 +00:00
Gabriel Wicke 574abd9774 A collection of small bug fixes to the grammar, Cite, the Token format
converter and the HTML DOM -> WikiDom converter. The tokenizer now digests all
parserTests.
2011-12-14 23:38:46 +00:00
Trevor Parscal 0342eb034d Fixed help panel content where we claimed the alt key was to be used for word/block selection, but it should have been ctrl/option key - also changed clt to ctrl. 2011-12-14 19:15:02 +00:00
Trevor Parscal 64754b8200 Added autocapitalize="off" attribute to text area input so iOS doesn't capitalize everything. 2011-12-14 18:54:36 +00:00
Gabriel Wicke dc77d73ad5 Add ability to pass through JSON data to WikiDom in data-json-* attributes,
and fix parser to actually parse the Barack Obama article except for one table
with nested templates at the start-of-line.
2011-12-14 17:25:09 +00:00
Gabriel Wicke f6e4267fca Handle a few more element types, and reset offset for each leaf node. Not sure
if the latter is correct, as the documentation at
https://www.mediawiki.org/wiki/Visual_editor/Software_design#Data_Structures
and the actual sample WikiDom in the editor sandbox seem to disagree on this
point.
2011-12-14 16:22:27 +00:00
Gabriel Wicke 6676a47008 Add implicit level attribute to WikiDom headings. 2011-12-14 15:55:58 +00:00
Gabriel Wicke 3018ca690b Improve WikiDom conversion: Handle text and annotations in branch nodes as
paragraphs and treat list items as branches.
2011-12-14 15:40:40 +00:00
Gabriel Wicke a09aa4d599 Add rough HTML DOM to WikiDom conversion. You can see serialized WikiDom of
parser tests using 'node parserTests.js --wikidom'.
2011-12-14 15:15:41 +00:00
Gabriel Wicke 5f80d30428 Clean up access to document and body after building the tree. 2011-12-14 09:40:49 +00:00
Gabriel Wicke 30749b8d8d Update comments a bit and add a note on things to improve in API. 2011-12-14 09:33:25 +00:00
Neil Kandalgaonkar 932eade938 add buglist and bug reporting links to feedback form 2011-12-14 01:32:07 +00:00
Trevor Parscal 74ff2981cf Added blur handler for window which resets the shift key tracker 2011-12-13 23:22:19 +00:00
Trevor Parscal 8e10485a0c Fixes issue with r106123 where creating new links wasn't working anymore with the button 2011-12-13 23:15:31 +00:00
Trevor Parscal fef6d5525e - Added auto-link selection when opening the link editor without selecting any text
- Resolves bug #33049
2011-12-13 23:12:27 +00:00
Gabriel Wicke 55ff272847 Comment TokenTransformDispatcher. 2011-12-13 20:13:09 +00:00
Gabriel Wicke 44deefe303 Minor tweak to comment. 2011-12-13 18:55:44 +00:00
Gabriel Wicke c61b32eaa7 Clean up and comment the Cite extension a bit. 2011-12-13 18:45:09 +00:00
Trevor Parscal 7e3401b777 Renamed, merged and disabled some example documents 2011-12-13 17:49:42 +00:00
Trevor Parscal acb7d042d2 Updated icons (includes new help icon) 2011-12-13 17:39:36 +00:00
Gabriel Wicke feee9ded9f Convert the Cite extension to a token stream transformer.
This required a few further additions to the TokenTransformDispatcher. In
particular, there is now an 'any' token match whose callbacks are executed
before more specific callbacks. This is used by the Cite extension to eat all
tokens between ref and /ref tags. This need is very common, so should be
broken out to an intermediate layer in the future.

In general, the requirements for the TokenTransformDispatcher API are now
clearer, and the API should likely be cleaned up / simplified.
2011-12-13 14:48:47 +00:00
Gabriel Wicke 8e55e79b67 Rename TokenTransformer to TokenTransformDispatcher. 2011-12-13 11:45:12 +00:00