962 commits

Author SHA1 Message Date
Gabriel Wicke d918fa18ac Big token transform framework overhaul part 2
* Tokens are now immutable. The progress of transformations is tracked on
  chunks instead of tokens. Tokenizer output is cached and can be directly
  returned without a need for cloning. Transforms are required to clone or
  newly create tokens they are modifying.

* Expansions per chunk are now shared between equivalent frames via a cache
  stored on the chunk itself. Equivalence of frames is not yet ideal though,
  as right now a hash tree of *unexpanded* arguments is used. This should be
  switched to a hash of the fully expanded local parameters instead.

* There is now a vastly improved maybeSyncReturn wrapper for async transforms
  that either forwards processing to the iterative transformTokens if the
  current transform is still ongoing, or manages a recursive transformation if
  needed.

* Parameters for parser functions are now wrapped in abstract Params and
  ParserValue objects, which support some handy on-demand *value* expansions.
  Keys are always expanded. Parser functions are converted to use these
  interfaces, and now properly expand their values in the correct frame.
  Making this expansion lazier is certainly possible, but would complicate
  transformTokens and other token-handling machinery. Need to investigate if
  it would really be worth it. Dead branch elimination is certainly a bigger
  win overall.

* Complex recursive asynchronous expansions should now be closer to correct
  for both the iterative (transformTokens) and recursive (maybeSyncReturn
  after transformTokens has returned) code paths.

* Performance degraded slightly. There are no micro-optimizations done yet
  and the shared expansion cache still has a low hit rate. The progress
  tracking on chunks is not yet perfect, so there are likely a lot of unneeded
  re-expansions that can be easily eliminated. There is also more debug
  tracing right now. Obama currently expands in 54 seconds on my laptop.

Change-Id: I4a603f3d3c70ca657ebda9fbb8570269f943d6b6
2012-05-15 17:05:47 +02:00
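
The "tokens are now immutable" rule in d918fa18ac can be illustrated with a minimal sketch. It is not the project's code: the token shape ({ type, name, attribs }) and the helpers makeToken, withAttribute and markLinks are hypothetical names chosen for this example. It only shows the general idea that a transform builds a new token when it needs a change, so cached tokenizer output can be shared without cloning.

```javascript
// Hypothetical token shape { type, name, attribs }; not the project's real classes.
function makeToken(type, name, attribs) {
  // Freeze the token, its attribute list and each attribute so shared copies
  // (for example cached tokenizer output) cannot be mutated by accident.
  return Object.freeze({
    type: type,
    name: name,
    attribs: Object.freeze((attribs || []).map(function (kv) {
      return Object.freeze({ k: kv.k, v: kv.v });
    }))
  });
}

// A transform that needs a change creates a new token instead of mutating.
function withAttribute(token, key, value) {
  const attribs = token.attribs.filter(function (kv) { return kv.k !== key; });
  attribs.push({ k: key, v: value });
  return makeToken(token.type, token.name, attribs);
}

// Example transform: add an attribute to every 'a' tag and pass all other
// tokens through untouched.
function markLinks(tokens) {
  return tokens.map(function (token) {
    if (token.type === 'tag' && token.name === 'a') {
      return withAttribute(token, 'rel', 'nofollow');
    }
    return token; // unchanged tokens are shared, not cloned
  });
}

const cached = [
  makeToken('tag', 'a', [{ k: 'href', v: '/wiki/Foo' }]),
  makeToken('tag', 'b')
];
const out = markLinks(cached);
```

Here `out[1]` is the same object as `cached[1]`, while `out[0]` is a fresh token and `cached[0]` keeps its original attributes.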
Catrope c256ea7d71 Fix fatal error in parse.js
Trying something trivial like echo 'Hello world' | node parse.js
would throw TypeError: Function.prototype.apply: Arguments list has wrong type

Change-Id: Ia0a1154b0f3edbfb1f228a1d2072fced1b147141
2012-05-10 12:04:57 -07:00
Gabriel Wicke b1bd0d73ec Don't eat end token in ListHandler, and lazier Quote handler registration
* Setting the rank on tokens is still used currently, but will be phased out
  in favor of setting it on chunks. Tokens will be immutable to allow sharing
  and caching without a need for cloning.
* Only register for newline and end tokens in QuoteTransformer when active.

Change-Id: I2c45bc7e4a105219a1404ab221eed7f242128f1e
2012-05-10 09:47:53 +02:00
Adam Wight 0a7f0b7630 List markup is created during the sync23 phase.
This makes it possible to transclude list items from a template.

Note: "5 quotes" test is broken by this patch. It appears that ListHandler
newline processing is changing some state which mysteriously affects the
QuoteTransformer.  This is ominous; hopefully there's a simple explanation...

gwicke: fix a bug in tokenizer triggered by definition lists like this:
**; foo : bar

Change-Id: I4e3a86596fe9bffcbfc4bf22895362c3bf742bad
2012-05-08 11:39:36 +02:00
Gabriel Wicke 909633ea08 Improve template / tplarg precedence in tokenizer
Change-Id: If9b24b42ea223e0f30f906a83496d73ec60c4a0d
2012-05-04 13:17:06 +02:00
Gabriel Wicke 8a30f76370 Use upright option, including the 0.75 default width
Change-Id: Iacdf6173e0ee8f58ca4385fd9b2cde77b2fdf3c4
2012-05-04 11:15:35 +02:00
Gabriel Wicke 57dfd89383 Handle upright option properly
Change-Id: I831fcccf874f9a0505e88eb76d269b1d2f68e3e0
2012-05-03 16:15:34 +02:00
Gabriel Wicke c4fc7508a7 Add basic # REDIRECT handling
Change-Id: I71f659201c1d5de4a528ddfac7f65bf20a89f97d
2012-05-03 15:54:36 +02:00
Gabriel Wicke 6ab017308b Only specify the width for thumbnails to keep the aspect ratio
Change-Id: I4e55ff719da6cb58f396ad6043e46acaed4a504d
2012-05-03 15:36:42 +02:00
Gabriel Wicke 6139398494 Reduce debugging overhead a bit, and provide default internal image size
Change-Id: I345af8c5905a5fa747f9ed342ba2ba8c1026d044
2012-05-03 14:49:55 +02:00
Gabriel Wicke 6e21f6bb27 Forward-port Cite extension
* Adapted Cite extension to use current interfaces and token formats
* Improved TokenCollector

Change-Id: I20419b19edd9bbad2c2abf17a2ff1411b99c0c04
2012-05-03 13:22:01 +02:00
Inez Korczynski d6ae8390f5 Get rid of selectionDirection. Introduce getDirection() method in
ve.Range.

Change-Id: Iaf11b2dbfb7ae82a7f54ee205cd6cdc8ee235aef
2012-04-27 17:36:55 -07:00
Inez Korczynski af6a9f9ccc Created a named method inside a Surface (instead of anonymous one) to
handle logic for rangeChange event handler.

Change-Id: Ief32e647f9399e3ea47c5613902cebcbaaf4874c
2012-04-27 17:31:49 -07:00
Gabriel Wicke 2291fe8364 Reduce the need for token cloning slightly
Change-Id: I31c71bddca4855afdffc3fe5c8d759cfa1994d86
2012-04-27 23:12:25 +02:00
Trevor Parscal f19897fefa Merge "Build out ve.Surface constructor to support multiple editor instances Now setting up multiple toolbars per config Tools & Modes are now configurable per toolbar per instance Base elements are created on demand and no longer id specific Note: There are some bugs with multiple instances." 2012-04-27 21:01:10 +00:00
Gabriel Wicke 5fb2c46073 Clone cached tokens, and fix switch for empty needle
Change-Id: I63946e5a56f6fd7dd30d00b12d36032dd1dd0017
2012-04-27 15:59:01 +02:00
Gabriel Wicke ed8cb54831 Simplify transformToken slightly, and fix JSHint warnings
Change-Id: I95769ed063ea855a9109148f5db83ea43f423e56
2012-04-27 15:31:30 +02:00
Gabriel Wicke 2d7b4a2a59 Make .to more consistent and add optional parentCB arg
* parentCB (if set) is called with { async: true } if expansion is going to be
  asynchronous.
* Strings are handled efficiently
* All value parameter chunks can now be converted using .to().

Change-Id: Ib013e1bc3d8e7f692009038209db6a056887326e
2012-04-27 13:57:23 +02:00
Gabriel Wicke fd1a67aa16 Add .to('text/plain/expanded', cb) support and convert ifeq to use it
Change-Id: I99c78de12fed41ba36811402f7ecacb420391d70
2012-04-27 12:18:30 +02:00
Gabriel Wicke 30a83d7fd7 Accept wikilink parameters with dangling equal ('|arg=|')
Change-Id: Ib4f6d186da2a74522b17c377dac5c9a7de7e5861
2012-04-27 11:35:00 +02:00
Gabriel Wicke 1d70e7b81c Disable preformatted text from indents in template args
Change-Id: I84144d3fab6541ed264d9b092806c8bf9de6e8b2
2012-04-27 10:45:08 +02:00
Inez Korczynski f188772259 Introduce new method called "proxy" in surfaceView to avoid using the same
construct with anonymous function over and over.

Change-Id: I1e96cf1efaa6fa5d551fdfa8bb5a80c31e519579
2012-04-26 14:49:12 -07:00
Rob Moen 94479bd79d Build out ve.Surface constructor to support multiple editor instances
Now setting up multiple toolbars per config
Tools & Modes are now configurable per toolbar per instance
Base elements are created on demand and no longer id specific
Note: There are some bugs with multiple instances.

Change-Id: Id0bbbca2d1b76fd2db3f3b0f9abd90194930b610
2012-04-26 11:56:47 -07:00
Gabriel Wicke 56d6757f67 Fixes for the template fetch retry feature
Change-Id: Id36cb02c535d07f4f2cdd54ae682b6a144a2faa9
2012-04-26 20:31:23 +02:00
Gabriel Wicke 027d77e0c9 Fix --wikidom and --linearmodel parse.js options; retry on template fetch failures
Change-Id: I444397936fd87971fe085df4b467089367e9ffa6
2012-04-26 19:51:00 +02:00
Gabriel Wicke 3be4992782 'Obama finally expands' ;) Misc fixes and documentation updates
* [[:en:Barack Obama]] can now be expanded in 77 seconds using 330MB RAM,
  while it would previously run out of RAM after ~30 minutes. Wohoooo!
  The token transform framework rework really paid off.
* 303 parser tests are passing in the new record time of 5.5 seconds. Two more
  tests are passing since these tests expect the day of the week to be
  Thursday.  Won't be the case tomorrow.

Change-Id: I56e850838476b546df10c6a239c8c9e29a1a3136
2012-04-26 18:18:08 +02:00
Gabriel Wicke 8ff810659a Rename text/wiki and tokens/wiki to text/x-mediawiki and similar
Change-Id: I70113629f4633685cd6db3914303a15e4c79a50a
2012-04-25 20:19:43 +02:00
Gabriel Wicke 814511f523 Remove dead parser pipeline code
Change-Id: I802f1798d5163c1ce82d648f739c2e79b17eda41
2012-04-25 17:12:32 +02:00
Gabriel Wicke 5a3f5544a5 Merge "Biggish token transform system refactoring" 2012-04-25 15:07:44 +00:00
Demon 5feb5ebcbf Merge "Fix typo" 2012-04-25 14:51:47 +00:00
Gabriel Wicke 8368e17d6a Biggish token transform system refactoring
* All parser pipelines including tokenizer and DOM stuff are now constructed
  from a 'recipe' data structure in a ParserPipelineFactory.

* All sub-pipelines of these can now be cached

* Event registrations to a pipeline are directly forwarded to the last
  pipeline member to save relatively expensive event forwarding.

* Some APIs for on-demand expansion / format conversion of parameters from
  parser functions are added:

  param.to('tokens/expanded', cb)
  param.to('text/wiki', cb) (this does not work yet)

  All parameters are additionally wrapped into a Param object that provides
  methods for positional parameter naming (.named()) or conversion to a dict
  (.dict()).

* The async token transform manager is now separated from a frame object, with
  the frame holding arguments, an on-demand expansion method and loop checks.

* Only keys of template parameters are now expanded. Parser functions or
  template arguments trigger an expansion on-demand. This (unsurprisingly)
  makes a big performance difference with typical switch-heavy template
  systems.

* Return values from async transforms are no longer used in favor of plain
  callbacks. This saves the complication of having to maintain two code paths.
  A trick in transformTokens still avoids the construction of unneeded
  TokenAccumulators.

* The results of template expansions are no longer buffered.

* 301 parser tests are passing

Known issues:

* Cosmetic cleanup remains to do
* Some parser functions do not support async expansions yet, and need to be
  modified.

Change-Id: I1a7690baffbe8141cadf67270904a1b2e1df879a
2012-04-25 16:51:36 +02:00
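
The on-demand expansion described in 8368e17d6a (keys expanded up front, values expanded only when a parser function asks via .to(format, cb)) can be sketched roughly as follows. This is an illustration under assumptions, not the project's implementation: LazyValue, switchOn and fakeExpand are hypothetical names, and only the 'tokens/expanded' format string is taken from the commit message.

```javascript
// Hypothetical stand-in for a lazily expanded parameter value.
class LazyValue {
  constructor(rawTokens, expand) {
    this.rawTokens = rawTokens; // unexpanded tokens as stored on the parameter
    this.expand = expand;       // hypothetical expander: (tokens, cb) => void
  }

  // Convert to the requested format only when a consumer asks for it.
  to(format, cb) {
    if (format !== 'tokens/expanded') {
      throw new Error('Unsupported format: ' + format);
    }
    this.expand(this.rawTokens, cb);
  }
}

// A switch-like parser function: keys are already expanded strings, and only
// the value of the matching branch is ever expanded.
function switchOn(needle, branches, cb) {
  for (const branch of branches) {
    if (branch.key === needle) {
      branch.value.to('tokens/expanded', cb);
      return;
    }
  }
  cb([]); // no matching branch: empty result
}

// Fake expander that records every call so the saved work is visible.
const expandCalls = [];
const fakeExpand = (tokens, cb) => {
  expandCalls.push(tokens);
  cb(tokens.map((t) => t.toUpperCase()));
};

const branches = [
  { key: 'a', value: new LazyValue(['apple'], fakeExpand) },
  { key: 'b', value: new LazyValue(['banana'], fakeExpand) }
];

let result;
switchOn('b', branches, (tokens) => { result = tokens; });
```

Only the 'b' branch is passed to the expander; the 'a' branch stays unexpanded, which is the effect the commit credits for the speedup on switch-heavy templates.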
Demon 28e44b1d0f Merge "Add --wikidom flag to parse.js" 2012-04-25 14:18:59 +00:00
Catrope 47969e20a1 Add --wikidom flag to parse.js
Also remove unused import of DOMConverter

Change-Id: I1eabe6bf9935970c1f049681b52e867a510ea77a
2012-04-23 15:01:12 -07:00
Trevor Parscal 8ce68e1ac8 Merge "Modify rangeChange event to save selection direction. Renamed Selection method to more suitable name. Misc cleanup Patchset 2, whitespace cleanup Patchset 3: Change values used with selection direction to -1 or 1 1 for left to right (normal) -1 for right to left (opposite) Change-Id: If9ecc721ace1c7550903170f92395947f1ccc22c" 2012-04-20 23:29:21 +00:00
Rob Moen 5fc9f1c7e4 Modify rangeChange event to save selection direction.
Renamed Selection method to more suitable name.
Misc cleanup
Patchset 2, whitespace cleanup
Patchset 3: Change values used with selection direction to -1 or 1
1 for left to right (normal)
-1 for right to left (opposite)
Change-Id: If9ecc721ace1c7550903170f92395947f1ccc22c
2012-04-20 16:27:26 -07:00
Trevor Parscal 29d1ebeca7 Merge "Put a space in the toolbarDropdownTool-label div for default Addresses dropdown tool ui inconsistency on load" 2012-04-20 22:56:36 +00:00
Rob Moen 8398696fe0 Put a space in the toolbarDropdownTool-label div for default
Addresses dropdown tool ui inconsistency on load

Change-Id: I855ac15e939fa895adb67daaeb45aadbac01f10b
2012-04-19 15:31:09 -07:00
Rob Moen 1a68c42049 Modify VE constructor to have the default set of tool configuration
Configuration options are to extend base options in the constructor.

Change-Id: Ic430a6489d8cf9a703e374c3f416feaf0e3d2521
2012-04-19 15:14:57 -07:00
Gabriel Wicke e2ca8c24c7 Delay some token duplication until actual mutation happens
This is a bit better than cloning tokens wholesale, but not by much. There is
a lot of potential for much better per-token caching with reduced token
cloning. Need to map out all dependencies besides token attributes expanded
from template parameters or other scoped state. Even if tokens themselves
don't need transformation, they might still need to be considered for other
token transformers, so simply keeping the final rank won't quite work even if
the token itself is fully transformed. As a minimum, a shallow clone would
need to be made and the rank reset (as in env.cloneTokens).

Change-Id: I4329113bb21750bae9a635229ed1b08da75dc614
2012-04-18 17:53:04 +02:00
Gabriel Wicke bf84638bc0 Add tokenizer cache and clone token state on mutation
* Added an LRU cache (using the lru-cache node module) for tokenizer output
* Mutation of nested attributes now replaces the containers. A shallow copy of
  tokens is sufficient to isolate token transformations. Need to investigate
  if we can actually get away without isolation and re-transformation for most
  ordinary tokens.

Change-Id: I9136b1d7a1fbcc538183a319d4ecaa290d616fdf
2012-04-18 14:40:47 +02:00
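
A rough sketch of the caching idea in bf84638bc0: tokenizer output is kept in an LRU cache, and callers receive shallow copies so that top-level changes made by a transform do not leak back into the cache. The commit uses the lru-cache node module; to stay self-contained this sketch substitutes a small Map-based LRU instead. TinyLRU, fakeTokenize and cachedTokenize are hypothetical names, not the project's API.

```javascript
// Minimal LRU built on Map, which iterates keys in insertion order (oldest first).
class TinyLRU {
  constructor(maxEntries) {
    this.maxEntries = maxEntries;
    this.map = new Map();
  }

  get(key) {
    if (!this.map.has(key)) {
      return undefined;
    }
    // Re-insert the entry so it becomes the most recently used one.
    const value = this.map.get(key);
    this.map.delete(key);
    this.map.set(key, value);
    return value;
  }

  set(key, value) {
    if (this.map.has(key)) {
      this.map.delete(key);
    }
    this.map.set(key, value);
    if (this.map.size > this.maxEntries) {
      // Evict the least recently used entry (the first key in the Map).
      this.map.delete(this.map.keys().next().value);
    }
  }
}

// Stand-in tokenizer that counts how often real work is done.
let tokenizerRuns = 0;
function fakeTokenize(text) {
  tokenizerRuns++;
  return text.split(/\s+/).map((word) => ({ type: 'text', value: word }));
}

const cache = new TinyLRU(2);

function cachedTokenize(text) {
  let tokens = cache.get(text);
  if (tokens === undefined) {
    tokens = fakeTokenize(text);
    cache.set(text, tokens);
  }
  // Shallow-copy each token so top-level mutations by a transform stay local.
  return tokens.map((token) => Object.assign({}, token));
}
```

A shallow copy only isolates top-level properties, which is why the commit also has mutations of nested attributes replace their containers rather than modify them in place.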
Catrope 80e383c346 Merge "Removed line-height from preview panel" 2012-04-17 21:05:48 +00:00
Catrope fa9e02cfad Merge "Improved the appearance of the warning at the top of the editor" 2012-04-17 20:36:44 +00:00
Gabriel Wicke aaca5eac7d More tweaks: safesubst and image options
* Ignore safesubst for now
* Remove an unneeded whitelist entry
* Make sure the caption is not lost for thumbs (fix to last commit) and remove
  debug print

Change-Id: I243584ed0838cf7c3b4110fe9cdf869272477312
2012-04-17 11:02:52 +02:00
Gabriel Wicke 7fe5a86b60 Improve image option handling
Change-Id: If1376766f41ff1288bfe2af19beecd3299c09a01
2012-04-17 10:46:20 +02:00
Catrope 4b6e1401a3 Fix typo
Function was renamed but error message wasn't updated

Change-Id: I61a9effa8dedcbdbc75c5c6842fb05f909561327
2012-04-16 12:20:16 -07:00
Gabriel Wicke afa5b95bc1 Don't work around html5 library tokenizer attribute reordering
The HTML5 parser we are using to normalize expected HTML output in parserTests
reverses the order of attributes (see
https://github.com/aredridel/html5/pull/53 for the fix). Remove whitelist
entries concerned with this and use the proper order in external image
attributes.

Change-Id: If1868cae05396a150757c85a20473ab756cbcd97
2012-04-16 17:09:06 +02:00
Gabriel Wicke c688b039de Collected tweaks
* less verbose logging in noinclude processing and template expansion
* Give priority to the processing of templates transcluded from transclusions
  to get closer to depth-first processing. This serves to minimize memory
  usage from queued-up tokens.
* Increase the maximum outstanding requests per template retrieval. 10000
  amazingly proved too low a limit on some big pages.
* Only process a single template request callback at a time for now
* Add a debug print in the treebuilder wrapper
* Don't treat multiple comments on a single line as a single comment to match
  the PHP parser's behavior

Change-Id: I9a86b6d7bec3b9e1f17415daf1bf74170240721a
2012-04-16 15:47:03 +02:00
Gabriel Wicke 1bf8a9e5e1 Small tweak in comment about onlyinclude forcing buffered expansion
Change-Id: Ib324e24c51c97e07e6737bf23f16db07043b69ab
2012-04-16 15:42:29 +02:00
Gabriel Wicke efd4c026ea Disallow < and > in external link urls
Change-Id: Id865c3d46b33b182bb5b244e77e815c0afd7fa49
2012-04-16 15:36:56 +02:00
Gabriel Wicke 25523f4cf0 Implement urlencode parser function
Change-Id: I4fca3134c9c3eb9a7d6f3360be6de054fb47477c
2012-04-16 14:54:03 +02:00