wikimedia/mediawiki-extensions-VisualEditor

mirror of https://gerrit.wikimedia.org/r/mediawiki/extensions/VisualEditor synced 2024-11-30 09:04:21 +00:00

Author	SHA1	Message	Date
Gabriel Wicke	2d7b4a2a59	Make .to more consistent and add optional parentCB arg * parentCB (if set) is called with { async: true } if expansion is going to be asynchronous. * Strings are handled efficiently * all value parameter chunks can now be converted using .to(). Change-Id: Ib013e1bc3d8e7f692009038209db6a056887326e	2012-04-27 13:57:23 +02:00
Gabriel Wicke	fd1a67aa16	Add .to('text/plain/expanded', cb) support and convert ifeq to use it Change-Id: I99c78de12fed41ba36811402f7ecacb420391d70	2012-04-27 12:18:30 +02:00
Gabriel Wicke	30a83d7fd7	Accept wikilink parameters with dangling equal ('|arg=|') Change-Id: Ib4f6d186da2a74522b17c377dac5c9a7de7e5861	2012-04-27 11:35:00 +02:00
Gabriel Wicke	1d70e7b81c	Disable preformatted text from indents in template args Change-Id: I84144d3fab6541ed264d9b092806c8bf9de6e8b2	2012-04-27 10:45:08 +02:00
Gabriel Wicke	56d6757f67	Fixes for the template fetch retry feature Change-Id: Id36cb02c535d07f4f2cdd54ae682b6a144a2faa9	2012-04-26 20:31:23 +02:00
Gabriel Wicke	027d77e0c9	Fix --wikidom and --linearmodel parse.js options; retry on template fetch failures Change-Id: I444397936fd87971fe085df4b467089367e9ffa6	2012-04-26 19:51:00 +02:00
Gabriel Wicke	3be4992782	'Obama finally expands' ;) Misc fixes and documentation updates * [[:en:Barack Obama]] can now be expanded in 77 seconds using 330MB RAM, while it would previously run out of RAM after ~30 minutes. Wohoooo! The token transform framework rework really paid off. * 303 parser tests are passing in the new record time of 5.5 seconds. Two more tests are passing since these tests expect the day of the week to be Thursday. Won't be the case tomorrow. Change-Id: I56e850838476b546df10c6a239c8c9e29a1a3136	2012-04-26 18:18:08 +02:00
Gabriel Wicke	8ff810659a	Rename text/wiki and tokens/wiki to text/x-mediawiki and similar Change-Id: I70113629f4633685cd6db3914303a15e4c79a50a	2012-04-25 20:19:43 +02:00
Gabriel Wicke	814511f523	Remove dead parser pipeline code Change-Id: I802f1798d5163c1ce82d648f739c2e79b17eda41	2012-04-25 17:12:32 +02:00
Gabriel Wicke	8368e17d6a	Biggish token transform system refactoring * All parser pipelines including tokenizer and DOM stuff are now constructed from a 'recipe' data structure in a ParserPipelineFactory. * All sub-pipelines of these can now be cached * Event registrations to a pipeline are directly forwarded to the last pipeline member to save relatively expensive event forwarding. * Some APIs for on-demand expansion / format conversion of parameters from parser functions are added: param.to('tokens/expanded', cb) param.to('text/wiki', cb) (this does not work yet) All parameters are additionally wrapped into a Param object that provides method for positional parameter naming (.named() or conversion to a dict (.dict()). * The async token transform manager is now separated from a frame object, with the frame holding arguments, an on-demand expansion method and loop checks. * Only keys of template parameters are now expanded. Parser functions or template arguments trigger an expansion on-demand. This (unsurprisingly) makes a big performance difference with typical switch-heavy template systems. * Return values from async transforms are no longer used in favor of plain callbacks. This saves the complication of having to maintain two code paths. A trick in transformTokens still avoids the construction of unneeded TokenAccumulators. * The results of template expansions are no longer buffered. * 301 parser tests are passing Known issues: * Cosmetic cleanup remains to do * Some parser functions do not support async expansions yet, and need to be modified. Change-Id: I1a7690baffbe8141cadf67270904a1b2e1df879a	2012-04-25 16:51:36 +02:00
Demon	28e44b1d0f	Merge "Add --wikidom flag to parse.js"	2012-04-25 14:18:59 +00:00
Catrope	47969e20a1	Add --wikidom flag to parse.js Also remove unused import of DOMConverter Change-Id: I1eabe6bf9935970c1f049681b52e867a510ea77a	2012-04-23 15:01:12 -07:00
Gabriel Wicke	e2ca8c24c7	Delay some token duplication until actual mutation happens This is a bit better than cloning tokens wholesale, but not by much. There is a lot of potential for much better per-token caching with reduced token cloning. Need to map out all dependencies besides token attributes expanded from template parameters or other scoped state. Even if tokens themselves don't need transformation, they might still need to be considered for other token transformers, so simply keeping the final rank won't quite work even if the token itself is fully transformed. As a minimum, a shallow clone would need to be made and the rank reset (as in env.cloneTokens). Change-Id: I4329113bb21750bae9a635229ed1b08da75dc614	2012-04-18 17:53:04 +02:00
Gabriel Wicke	bf84638bc0	Add tokenizer cache and clone token state on mutation * Added an LRU cache (using the lru-cache node module) for tokenizer output * Mutation of nested attributes now replaces the containers. A shallow copy of tokens is sufficient to isolate token transformations. Need to investigate if we can actually get away without isolation and re-transformation for most ordinary tokens. Change-Id: I9136b1d7a1fbcc538183a319d4ecaa290d616fdf	2012-04-18 14:40:47 +02:00
Gabriel Wicke	aaca5eac7d	More tweaks: safesubst and image options * Ignore safesubst for now * Remove an unneeded whitelist entry * Make sure the caption is not lost for thumbs (fix to last commit) and remove debug print Change-Id: I243584ed0838cf7c3b4110fe9cdf869272477312	2012-04-17 11:02:52 +02:00
Gabriel Wicke	7fe5a86b60	Improve image option handling Change-Id: If1376766f41ff1288bfe2af19beecd3299c09a01	2012-04-17 10:46:20 +02:00
Gabriel Wicke	afa5b95bc1	Don't work around html5 library tokenizer attribute reordering The HTML5 parser we are using to normalize expected HTML output in parserTests reverses the order of attributes (see https://github.com/aredridel/html5/pull/53 for the fix). Remove whitelist entries concerned with this and use the proper order in external image attributes. Change-Id: If1868cae05396a150757c85a20473ab756cbcd97	2012-04-16 17:09:06 +02:00
Gabriel Wicke	c688b039de	Collected tweaks * less verbose logging in noinclude processing and template expansion * Give priority to the processing of templates transcluded from transclusions to get closer to depth-first processing. This serves to minimize memory usage from queued-up tokens. * Increase the maximum outstanding requests per template retrieval. 10000 amazingly proved too low a limit on some big pages. * Only process a single template request callback at a time for now * Add a debug print in the treebuilder wrapper * Don't treat multiple comments on a single line as a single comment to match the PHP parser's behavior Change-Id: I9a86b6d7bec3b9e1f17415daf1bf74170240721a	2012-04-16 15:47:03 +02:00
Gabriel Wicke	1bf8a9e5e1	Small tweak in comment about onlyinclude forcing buffered expansion Change-Id: Ib324e24c51c97e07e6737bf23f16db07043b69ab	2012-04-16 15:42:29 +02:00
Gabriel Wicke	efd4c026ea	Disallow < and > in external link urls Change-Id: Id865c3d46b33b182bb5b244e77e815c0afd7fa49	2012-04-16 15:36:56 +02:00
Gabriel Wicke	25523f4cf0	Implement urlencode parser function Change-Id: I4fca3134c9c3eb9a7d6f3360be6de054fb47477c	2012-04-16 14:54:03 +02:00
Gabriel Wicke	421ef44621	Match the empty string as whitespace too Change-Id: I1a8ed882021804f62855b9db4368270feebbfc16	2012-04-16 14:48:39 +02:00
Gabriel Wicke	08453199df	Increase number of callbacks per reactor iteration to 4 In experiments this dropped the memory consumption further, and reduces the queuing overhead in the node reactor. Change-Id: I9409b6ca863b43b7557663bbec9572365059c078	2012-04-13 14:50:36 +02:00
Gabriel Wicke	06ae53fdfe	Drastically reduce memory usage for template-heavy pages Only call back a few callbacks per reactor iteration from the template fetch request queue. This changes the expansion pattern from a (memory intensive) breadth-first expansion to something quite close to depth-first expansion. Additionally, retrieved pages are quickly added to the page cache so that a lot of request queuing is avoided in favor of synchronous expansion from the cache. On pages like Barack Obama that previously ran out of memory after consuming node's 1.6G heap limit, expansion now runs in relatively constant 100-300M resident (so far, still running). Change-Id: Ie34a1eeff00d868416de45ef8d289898258f560c	2012-04-13 14:31:03 +02:00
Gabriel Wicke	df050e4481	Convert external link syntax stops to stack Eat unbalanced external link parts within template parameters. This does not produce the same output as the PHP parser (try echo '{{YouTube}}' | node parse.js), but preserves a level of sanity. Need to check how common this is for external links. If it is rare enough, moving the ']' after the parser function manually would fix the rendering for the YouTube case. Change-Id: I597d808efff36baa22191e7946a0061cc31120e8	2012-04-13 11:08:42 +02:00
Gabriel Wicke	5bb2d96869	Token stream transform improvements * add fast paths for empty arguments etc * cache attribute token transform pipelines * fix bugs in TokenCollector and NoIncludeOnly handler, and improve its efficiency by only registering for 'end' tokens on demand * Remove empty reset methods from a few handlers * Add a simple 'ap' debug print function that makes it easy to only print some debug prints by temporarily changing 'dp' to 'ap' * Improvements and bug fixes in AttributeExpander Change-Id: Ie69729c8f62d48bba922712e44ebce484c621c50	2012-04-12 15:42:09 +02:00
Gabriel Wicke	3124deca2c	Track inclusion status on CachedTokenPipeline Non-include attribute pipelines are not cached for now. Adding separate caching for non-include attribute pipelines is very likely worth it, but deferred for now. Change-Id: I13f949d9f0a04536f9ccfcb73a2be69c5c08be01	2012-04-12 10:21:50 +02:00
Gabriel Wicke	efa41370d3	Set inclusion flag for attribute transform managers too Change-Id: Ice15d8fde6de4a3e850a028db9917e976218fc43	2012-04-11 21:55:52 +02:00
Gabriel Wicke	bff43938f6	Support noinclude/includeonly/onlyinclude in attributes Fun test case: {| |-<includeonly> foo </includeonly> |Hello |} Change-Id: I353bb287d3967ade549fbcb4ae64511a1f1f7e36	2012-04-11 17:37:25 +02:00
Gabriel Wicke	9ae572cca0	Fixes to template expansion / token transform managers, 296 tests passing. * Convert isNoInclude logic to positive isInclude throughout and set it properly on attribute pipelines. Also don't cache non-include pipelines. * Add a --pagename parameter to parse.js, which sets the page name in the environment. This is then returned by {{PAGENAME}}. Not the final solution, but useful for taxobox testing as taxons are selected based on PAGENAME. * Add rudimentary pagenamebase parser function Change-Id: If9c0be4c255200d0f2a30f02e5619437b4fd8f12	2012-04-11 16:34:27 +02:00
Gabriel Wicke	bbae66cd69	Nominate more HTML5 sectioning and heading elements for block-level treatment Block-level (in HTML4 lingo) elements are not wrapped into paragraphs. Change-Id: I4a01c9721be30b526172952915d528dea79e2f30	2012-04-11 12:53:49 +02:00
Gabriel Wicke	5a33099875	Improve template tokenization in template arguments Taxobox tables now render pretty much correctly. Change-Id: I5a0564138ff0c688d8a5a69b7867646fd3763946	2012-04-10 16:40:49 +02:00
Gabriel Wicke	577ef1f916	Add some support for alignment of thumbs Change-Id: I70570f48423628f7a87a35647698a66a5f413088	2012-04-10 12:11:59 +02:00
Gabriel Wicke	403be4af42	Add basic thumb rendering support * DOM based on Wikia's thumb output: HTML5, clean caption without magnify icon. * basic RDFa annotations, but most options additionally in data-mw object- might want to move more (or all?) of those into RDFa data using meta tags. * no support yet for framed or other formats, image scaling etc * also tweaked some config options in the environment Change-Id: Ie461fcdce060cfc2dec65cc057709ae650ef3368	2012-04-09 23:04:26 +02:00
Gabriel Wicke	dbdd320348	Improve parameter tokenization support especially for table rows Change-Id: I961d69e228b96adc69ea9acb3733d13f5898602d	2012-04-05 16:00:26 +02:00
Gabriel Wicke	7a35e5db16	Remove behaviors var in tokenizer, now handled in token handler Change-Id: I68eeff3f05ce29c13e347c2cd7ea6519e58b0e03	2012-04-04 21:17:29 +02:00
GWicke	da60861be8	Merge ""magic words" are tokenized and used to set parser.environment flags"	2012-04-04 19:11:03 +00:00
Adam Wight	a85ed36efa	"magic words" are tokenized and used to set parser.environment flags behavior switches are converted to tokens which set parser.environment flags during the async transformation stage. The next step would be for handlers in the sync23 stage to generate the TOC, section edit links, and so on according to these directives. No tests written, because the switches are consumed and don't appear in rendered html. We can test the magic word layout controls individually, once they're implemented. Another small change was to store option flags directly in the environment object, not that it makes much difference. Change-Id: I863fbf4be1a17d2f6c31158298dd301f19ae1137	2012-04-04 11:25:29 -07:00
Adam Wight	b234edba88	As much as I have loved writing Makefiles... I've replaced its functionality with package.json, mostly so we can avoid non-node dependencies. This is one of the recommended practices. We should consider moving tests/parser into modules/parser/tests, other node projects keep all module code in one directory. Explained in the README how to use npm to load the dependencies and run tests. Too bad about NODE_PATH... Don't try to find parserTests.txt in assorted places--if it isn't present, fetch from gerrit. You can symlink from core if you're developing on both parsers, and the fetch script will not overwrite. Use __dirname in parserTests.js to allow the script to run independent of current working directory. Change-Id: I4c8b884e91f4fdeae385c7697aff768bdd199dd5	2012-04-04 11:02:58 -07:00
Gabriel Wicke	e3a745a024	Improvements for template / -argument precedence; support for empty params Change-Id: Id0894ccbedfa47fa3658817ca65119a2af76be3e	2012-04-04 16:29:47 +02:00
Gabriel Wicke	2037215185	Disallow '[' in generic attribute names This avoids interpreting something like ! [[foo|bar]] as <th [[foo=''>bar]]</th>. Change-Id: If59708fa90eb0117a15b2b6446890d1ae19a857c	2012-04-04 14:31:11 +02:00
Gabriel Wicke	f588d2a7aa	Fix table headings in template parameters Change-Id: Icdfc5655968fc845230ad7638124309d6b8c1ada	2012-04-04 12:54:34 +02:00
Gabriel Wicke	b8d980a229	Don't eat newline / space in template parameters ..so that block_lines can match. Change-Id: I4c464dc44249f40e4aa280df35fb726bfce3a745	2012-04-04 11:22:31 +02:00
Trevor Parscal	606d97da99	Merge "Add HTML DOM -> linear model converter"	2012-04-03 17:52:55 +00:00
Gabriel Wicke	47de122a95	Improve support for table / template interaction Match pairs of {{!}} or | for template productions, but not a mix of the two. Example: {{#if:1|{{!}}- {{!}} {{#if:1|style="color: red"{{!}}|}} }} Note that the style parameter ends up as the key of an empty-valued attribute on the table cell currently. Change-Id: I5f9357dd1645ef97b0af89f32e8d92ae49218c72	2012-04-03 18:48:35 +02:00
Gabriel Wicke	0fe062fbe1	JSHint cleanups and parser function argument handling improvements Parser functions which only accept positional arguments now return both the key and value of arguments. Complete attributes (key and value) for templates and the like from parser functions are not yet supported though. Change-Id: I3f81bb35acd27186222ce6d5217e820042527c01	2012-04-03 18:10:48 +02:00
GWicke	b7db83e09a	Merge "Magic links and behavior switch tokenization by Ori Livneh"	2012-04-02 16:43:13 +00:00
Gabriel Wicke	f662690d02	Shorten data-mw-rt to data-mw and clean up whitelist Instead of a proliferation of data-mw-* attributes, it should be easier to stash all private / non-semantic round-trip information in a JSON object stored in data-mw. Change-Id: Id200a6a8789fa152f29ea530e5a24b6ee7b4b285	2012-04-02 18:12:49 +02:00
Gabriel Wicke	5248fd31e8	Magic links and behavior switch tokenization by Ori Livneh Commit first patch by Ori, lets 288 parser tests pass. Yay! Change-Id: Iac8c3d1ad1984900350b20f7e725c40618a1e8ba	2012-04-02 17:31:34 +02:00
Catrope	8dc994f037	Add HTML DOM -> linear model converter Also, in ParserPipeline: * Import the LM converter and expose it through getLinearModel() * Fix getWikiDom() to actually work (still unused) In parse.js: * Add --help option that prints usage information (was unreachable) * Add --linearmodel option to output linear model JSON instead of HTML Change-Id: Ic534e03ff40a7c9117bb63f0c635a4213d5e3406	2012-03-29 12:47:14 -07:00
Gabriel Wicke	5ef2074251	Enable support for block-level wiki constructs in template arguments. This gets a bit closer to supporting table fragments passed through template arguments. Next, we'll need a way to indicate start-of-line position to enable sol block-levels in template parameters. Example: {| {{#if: true|{{!}}Table cell|}} |}	2012-03-15 11:43:49 +00:00
Gabriel Wicke	7e22020398	Convert syntactical break flags for templates from counters to the stack variant to fix the precedence for {{!}} (break on these inside table content, but not in template options within tables).	2012-03-14 16:30:59 +00:00
Gabriel Wicke	77a61dd687	Improve support for {{!}}, and don't produce a pre for indented tables.	2012-03-14 10:58:11 +00:00
Gabriel Wicke	835914b2de	Support {{=}}.	2012-03-14 09:07:01 +00:00
Gabriel Wicke	2195c31abf	Move link types to data-mw-rt, and support some more template tokenization edge cases. For example, the PHP parser treats | foo | = bar | as | foo = bar |, believe it or not ;)	2012-03-13 12:32:31 +00:00
Gabriel Wicke	4cd8b302ac	Improved template tokenization. The parser can now template-expand [[:en:Barack Obama]] without exceeding 1.7GB of memory (which is the node limit).	2012-03-12 17:31:45 +00:00
Gabriel Wicke	3c5fe2523c	Tolerate more newlines and spaces in templates, and support templates and comments in urls.	2012-03-12 14:31:06 +00:00
Gabriel Wicke	ae4ab7a39c	Refactor syntactic stops into an object and add a stack variant for option values.	2012-03-12 13:08:43 +00:00
Roan Kattouw	29f416937e	Fix some usages of splice.apply in the data model to use ve.batchedSplice(). Added FIXME comments for occurrences outside of DM	2012-03-10 00:31:28 +00:00
Gabriel Wicke	ffc9383096	Temporary fix for template tokenization, especially needed for [[Template:Cite core]].	2012-03-08 14:24:04 +00:00
Gabriel Wicke	39017dd769	Percent-encode spaces in URLs, so that they are recognized as valid URLs later on.	2012-03-08 11:53:15 +00:00
Gabriel Wicke	7518db8197	A few fixes to parser functions and template expansion. Trim whitespace off template arguments, let the last duplicate key win and fake pagenamee slightly better.	2012-03-08 11:44:37 +00:00
Gabriel Wicke	51023feaa4	Improvements for image option handling.	2012-03-08 10:03:22 +00:00
Gabriel Wicke	b1e131d568	A bit more documentation and naming cleanup in the tokenizer wrapper.	2012-03-08 09:00:45 +00:00
Gabriel Wicke	f02ff95aa3	Token representation clean-up. Now all tokens are differentiated using constructors instead of type attributes.	2012-03-07 20:06:54 +00:00
Gabriel Wicke	f157093a41	Delegate responsibility for resetting the token rank to transforms, if full re-processing in a phase is wanted. By default, after a token type change or the return of multiple tokens only the remaining transforms with higher ranks are applied. Updated a few comments as well.	2012-03-07 19:29:53 +00:00
Gabriel Wicke	1f8c43b9e2	A few minor documentation updates.	2012-03-07 18:42:26 +00:00
Gabriel Wicke	5f618103d7	Set allTokensProcessed flag for async callbacks from the template expander.	2012-03-07 17:36:33 +00:00
Gabriel Wicke	e5a1116817	Start re-transformation as soon as possible in TokenAccumulator._returnTokens to maximize IO concurrency. Signal that all tokens are fully transformed to callbacks called from TokenAccumulator._returnTokens. The result should be a single re-transformation when entering the callback chain, and only if the transform does not signal that it took care of full transformation itself. Template expansion would set this flag, as the nested transform pipeline processes all tokens to the end of phase async12.	2012-03-07 16:29:06 +00:00
Gabriel Wicke	656524dbbc	Fixes for multi-transformer expansion in AsyncTransformManager. Added argument to callback which lets transforms indicate if their returned tokens are fully processed for their phase. If not, the callback re-processes them so that any remaining transforms are applied.	2012-03-07 15:39:18 +00:00
Gabriel Wicke	af03eb4f29	Improve generic attribute expansion before external link processing, and make wgUploadPath configurable. Also change the hard-coded fall-back image sizes to sensible defaults. This breaks three parser tests until image size retrieval from the wiki is implemented.	2012-03-06 18:02:35 +00:00
Gabriel Wicke	227103e12c	Accept empty table cell attribute sections, and consider percent-encoded %2525 valid. 270 tests passing.	2012-03-06 14:32:45 +00:00
Gabriel Wicke	2efcd3cd57	Reworked percent encoding handling for URIs to get closer to the 'url construction' part of the HTML5 spec: http://www.whatwg.org/specs/web-apps/current-work/multipage/urls.html#url-manipulation-and-creation Removed a few whitelisted test cases that are now passing directly. The encoding canonicalization could also be moved to the Sanitizer. Doing this early in token stream processing however has the advantage of providing further transformations uniform data to work with. We could even consider to move this even further into the tokenizer.	2012-03-06 13:49:37 +00:00
Gabriel Wicke	19fe9726a2	Fix invalid external link representation. 268 tests passing.	2012-03-05 18:06:29 +00:00
Gabriel Wicke	a9ebc1d986	Support external images wrapped in a clickable link using bracketed external link syntax. 265 tests passing.	2012-03-05 16:23:00 +00:00
Gabriel Wicke	7f7202e89c	A few improvements to external link and image handling. 264 tests passing.	2012-03-05 15:34:27 +00:00
Gabriel Wicke	7b0c807710	Change wikilink tokenization strategy to split on pipes. This makes it possible to support template / template argument expansion in image options, and causes little trouble for wikilinks. Non-image wikilinks with multiple text pipes are quite rare in the dumps, and concatenating description tokens with a plain '|' is quite easy. 261 parser tests passing.	2012-03-05 12:00:38 +00:00
Gabriel Wicke	3e6f1b6bea	Use some options primitively.	2012-03-02 14:19:33 +00:00
Gabriel Wicke	167dbdb0fa	Parse image options.	2012-03-02 13:36:37 +00:00
Gabriel Wicke	8b7ba9051b	Add productions for image option tokenization, and prepare to call those from the LinkHandler token stream transformer.	2012-03-01 18:07:20 +00:00
Gabriel Wicke	b1a7119a46	Hack up some rudimentary image rendering. Using jshashes for the md5, and a few hard-coded image sizes ;) 262 tests passing.	2012-03-01 13:51:53 +00:00
Gabriel Wicke	d4faf9eaf4	More work on wiki link rendering and general wiki title / namespace functionality.	2012-03-01 12:47:05 +00:00
Gabriel Wicke	4b9bd45b82	Start to move wikilink expansion to a separate async token transformer.	2012-02-29 13:56:29 +00:00
Gabriel Wicke	b8bb503199	Actually commit onlyinclude, as already announced in r112592.	2012-02-28 13:24:35 +00:00
Gabriel Wicke	3227903d48	Follow-up to r112116, accidentally committed from subdirectory.	2012-02-22 16:41:01 +00:00
Gabriel Wicke	3568dfee14	Add some support for functionhooks in test parser and parserTests.js, and tweak a few parser functions.	2012-02-22 15:59:11 +00:00
Gabriel Wicke	d7da324272	Basic fall-through support for #switch parser function	2012-02-22 14:57:50 +00:00
Gabriel Wicke	491ad5ffef	Cleanup and commenting.	2012-02-22 13:13:18 +00:00
Gabriel Wicke	9b3313d923	Speed up flatten slightly by avoiding garbage for already flat arrays. Also, use simple string concatenation instead of arrays as the strings tend to be few and short.	2012-02-22 11:25:44 +00:00
Gabriel Wicke	8dde1f77b4	Reduce debug print overhead, roughly a 10% speed-up on parserTests.	2012-02-21 18:49:43 +00:00
Gabriel Wicke	058c4213a4	Remove some more unused code and tidy up some more.	2012-02-21 18:26:40 +00:00
Gabriel Wicke	416126c041	Fix the bug in the inline_breaks replacement, and write another switch-based version, which is slightly faster and shorter. Performance is improved by about 5% for parserTests.	2012-02-21 17:57:30 +00:00
Gabriel Wicke	18a04f7581	Tidy up and comment the tokenizer a bit more. Start to move code into mediawiki.tokenizer.js module, and pass a reference to parse(). Faster inline_breaks production using a JS function which seems to be generally correct, but still breaks five tests when enabled. Seems to be some weird interaction with peg.js, possibly something to do with caching.	2012-02-21 17:21:42 +00:00
Gabriel Wicke	8718bd65bc	Add list of HTML5 and deprecated HTML3/4 elements in preparation for end-of-potential-extension rules; Support indented tag-wrapped pre blocks.	2012-02-21 14:44:56 +00:00
Gabriel Wicke	ffec77273a	Comment and minor code tweaks.	2012-02-21 11:24:20 +00:00
au	ea15bffb27	Revert "* Always sort attributes (+1 test pass)." This reverts commit 45ca281da8eef8030bdd1986418cb914fc9a717c.	2012-02-20 22:26:12 +00:00
Gabriel Wicke	5806705733	Push transformer setup a bit further into the attribute pipeline.	2012-02-20 12:56:00 +00:00
Gabriel Wicke	8eddb4ec6b	Add some comments to the Sanitizer	2012-02-20 11:14:53 +00:00
Gabriel Wicke	71e95bd54b	Set up token stream transformers from a map of phases per input content type. Not yet applied to attribute pipeline creation. 249 tests passing.	2012-02-20 11:07:21 +00:00
au	9c55f5e8b7	* Always sort attributes (+1 test pass). The performance impact for .sort is quite small (12.079s => 12.158s) and Sanitizer is probably one of the more accessible places to do this.	2012-02-18 21:01:07 +00:00

348 commits