wikimedia/mediawiki-extensions-VisualEditor

mirror of https://gerrit.wikimedia.org/r/mediawiki/extensions/VisualEditor synced 2024-11-15 18:39:52 +00:00

Author	SHA1	Message	Date
Gabriel Wicke	2fa5baabbb	Make it easier to configure the default wiki, and add support for mediawiki.org * mw:Foo now loads pages from mediawiki.org * The default prefix still is 'en'. You can switch this to 'mw' in ParserService.js. Change-Id: I1208667e6114bd711b7988a8b3adb32ffab70969	2012-06-07 11:50:40 +02:00
Subramanya Sastry	b665a2558f	Fixed bugs handling/transforming quotes - Three bugs that were messing up quote transformations. - Now, the following cases are handled properly: * ''foo''' * '''foo'' * ''foo'''' * ''''foo'' These tests (and other quote tests) have to be added to core parser tests file. - One more parser test green. Change-Id: I4f93e8910639f546bfc9304becab17d26d5529de	2012-06-07 01:37:45 -05:00
Translation updater bot	42daebe50a	Localisation updates from http://translatewiki.net . Change-Id: Ieb79571c97e1158414ecccbc8d5e984382f2cce5	2012-06-06 20:19:14 +00:00
Gabriel Wicke	413df0c471	Strip \r from form input- we normalize everything to Unix Change-Id: I5cd255e1a7ab9958f120fad408362e6f709e4b91	2012-06-06 19:26:29 +02:00
Gabriel Wicke	47204c4ca0	Use diffChars instead of diffWords, as the latter misses some changes The improved merge algorithm now makes diffChars output more palatable. Things could still be improved by collecting single-character 'neutral' changes in a block of 'add' changes and converting them to adds / removes. Change-Id: I8439e8acab4360c08b89d9ce8a6b8523e7a0a210	2012-06-06 18:36:28 +02:00
Subramanya Sastry	f8221b128b	Used a more robust heuristic for merging consecutive diffs - Check if consecutive diffs are separated by 1 word in addition to max 3 chars. This takes care of diffs introduced by template diffs separated by the template name and creates a clean single diff. Change-Id: I9181d2ed9a07bee6ca5d5ebd6ddea84f7e2cecac	2012-06-06 11:01:47 -05:00
Gabriel Wicke	2bc066b42d	Up the diff merge size heuristic a bit and always use the same algorithm Change-Id: I707c8a55ed1758cdd591d2fc95e03a360c8e76d1	2012-06-06 17:46:25 +02:00
Gabriel Wicke	bc1a77a812	Make modified newlines visible by replacing empty lines with a space Change-Id: If7b811245e0d01a7a147ab54c3801fc1754730a9	2012-06-06 17:11:29 +02:00
Gabriel Wicke	1876d785a7	Swap ins/del in the diff Change-Id: Id336d713d1767a4b7859b158f2c2ddf9adc11cfb	2012-06-06 16:02:54 +02:00
Gabriel Wicke	350e700d8f	Add core-upgrade Change-Id: I5ad0955e8272d376f009f89461bed310978b25e4	2012-06-06 15:58:17 +02:00
Gabriel Wicke	d0a0454ada	Merge "Improve the handling of newlines for round-tripping"	2012-06-06 13:54:04 +00:00
Gabriel Wicke	aee35f627d	Merge "Update patched html5 library to version 0.3.8"	2012-06-06 13:53:37 +00:00
Gabriel Wicke	a146fcb8ad	Improve the handling of newlines for round-tripping An improvement, but there still are some extra newlines inserted after paragraphs. Example input: ------- Foo: {| |foo |} ------- Extra newlines are inserted after the Foo: and the foo in the table. They are not fed as tokens or text to the tree builder, so there is likely a bug in the html5 library or JSDom. Change-Id: I83eb6180e3cd1c4e7f9b15b31d339e1d32bccd3f	2012-06-06 10:17:03 +02:00
Gabriel Wicke	59fc634cce	Update patched html5 library to version 0.3.8 Change-Id: I321d9a58ea1af33842a606fc8706938093a8330f	2012-06-06 10:17:03 +02:00
Subramanya Sastry	bff08b799e	Improvement to the refineDiffs function to improve diff quality. * Attempt to accumulate consecutive add-delete pairs with "short text" separating the pairs. This is equivalent to the <b><i> ... </i></b> minimization to expand range of <b> and <i> tags, except there is no optimal solution except as determined by heuristics ("short text": <= 2 chars). Change-Id: I408e318c315eba18aac4051ed84d77e3e092d497	2012-06-06 00:08:00 -05:00
Subramanya Sastry	fe6f289486	Merge changes I5d98c704,Ib8d3de75 * changes: A few tweaks to link round-tripping Use word diff if --color is enabled	2012-06-05 16:04:23 +00:00
Subramanya Sastry	b095db4303	Simpler implementation of flatten. * Possibly more efficient under heavy GC load -- untested. * No change in time and memory use for single file parsing. Change-Id: Id2f3f65cc0e5f38ed968bbda60b97e46523e700e	2012-06-05 10:47:46 -05:00
Gabriel Wicke	dc3168cf6d	A few tweaks to link round-tripping * Moved the tail attribute to the second attribute (a bit cleaner) * Disallowed newlines in the tail production * Improved the selection of round-tripped href vs. generated content vs. href in the serializer * renamed state.linkTail to state.dropTail Change-Id: I5d98c704b6ea566011e22237786f8da17548570f	2012-06-05 17:26:27 +02:00
Gabriel Wicke	0f9d939b00	Use word diff if --color is enabled Change-Id: Ib8d3de75ac306974abfdaca22bfc7b69bc62891d	2012-06-05 16:10:13 +02:00
Gabriel Wicke	d16032ae9a	Track html syntax in block_tag production Change-Id: If560523644f007485809762f12216e08fb3c3ed3	2012-06-05 12:39:56 +02:00
Gabriel Wicke	c1d8270bdb	Fix wgScriptPath in round-trip mode without interwiki Change-Id: I7cc80b7be1afffc586a2ea45d21303e9ba07c0d4	2012-06-05 12:11:45 +02:00
Gabriel Wicke	3346aed86e	Support interwiki links, and some cleanup Change-Id: I205c53a03f5230e3ef9100487f4934f97bdc179a	2012-06-05 12:05:33 +02:00
Gabriel Wicke	cc96ff4f5e	Very basic interwiki support Page titles with a wikipedia interwiki prefix now load the page from the corresponding Wikipedia. Links in a page then stay within the given language. Note that Parsoid currently makes no effort to recognize localized namespaces, so it won't render media files, categories etc correctly. Change-Id: I7bc4102e81a402772ea23231170734d580ea15b9	2012-06-05 11:19:58 +02:00
Gabriel Wicke	92f753a365	Pre and link target improvements * Don't explicitly add the newline in the pre, as we preserve newline tokens now. This avoids doubling of newlines when round-tripping. * Use the sHref attribute even if the href contains spaces. Change-Id: I8bec8fbfd6a7836bf2e5eec20869a0edd95c93b6	2012-06-04 14:03:05 +02:00
Gabriel Wicke	ee2ddbd3cb	Fix list handler issues Lists interrupted by non-empty lines would not close the list properly. Register for any token instead of just for newlines and close the list if no listItem follows the newline. Change-Id: I1743901e3db541bbeda78d17707db943e6ceb9b9	2012-06-04 13:38:43 +02:00
Gabriel Wicke	f821eac102	Optionally round-trip sHref in data-mw If the href would not denormalize, add a copy of the original href in data-mw and use it to preserve non-conventional capitalization etc. Change-Id: Ifef50eec7343b0e6b0ba66b6d19a8a3e8c9f8001	2012-06-04 12:28:05 +02:00
Gabriel Wicke	e0809209ec	Don't set the data-mw attribute if the object is actually empty. Change-Id: I984f1b44bba67d7a9f1a709738d14c0ee02f69a9	2012-06-04 12:26:03 +02:00
Gabriel Wicke	2774e5aa6c	Actually replace all underscores in wikilink target Change-Id: I633f8d6e4f639aff90fd456600376b7c6515fd50	2012-06-04 11:48:59 +02:00
Gabriel Wicke	3f2c72f920	Fix padleft / padright (mis)use as substr Change-Id: I0645e11c8ef8b550ad35300d1904788940fc748a	2012-06-04 11:30:45 +02:00
Gabriel Wicke	0eabd2c67e	Add round-trip form and split out rt diffing Change-Id: I3bc8ad7f273937ce6c767b8d7bbccdc86cbd93b4	2012-06-04 10:49:59 +02:00
Gabriel Wicke	99c98d6c56	Diff refinement fixes Change-Id: I11c69de0fdcd636ccd11cd0b6cb16c5acdb188b3	2012-06-04 10:16:05 +02:00
Gabriel Wicke	d2602c47a6	Switch back to word-based diff The char-based diff looked good in some pages, but yielded terrible results in others. The word-based algo is more consistent overall. Change-Id: I7f2d40315ad96df037c2d9a1d50739e3d21b6c81	2012-06-04 00:02:49 +02:00
Gabriel Wicke	4533c274ca	Fix a crasher in the serializer A tail containing regexp syntax (a ? in [[:en:Main Page]]) would crash the serializer. Use substr instead. Change-Id: I8519aec9c07dfe31893d676b1c936a42d2af74a0	2012-06-04 00:00:54 +02:00
Gabriel Wicke	d01581c380	Create a 'refinement diff' algorithm The word or char-based algorithm does not scale well beyond 5k chars or so. We now perform a line-based diff and then continue to diff the line differences using the char-based algorithm. This gives a char-based diff even for bigger inputs. Change-Id: Iec87ca56540060e4df2859ba54c992e7ff5cfe10	2012-06-03 23:46:57 +02:00
Gabriel Wicke	b11b8d8a6b	Revert to line diff, word diff explodes on some pages Change-Id: Ic338498b47bb6b6c98fa6280f44464cd70a48b1b	2012-06-03 11:39:03 +02:00
Gabriel Wicke	b5e067e086	Some more web service tweaks * Stay in round-trip mode in HTML DOM output * Return DOM, wikitext and diff as soon as they are available Change-Id: I7f8f44cfe8eed63a521d1318d116c22232cb6b1b	2012-06-03 11:04:40 +02:00
Gabriel Wicke	7c18891504	Snazzy html word diff for roundtrip view Also show the HTML DOM, Wikitext output and diff. Change-Id: Ibe744fbc895239f4e48f6e0e2f2b2f345c0845bd	2012-06-03 01:36:56 +02:00
Gabriel Wicke	4cf74497b7	Update web service start page documentation Change-Id: I38efc5a9d5b919c6168cf97d0efbae9db967e351	2012-06-02 17:17:37 +02:00
Gabriel Wicke	31522d3d49	Add ApiRequest Change-Id: I5f2a1cb65223a68f10bc63903000248efca05586	2012-06-02 16:52:51 +02:00
Gabriel Wicke	7c7ddd22a7	Retrieve content from the main namespace instead of templates Change-Id: Id917fa617d6fba1e1b290b2ed20c24aed24d39d2	2012-06-02 16:48:00 +02:00
Gabriel Wicke	63abd57fc8	Improve newline-before-paragraph round-tripping support Change-Id: I9176a97f9695018650d9a63b89514c07e0d6be90	2012-06-02 16:39:33 +02:00
Gabriel Wicke	d3975a8d03	Very basic round-trip test mode for the API Returns both the resulting wikitext and the diff with the original input. Change-Id: Iad25039beb054a84e1ad51ffa9fee924db49c60b	2012-06-02 16:20:54 +02:00
Gabriel Wicke	74135b295f	Some more switch fixes Change-Id: If1a6086348c45a73a941bc8e6728ef75d002be50	2012-06-02 15:04:20 +02:00
Subramanya Sastry	8f216af2f5	Handle link tails properly. - Added a tail json attribute for wikiLinks - During serialization, this attribute is used to strip the tail from the link target and render it after the link [[hen]]s ==> <a ... data-mw="{gc:1, tail: 's'}" ...>hens</a> ==> [[hen]]s - 2 more roundtrip tests green Change-Id: I84f3dabaf0271f7a67641a00148467daa8310eb0	2012-06-01 23:41:10 -05:00
Subramanya Sastry	413fc5e043	Fixed bug serializing wikilinks with implicit link text. * Simple fix but greens 10 more roundtrip tests. Change-Id: I7f82d788a10bd83e0e3215568c2168081c332c50	2012-06-01 17:25:21 -05:00
Gabriel Wicke	16219ddc6d	Fix up #switch a bit * Re-establish the value-only default * Fix value expansion Change-Id: I32e62789b25bbe17a74c564e41e9101ad5528fb7	2012-06-01 22:15:43 +02:00
Gabriel Wicke	e2301813ed	Merge "Tokenizer backtracking cache bug fix and memory savings"	2012-06-01 12:06:00 +00:00
GWicke	befd223476	Merge "First pass implementing a general tag minimization routine"	2012-06-01 11:15:48 +00:00
Gabriel Wicke	ece2b0f810	Tokenizer backtracking cache bug fix and memory savings * The state of syntax stops is now properly included in the cache key for the tokenizer-internal backtracking cache. This fixes some mis-parses when re-parsing a bit of text with different flags. * Clear the backtracking cache after each toplevelblock. This drops the peak memory usage when expanding [[:en:Barack Obama]] from ~380M to ~110M. Change-Id: Icdb879cae5907e4595903dd6acba2e686e8c2e4b	2012-06-01 12:53:49 +02:00
Subramanya Sastry	1c80e2d7f0	First pass implementing a general tag minimization routine * This routine attempts to rewrite the DOM to maximize tag overlap and thus minimize tag uses. * This takes as input a set of tags which participate in the minimization. * Tested on the following example <b><i><u><s>BIUS</s></u></i></b><b><i><s>BIS</s></i></b><b><u><s>BUS</s></u></b><u><i>UI</i></u> with multiple combinations of the 2^4 possible variations of i,b,u,s tags: [], ['i','b','u','s'], ['i'], ['b','s'], ['i','b','u'] - But, I am not fully sure if this implements the right behavior when only a subset of inline tags are provided. Needs discussion and tweaking as necessary. * Also tested on a few others: <b>B</b><b><i>BI</i></b><b><i><u>BIU</u></i></b><b><i><u><s>BIUS</s></u></i></b> <s><i><b>SIB</s></i></b><s><i><u>SIU</u></i></s><i><u>IU</u></i><i>I</i> * The previous pairwise tag rewriting version fails on several of these examples, so this new version is a definite improvement. * No change in parserTests run (203 passing before and after). * Possible improvements that could/should be undertaken: - get rid of useless/idempotent add/remove of nodes that don't change the DOM. - ensure that node attributes post-restructuring are correct. Change-Id: Ib4a8b39583fa96a2be880a77021ca81cefa06484	2012-05-31 12:10:28 -05:00

1312 commits