wikimedia/mediawiki-extensions-VisualEditor

mirror of https://gerrit.wikimedia.org/r/mediawiki/extensions/VisualEditor synced 2024-09-27 04:06:51 +00:00

Author	SHA1	Message	Date
Gabriel Wicke	68c5a6efc6	Collect tokens in a tokencollector and use cb for processing This is work in progress, but committed for now so I can use it for links and tweak it while doing so. Change-Id: I757277f6efacda6d9432ca57542a957f597a98de	2012-07-18 16:18:38 -07:00
Gabriel Wicke	681b0d4d40	Merge "Rename data-mw into data-rt"	2012-07-17 17:20:37 +00:00
Subramanya Sastry	80d74e1c16	Changed add/remove/get transforms. * This code change is an attempt to address the FIXME about constant resorting of transformations in _getTransforms. This caches sorted transformations and selectively clears/updates the cache on add/remove. Change-Id: If24a807b84d494aa4e5597339039a5573a30905e	2012-07-17 12:03:48 -05:00
Gabriel Wicke	3172afb750	Rename data-mw into data-rt This hopefully makes it clearer that data-rt contains private round-trip info instead of semantically interesting data. Change-Id: I03b476ed112a4b627c9871ee3677c450f943429a	2012-07-16 12:10:08 -07:00
GWicke	235739e253	Merge "Bug fix and minor code cleanup."	2012-07-13 22:40:20 +00:00
Subramanya Sastry	141ce901d2	Bug fix and minor code cleanup. Change-Id: Ic446c8822bf1b8a859e045119782d7b8a40c5544	2012-07-13 17:39:30 -05:00
Gabriel Wicke	1e902fc050	Merge "Encapsulate token collection"	2012-07-13 21:08:52 +00:00
Gabriel Wicke	e329455d55	Encapsulate token collection * Arbitrary predicate support for the termination of collection mode * tokens as property of the collector instead of a state-global thing Change-Id: Ibcb342bc64a76fece9b04a760ea56c7878e67cad	2012-07-13 13:57:04 -07:00
GWicke	97bc6cd5d7	Merge "Serializer fixes"	2012-07-13 20:36:36 +00:00
Subramanya Sastry	f4c6ba8545	Serializer fixes * Fixed image serializer to deal with missing 'v' value in a k-v pair representing an image attribute. * Added fix to deal with bare <li>'s (without surrounding <ul> tags) NOTE: The second fix is required currently to deal with bugs in the parser as it deals with complex cases. But, in the future, we could deal with this in one of the following ways: (a) The serializer expects a well-formed DOM and all cleanup will be done as part of external tools/passes. (b) The serializer supports a small set of exceptional cases and bare list items could be one of them (c) The serializer ought to handle any DOM that is thrown at it. Yet to be resolved. Change-Id: Ib585e5c9f2a8a80854740ce0211bde705f9fd6f4	2012-07-13 15:33:09 -05:00
GWicke	a742ec5ffc	Merge "Fixed parser and serializer to deal with a 4+ length dash sequence."	2012-07-13 20:14:34 +00:00
Subramanya Sastry	49ed0d3adf	Fixed parser and serializer to deal with a 4+ length dash sequence. Change-Id: If7caaefec1ad55e7604712ef959ff0c843392adf	2012-07-13 15:12:09 -05:00
Subramanya Sastry	e529ae7e0e	Serializer fix for empty headings (BUG-33089) Change-Id: Ia7b018335ac9e31938052473fc47ce38443fdeb4	2012-07-05 16:50:48 -05:00
GWicke	46d6502ca5	Merge "Fix for Bug 37913"	2012-06-30 08:56:48 +00:00
Gabriel Wicke	1736e52bfb	Abstract out chunk emission from tokenizer Patch by Adam Wight, fixes bug #35377. Change-Id: I183baeed8dd78e7d3c775f44d62bec8e6f9fc608	2012-06-30 10:39:12 +02:00
Subramanya Sastry	166e7a75c9	Fix for Bug 37913 * Strips the first paragraph tag in a list item or table cell context if there are no attributes on it and stx:html is not set Change-Id: I74988645fe505c662f86488e32d0f11d464ffe41	2012-06-29 23:47:59 -05:00
Gabriel Wicke	9ddc863d89	Up entity name length limit even further There are some really long names in http://www.w3.org/2003/entities/2007xml/unicode.xml Change-Id: I0138c9610bb288cd8f29e3600b8a21f932e7bcd9	2012-06-29 23:38:10 +02:00
Gabriel Wicke	cf7f437966	Match named entities with up to eight chars The longest entries in http://en.wikipedia.org/wiki/List_of_XML_and_HTML_character_entity_references. Change-Id: I2c9f102fe6a905e339e12520d08c1b1b0a4002d8	2012-06-29 23:15:30 +02:00
Gabriel Wicke	370fb607c8	Insert separation between adjacent pres Change-Id: I55aa649b4e076cae32b3c970d6384ab2ed4cdd6c	2012-06-29 23:05:06 +02:00
Gabriel Wicke	6c8dfa26fa	Escape ampersands in entities from plain text DOM content Change-Id: I0826077cf48b67e38a525090be66411c38d7b65f	2012-06-29 23:02:21 +02:00
Subramanya Sastry	5874d9a5f1	More thumb roundtripping fixes. * Looks like I misled myself in commit 88fc91 -- that wikitext roundtripped perfectly because it went through the 'src' route because it was a thumbnail with an explicit image which doesn't go through renderThumb -- so, the serializer simply spit out the original 'src' string and hence perfect rt :). * More whitespace preserving fixes in LinkHandler. * Also changed resource value in the img tag to use the original filename rather than the normalized capitalized filename. * 2 more parsertests rt -- now upto 400. Change-Id: I144a6486dd9d07da8a74a68700fe96c78d192826	2012-06-29 00:30:13 -05:00
Subramanya Sastry	ba6a304102	Prettified Wikitext Constants hash * Something to be said for code alignment - easier on the eye! * Maybe a good case for breaking mediawiki coding guidelines. * But, happy to abandon commit if not useful. :) Change-Id: I1133af488f572ac7f8727be9108e08e14c4e6420	2012-06-28 19:08:48 -05:00
Subramanya Sastry	88fc91a292	Next round of image roundtripping fixes * Changed PrefixImageOptions so that thumb and thumbnail are distinct key-value pairs. Without this fix, cannot distinguish between thumb=foo.jpg and thumbnail=foo.jpg * Fixed link handler so whitespace is preserved around prefixed image options. * Fixed figure handler to process the 3 different kind of image options: size, simple image options, and prefixed image options. * There is a hack/fixme for "upright: aspect" prefixed image option which needs to be looked into. * Still need to fix uppercasing of the image resource name. With these fixes, the following wikitext roundtrips perfectly (after newline breaks are removed) [[Image:Foo.jpg|thumbnail = 'baby.jpg'|100x100px|center| alt =bbbbb| upright=true|bottom|link='http://foo.bar'| This is a [[Linked Caption]] in the image]] Change-Id: I6606df56874c2b97f00f08cb6bbeaec9878167d3	2012-06-28 18:55:47 -05:00
Subramanya Sastry	11e7c1031a	Created a constants object for extracting wikitext markup properties. * For now, extracted image markup options out of the link handler. * This info will also be used by the serializer. * More properties/global constants can be moved into this structure over time. Change-Id: I4cfbfd703f42e93fbad52b38b435f68d8a5c22ee	2012-06-28 17:45:17 -05:00
Subramanya Sastry	d9d584e1b8	Minor tweaks/fixes to LinkHandler * Minor refactoring * Cleared src in dataAttribs in renderThumb since we can serialize thumbs now (or at least we can once all bugs are fixed and missing pieces are handled). Change-Id: If18865801cdd3d89c1477e68bfa3e13107c45b40	2012-06-28 13:14:52 -05:00
Gabriel Wicke	9f753d8009	Source-based round-tripping for behavior switches Change-Id: I46d12d338314a8dbfdc9a8448a74680e67c3a720	2012-06-28 18:20:13 +02:00
Gabriel Wicke	39b82fc3fa	Simple source-based round-tripping for category links Change-Id: I5a8a03e74a95c6dceda432f0356cce6a3af77c67	2012-06-28 18:12:19 +02:00
Gabriel Wicke	ff414ad825	Add generic source round-trip mode, and use it for plain images (for now) Anything with data-gen="both" and dataAttribs.src defined serializes to dataAttribs.src and drops its contents (if any). We can use this to round-trip elements we don't properly parse or serialize yet. Without RDFa info, the editor will not touch the contents after encountering data-gen="both". Change-Id: Ia39e5fdd765c2c9b36f26313455685d29f118839	2012-06-28 17:44:26 +02:00
Gabriel Wicke	8976b66558	Merge "Fix round-tripping of invalid external links somewhat"	2012-06-28 15:30:31 +00:00
Gabriel Wicke	e1a7d10063	Fix round-tripping of invalid external links somewhat * Don't consider them for auto-numbered links * Don't insert a trailing space if the content is empty These links are still wrapped in nowiki on round-tripping since the valid/invalid url determination is done in the LinkHandler and not the Tokenizer as it is configuration-dependent. Not incorrect for rendering (and perhaps easier to understand for humans too), but might still introduce a dirty diff. We'll still need reconciliation / damage tracking in the end ;) Change-Id: I959ebc1b7f81d110a1141bb38ba5ee97f52ebf96	2012-06-28 16:12:23 +02:00
Gabriel Wicke	4f94492f08	Merge changes I27bdc9c5,Ic09972bb * changes: Update nowiki handling to latest spec; some fixes to it Default to two preceding newlines for headings for better readabilty	2012-06-28 14:02:00 +00:00
Gabriel Wicke	4dcd88fc5f	Merge "Fix a crasher in unbalanced heading tokenization"	2012-06-28 13:57:29 +00:00
Gabriel Wicke	198e55a32b	Update nowiki handling to latest spec; some fixes to it 346 round-trip tests are passing now (up from 343). Change-Id: I27bdc9c5e010a13c2b4dddc6f263cbf9d3adac36	2012-06-28 14:57:05 +02:00
Gabriel Wicke	5b4cb03ee4	Default to two preceding newlines for headings for better readabilty This only applies to newly created headings, so headings with a single newline preceding them will be round-tripped that way. Change-Id: Ic09972bbd25c3934b53f6fd3b5be5a0c3185c2af	2012-06-28 12:42:19 +02:00
Gabriel Wicke	17af335748	Fix a crasher in unbalanced heading tokenization Example input: === foo == Old result: http://www.mediawiki.org/w/index.php?title=VisualEditor:Test&diff=prev&oldid=554403 Change-Id: I0bc135884833607cedb62ec9c045310df3649dd8	2012-06-28 12:34:32 +02:00
Subramanya Sastry	f995fc025a	First pass serializing image thumbs. * Collect all figure tokens and process them as a chunk * This effectively mimics context-sensitive DOM walking, but since we need serialization supported on a token stream, we cannot use real DOM walking. The current technique should also work on a token stream. * There is a FIXME about the image filename being capitalized. This needs fixing in the parser or some other way of recognizing original unnormalized filenam. Amended by gwicke: * Build option list and join it with pipe to avoid stray trailing pipe * Satisfy JSHint's weird preference to have '&&' and '||' at the end of the line Change-Id: I1e5f6600f297fcdf81e3227a82ca3b71d4e97fc3	2012-06-28 11:29:10 +02:00
Gabriel Wicke	424a246b00	Rename data-mw-gc to data-gen. Credit to James! Change-Id: Iacbe20b355ddf5f12fffb71ff4dd978ac4364928	2012-06-27 19:08:14 +02:00
Gabriel Wicke	df26663a3f	Add basic tsr on indent-pre end tag This is a zero-length tsr for now (and thus not 100% correct), but will do the job for starttag / endtag range establishment Change-Id: Iedd50ad319aa8d5916434fb6744deb04e031e456	2012-06-27 18:08:49 +02:00
Gabriel Wicke	c02218c736	Merge changes Idfa5d6a8,I700142a5 * changes: Represent nowiki as span instead of meta Round-trip html entities and introduce data-mw-gc attribute	2012-06-27 16:07:48 +00:00
GWicke	d4eb4ce741	Merge "Code cleanup and more newline fixes."	2012-06-27 13:26:22 +00:00
Subramanya Sastry	4d2a46fb44	Code cleanup and more newline fixes. * Removed dead commented out code. * Cleaned up newline handling in serializer some more. * Now, onNewLine and onStartOfLine reflect serializer state more accurately. * No implicit new lines for explicit html tags. * 9 more roundtrip tests now green. Change-Id: I9f640de2ae769c7472538fa687400dc8a40c2b2d	2012-06-27 15:23:22 +02:00
Gabriel Wicke	a1d05976ce	Merge "Small (and incomplete) fix to table cell tsr"	2012-06-27 12:45:39 +00:00
Gabriel Wicke	53451bfc50	Small (and incomplete) fix to table cell tsr Change-Id: I14347939de32af698d7ce0b649165982908c49aa	2012-06-27 14:45:12 +02:00
Gabriel Wicke	7108ee985a	Represent nowiki as span instead of meta Change-Id: Idfa5d6a8ee7b2d17205779361ca69d075a79964d	2012-06-27 13:59:14 +02:00
Gabriel Wicke	0b9a420129	Round-trip html entities and introduce data-mw-gc attribute 297 round-trip tests are passing with this patch. TODO: * generalize data-mw-gc handling in the serializer for any tag * use data-mw-gc="both" and data-mw.src: 'the wikitext' for round-tripping of wikitext structures, optionally with some presentational (but read-only) content * use span and data-mw-gc="both" for nowiki Change-Id: I700142a56818977c20c8c06e6a5f2e77a708d25e	2012-06-27 12:52:52 +02:00
Subramanya Sastry	1a504a5f54	Added tokenizer support for ---- Change-Id: Idc5519350d11ae91b2ec64553f847d56e22d63bb	2012-06-25 16:40:34 -05:00
Subramanya Sastry	d5e6ec34aa	Deleted dead PEG productions Change-Id: I9b859f79f9900b3d320aa1ad0283a4b5ae6c4331	2012-06-25 13:17:01 -05:00
Gabriel Wicke	08b5ed1a43	Use _inNewlineContext method instead of bare onNewline This makes sure that we escape start-of-line syntax when needed, since onNewline is often not yet set. Discussion / background: [19:18] <subbu> this will fix it, but, i think this is asking for another minor refactoring of these flags ... because this is a subtle fix which means it might be possible to make it clearer. onNewline is one true in on direction, i.e. if true, we are in a new line state, but if we are in a newline context, onNewline is not true, which is why this new method is needed. [19:19] <subbu> i dont know if it is possible, but it seems like it shoudl be possible. but, something for later. [19:20] <subbu> badly phraed. "onNewline" ==> in new line context, but if in new line context, onNewline may be false. [19:20] <gwicke> we should perhaps update it as early as possible instead [19:21] <subbu> i cannot today, but possible monday. i am heading out in about 15-30 mins. [19:22] <gwicke> will need to check all conditions depending on it in _serializeToken [19:22] <subbu> oh, i misunderstood you :) [19:22] <gwicke> and if there are cases where the onNewline / onStartOfLine state could be reverted later [19:23] <subbu> you were referring to the flag, i thought you meant we should fix this sooner than later. [19:23] <gwicke> yes, I wasn't terribly clear [19:23] <gwicke> you wrote something about following productions swallowing newlines, but I think we don't actually do that any more [19:24] <gwicke> I'm quite optimistic that updating those flags much earlier would work [19:25] <subbu> yes, it could fix it. [19:26] <subbu> you might be right reg. swallowing. it was happening earlier. but, not right now, after single-line mode and other fixes. Change-Id: Ic1d8141c04eb54a59977d0ba87bcf06bafd421e0	2012-06-23 19:27:56 +02:00
Gabriel Wicke	d4dc8d86d9	Entity-escape [<>] in text content This should not really be needed if the tokenizer did not decode html entities on the fly. It is still a quick way to make sure no htmlish content can be inserted even with the current decoding. The next step and proper fix is to make entity decoding either optional in the tokenizer (flag-controlled), or move it to a later stage in the token processing pipeline. Change-Id: Ife093dcfb95113763dab5635b098c795d3550586	2012-06-23 17:06:10 +02:00
Subramanya Sastry	5f584909e1	Added documentation + minor code refactoring * Renamed defaultOptions to initialState * Got rid of unused state property * Added comments explaining how state attributes and tag handler flags are used * Refactored listItemHandler check into functions and added FIXME possible rewriting of that check. * Protected serializeDOM in a try-catch handler to catch exceptions and output the exception to the console. Change-Id: I3d351c06e4b86baeb5a55243b11dbfa9baca5bb7	2012-06-22 18:29:46 -05:00

463 commits