- Eliminated newline handling from several places in code and
mostly isolated it to serializeToken thus simplifying newline
handling logic.
- Fixed some bugs in the process: the number of green roundtrip tests
went up by 5 (294 --> 299), but this also introduced failures on
a few originally succeeding tests (additional leading/trailing
newlines on the entire test output).
- Added bonus: made list serializing (mostly) insensitive to
newlines between tags. So, all of the following DOMs serialize
to the same wikitext:
*foo
*bar
----------
<ul><li>foo</li><li>bar</li></ul>
----------
<ul>
<li>foo</li>
<li>bar</li>
</ul>
----------
<ul>
<li>
foo
</li>
<li>
bar</li>
</ul>
----------
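A minimal sketch of the idea, not the actual serializeToken code: it
assumes a hypothetical plain-object node shape ({ name, children } for
elements, { text } for text) and handles only a flat <ul>. Whitespace
between tags is skipped and whitespace just inside <li> is trimmed, so
the compact and the spread-out DOM produce the same wikitext.

```javascript
// Hypothetical node shape: { name, children } for elements, { text } for text.
function serializeList(ul) {
  const lines = [];
  for (const child of ul.children) {
    // Anything that is not an <li> (here: whitespace text between tags) is skipped.
    if (child.name !== 'li') {
      continue;
    }
    const text = child.children
      .map(function (c) { return c.text || ''; })
      .join('')
      .trim(); // drop newlines just inside <li>...</li>
    lines.push('*' + text);
  }
  return lines.join('\n');
}

const compact = { name: 'ul', children: [
  { name: 'li', children: [{ text: 'foo' }] },
  { name: 'li', children: [{ text: 'bar' }] }
] };
const spaced = { name: 'ul', children: [
  { text: '\n' },
  { name: 'li', children: [{ text: '\nfoo\n' }] },
  { text: '\n' },
  { name: 'li', children: [{ text: '\nbar' }] },
  { text: '\n' }
] };

console.log(serializeList(compact) === serializeList(spaced));
```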
Change-Id: I76be56c4b2789039dff5f47de4659746882e45d6
Parsoid ignores sHref when converting back to wikitext, so we have to
set the href attribute to "/$title"
Change-Id: I1068116c0be72197619d0df3b4d1231a3879fa14
And made it not start on its own, but be started by ve.Surface. This way it's not polling in the unit tests, for instance.
Change-Id: I940df04d392fd134d18847949efe0e2232328323
* As part of an earlier fix, I had changed the default value of 'res'
to null instead of ''. But, this was potentially buggy because
the previous check was (res !== '') which could be triggered
by return values of handlers. By changing the check to null,
I was effectively changing the code paths for those handlers that
returned ''.
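To illustrate the hazard, here is a sketch with hypothetical handler
loops (the function names and the loop shape are assumptions, not the
real code). With '' as the sentinel, a handler returning '' means "keep
looking"; with null as the sentinel, the same return value ends the
search.

```javascript
// Hypothetical dispatch loops; each handler returns a string, possibly ''.
function firstResultEmptySentinel(handlers, token) {
  let res = '';
  for (const handler of handlers) {
    res = handler(token);
    if (res !== '') { break; } // '' means "no result yet, keep looking"
  }
  return res;
}

function firstResultNullSentinel(handlers, token) {
  let res = null;
  for (const handler of handlers) {
    res = handler(token);
    if (res !== null) { break; } // '' now counts as a final result
  }
  return res;
}

const handlers = [
  function () { return ''; },
  function (t) { return '<' + t + '>'; }
];

console.log(JSON.stringify(firstResultEmptySentinel(handlers, 'x')));
console.log(JSON.stringify(firstResultNullSentinel(handlers, 'x')));
```

The first call reaches the second handler; the second call stops at the
first handler's ''.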
Change-Id: I2302023be7422ce4fb384ff5a50fe53fa7732855
Fix Inspector bug which prevented applying a link annotation to data
already containing annotations.
Change-Id: I6f315d50805c8c71f2155f955ea5674a7ce98656
This was causing some issues when you started typing there, and it's not clear that the white text is really the best way to go anyway.
Change-Id: I8a9d6571ea204603729e96b7ff77184279a31a95
The offset map was broken from the start because it wasn't updated when
adjusting the length of a text node. Even if we fix that bug, it's
doubtful whether the benefits of the offset map outweigh the cost of
updating it, especially considering that adjusting the length of a text
node is something we do for almost every keypress. If it turns out
having an offset map does make sense, we can always reintroduce it
later.
Change-Id: I59e8bc154f7d07aa1bab2f473c13ff466d0e463f
paragraphs in lists.
* We need to look at other special-case handling requirements of
html tags in lists (and other contexts like tables).
Change-Id: I84b8402d90a186c9075c2d45263c94377312927a
Parsoid outputs rel="mw:wikiLink" or rel="mw:extLink", so we convert that to link/wikiLink and link/extLink respectively.
Also preserve the data-mw attribute; we probably need to do this more generally but this'll do for now.
Change-Id: I32e570bffa5a73a733a120d52cfd8b75d3191e02
Adding a 'style' attribute which is set to either 'data' or 'header'
This breaks even more tests because of missing style attributes
Change-Id: I0a75d8c1578b4414eeae8c484f6c4d6f8a59472a
When a text node was closed (either by encountering a non-text node or by reaching the end of the document), the fixupStack would not be processed. This led to attributes being dropped when nodes were split because of text being inserted into them
Change-Id: I41f6d20e0c1bfc8d8689b7e6325e724dd8156ab1
* Add code to handle elements and annotations
* Drop support for aliens from getDomElementFromDataElement() and move it into getDomFromData()
* Implement getDomElementFromDataAnnotation()
* Document a few functions
Change-Id: Ic6a418cbf9d7d1ad96299d7d3633970a876c6103
it does not include Alien nodes and Images - because we are not going to
support them for June release).
Change-Id: I229e4b5f2881714252699f23aef164655fa8bcf6
* Moved wikipedia default prefixes to environment
* Added 'addInterwiki' method
* Adjusted link handling normalizeTitle to reflect this
Change-Id: If5b2314cc36346b6da8649ed410457a612d80a22
* mw:Foo now loads pages from mediawiki.org
* The default prefix is still 'en'. You can switch this to 'mw' in ParserService.js.
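A rough sketch of prefix-based lookup, under stated assumptions: the
Environment class, the resolve() helper, the (prefix, apiUrl) signature
of addInterwiki and the API URLs are all hypothetical and only
illustrate the behaviour described above.

```javascript
// Hypothetical environment holding an interwiki prefix table.
class Environment {
  constructor() {
    this.interwikiMap = {};
    this.defaultPrefix = 'en';
  }

  // Assumed signature: register the API endpoint used for a prefix.
  addInterwiki(prefix, apiUrl) {
    this.interwikiMap[prefix] = apiUrl;
  }

  // Split off a known prefix; otherwise fall back to the default wiki.
  resolve(title) {
    const idx = title.indexOf(':');
    if (idx > 0) {
      const prefix = title.slice(0, idx);
      if (Object.prototype.hasOwnProperty.call(this.interwikiMap, prefix)) {
        return { api: this.interwikiMap[prefix], title: title.slice(idx + 1) };
      }
    }
    return { api: this.interwikiMap[this.defaultPrefix], title: title };
  }
}

const env = new Environment();
// Both URLs are assumptions made for this example.
env.addInterwiki('en', 'https://en.wikipedia.org/w/api.php');
env.addInterwiki('mw', 'https://www.mediawiki.org/w/api.php');

console.log(env.resolve('mw:Foo'));
console.log(env.resolve('Foo'));
```

An unknown prefix such as 'Help:' is left in the title and the page is
looked up on the default wiki.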
Change-Id: I1208667e6114bd711b7988a8b3adb32ffab70969
- Three bugs that were messing up quote transformations.
- Now, the following cases are handled properly:
* ''foo'''
* '''foo''
* ''foo''''
* ''''foo''
These tests (and other quote tests) have to be added to core parser
tests file.
- One more parser test green.
Change-Id: I4f93e8910639f546bfc9304becab17d26d5529de
Created new document method to determine if a specific annotation
object is inside an annotation array.
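A sketch of what such a lookup could look like. The annotation shape
({ type, data }) and both function names are assumptions for
illustration; the real method may compare annotations differently.

```javascript
// Hypothetical annotation shape: { type: 'link/wikiLink', data: { title: 'Foo' } }.
// Note: comparing via JSON.stringify is sensitive to key order in `data`.
function sameAnnotation(a, b) {
  return a.type === b.type &&
    JSON.stringify(a.data || {}) === JSON.stringify(b.data || {});
}

// True if an equivalent annotation is present in the array.
function containsAnnotation(annotations, annotation) {
  return annotations.some(function (candidate) {
    return sameAnnotation(candidate, annotation);
  });
}

const applied = [
  { type: 'textStyle/bold' },
  { type: 'link/wikiLink', data: { title: 'Foo' } }
];

console.log(containsAnnotation(applied, { type: 'textStyle/bold' }));
console.log(containsAnnotation(applied, { type: 'link/wikiLink', data: { title: 'Bar' } }));
```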
Change-Id: Id645929cbf31030b8b0fcacb8dfb36e61aaad129
This fixes a bug where the second replace operation in a transaction
would cause the rebuild of the wrong range, or the adjustment of the
wrong text node.
Change-Id: I9b1c68d84999d538fe10bb193f4dfdd694121d2a
This is needed to make the results of certain transactions' tree sync
round-trip cleanly through the ve.dm.Document constructor
Change-Id: I2ab0758ec6bd7afba5b6645c7330f9fa2d45205d
new static method looks for annotation in annotation object.
ve2 Cleanup on annotate method and surface model.
Partially revive UI tools by exchanging old method usage
for ve2 methods.
Change-Id: Id0ac58330292d76801bbcf1d71a919b493f8ab9e
An improvement, but there still are some extra newlines inserted after
paragraphs. Example input:
-------
Foo:
{|
|foo
|}
-------
Extra newlines are inserted after the Foo: and the foo in the table. They are
not fed as tokens or text to the tree builder, so there is likely a bug in the
html5 library or JSDom.
Change-Id: I83eb6180e3cd1c4e7f9b15b31d339e1d32bccd3f
* Possibly more efficient under heavy GC load -- untested.
* No change in time and memory use for single file parsing.
Change-Id: Id2f3f65cc0e5f38ed968bbda60b97e46523e700e
* Moved the tail attribute to the second attribute (a bit cleaner)
* Disallowed newlines in the tail production
* Improved the selection of round-tripped href vs. generated content vs. href
in the serializer
* renamed state.linkTail to state.dropTail
Change-Id: I5d98c704b6ea566011e22237786f8da17548570f
Page titles with a Wikipedia interwiki prefix now load the page from the
corresponding Wikipedia. Links in a page then stay within the given language.
Note that Parsoid currently makes no effort to recognize localized namespaces,
so it won't render media files, categories etc correctly.
Change-Id: I7bc4102e81a402772ea23231170734d580ea15b9
Functional changes (fixes):
* Make writeElement() also update parentNode and parentType for openings
* Also add to fixupStack when opening a wrapper for a text node
Non-functional changes (cleanup&docs):
* Document all variables at the beginning of the function
* Group variables according to where/how they're used
* Move expectedType into writeElement()
* Kill node, duplicates parentNode unnecessarily
* Kill paragraphOpened, was misnamed and unnecessary
* Rename closedElements to reopenElements
Change-Id: Ie5b4e4f30b267943048fdc170accb29139039192
* Push entire elements onto openingStack rather than type strings
* When closing an element, build a clone of the opening and push it onto
closedElements, then insert that clone when reopening the element
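The two points above can be sketched as follows. createWriter and its
method names are hypothetical, and the element shape ({ type,
attributes }) is an assumption; the point is that storing whole
elements lets the reopened element keep its attributes.

```javascript
// Hypothetical writer that can close an element and reopen it later.
function createWriter() {
  const output = [];
  const openingStack = [];   // full opening elements, innermost last
  const closedElements = []; // clones waiting to be reopened

  function open(element) {
    openingStack.push(element);
    output.push(element);
  }

  function closeForReopen() {
    const opening = openingStack.pop();
    // Clone the opening so the reopened element keeps its attributes.
    closedElements.push({
      type: opening.type,
      attributes: Object.assign({}, opening.attributes)
    });
    output.push({ type: '/' + opening.type });
  }

  function reopen() {
    // Elements were closed innermost first, so reopen outermost first.
    while (closedElements.length > 0) {
      open(closedElements.pop());
    }
  }

  return { output: output, open: open, closeForReopen: closeForReopen, reopen: reopen };
}

const writer = createWriter();
writer.open({ type: 'paragraph', attributes: { align: 'center' } });
writer.closeForReopen();
writer.reopen();
console.log(JSON.stringify(writer.output));
```

With only type strings on the stack, the reopened paragraph would have
lost its align attribute.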
Change-Id: I8b0fb44394aed6c471dc6dacaab03e44c2333733
* Don't explicitly add the newline in the pre, as we preserve newline tokens
now. This avoids doubling of newlines when round-tripping.
* Use the sHref attribute even if the href contains spaces.
Change-Id: I8bec8fbfd6a7836bf2e5eec20869a0edd95c93b6
Lists interrupted by non-empty lines would not close the list properly.
Register for any token instead of just for newlines and close the list if no
listItem follows the newline.
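A line-based stand-in for the behaviour, not the token handler itself:
wrapLists is a hypothetical helper that works on whole lines rather
than tokens, but it shows the same rule, namely that any non-list line
closes an open list.

```javascript
// Hypothetical line-based illustration; the real code works on tokens.
function wrapLists(lines) {
  const out = [];
  let inList = false;
  for (const line of lines) {
    if (line.startsWith('*')) {
      if (!inList) {
        out.push('<ul>');
        inList = true;
      }
      out.push('<li>' + line.slice(1) + '</li>');
    } else {
      // Look at every line, not just empty ones, so the list gets closed.
      if (inList) {
        out.push('</ul>');
        inList = false;
      }
      out.push(line);
    }
  }
  if (inList) {
    out.push('</ul>');
  }
  return out;
}

console.log(wrapLists(['*a', '*b', 'text', '*c']).join('\n'));
```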
Change-Id: I1743901e3db541bbeda78d17707db943e6ceb9b9
If the href would not denormalize, add a copy of the original href in data-mw
and use it to preserve non-conventional capitalization etc.
Change-Id: Ifef50eec7343b0e6b0ba66b6d19a8a3e8c9f8001
A tail containing regexp syntax (a ? in [[:en:Main Page]]) would crash the
serializer. Use substr instead.
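A small sketch of why plain string operations are safer here. stripTail
is a hypothetical helper, not the serializer's actual function; the
RegExp behaviour shown is standard JavaScript.

```javascript
// Building a pattern from an unescaped tail fails for regexp metacharacters.
let regexpFailed = false;
try {
  new RegExp('?' + '$'); // a tail of '?' is not a valid pattern
} catch (e) {
  regexpFailed = e instanceof SyntaxError;
}

// Hypothetical helper: remove a trailing tail using substr only.
function stripTail(text, tail) {
  const start = text.length - tail.length;
  if (tail !== '' && start >= 0 && text.substr(start) === tail) {
    return text.substr(0, start);
  }
  return text;
}

console.log(regexpFailed);
console.log(stripTail('Main Page?', '?'));
```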
Change-Id: I8519aec9c07dfe31893d676b1c936a42d2af74a0
- Added a tail json attribute for wikiLinks
- During serialization, this attribute is used to strip the tail from
the link target and render it after the link
[[hen]]s ==> <a ... data-mw="{gc:1, tail: 's'}" ...>hens</a>
==> [[hen]]s
- 2 more roundtrip tests green
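The round trip above could look roughly like this. serializeWikiLink is
a hypothetical name, and it assumes the link content and the parsed
data-mw object are already at hand; only the tail handling is shown.

```javascript
// Hypothetical serializer for a simple wikiLink whose target equals its text.
function serializeWikiLink(content, dataMw) {
  const tail = (dataMw && dataMw.tail) || '';
  if (tail !== '' && content.length > tail.length && content.endsWith(tail)) {
    // Strip the tail from the target and emit it after the closing brackets.
    const target = content.slice(0, content.length - tail.length);
    return '[[' + target + ']]' + tail;
  }
  return '[[' + content + ']]';
}

console.log(serializeWikiLink('hens', { gc: 1, tail: 's' }));
```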
Change-Id: I84f3dabaf0271f7a67641a00148467daa8310eb0
This allows us to check the watchlist checkbox on save dialog.
Added watchlist toggling to ve save api.
Added some i18n messages to core integration.
Change-Id: Ibed8edb2c59ad49e1738c937c3bea518238d0845
* The state of syntax stops is now properly included in the cache key for the
tokenizer-internal backtracking cache. This fixes some mis-parses when
re-parsing a bit of text with different flags.
* Clear the backtracking cache after each toplevelblock. This drops the peak
memory usage when expanding [[:en:Barack Obama]] from ~380M to ~110M.
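A sketch of both changes under stated assumptions: BacktrackCache, its
methods and the shape of the stops object (flag name to boolean) are
hypothetical. The key includes the active syntax stops, and clear() is
meant to be called after each top-level block.

```javascript
// Hypothetical memo cache for tokenizer backtracking results.
class BacktrackCache {
  constructor() {
    this.entries = new Map();
  }

  // The key covers rule, position and the set of active syntax stops.
  static key(rule, pos, stops) {
    const active = Object.keys(stops)
      .filter(function (name) { return stops[name]; })
      .sort()
      .join(',');
    return rule + '@' + pos + '|' + active;
  }

  get(rule, pos, stops) {
    return this.entries.get(BacktrackCache.key(rule, pos, stops));
  }

  set(rule, pos, stops, result) {
    this.entries.set(BacktrackCache.key(rule, pos, stops), result);
  }

  // Dropping all entries after a top-level block bounds peak memory.
  clear() {
    this.entries.clear();
  }
}

const cache = new BacktrackCache();
cache.set('inline', 0, { equal: true }, 'parsed with stop');
console.log(cache.get('inline', 0, { equal: false })); // different flags: miss
cache.clear();
```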
Change-Id: Icdb879cae5907e4595903dd6acba2e686e8c2e4b
* Added converters to all relevant node implementations
* Added new annotation objects with their own factory
Change-Id: I9870d6d5eac45083929d74d2e58917d0939ca917
Also:
* Refactored tests
* Added tests for ve.dm.Transaction.newFromInsertion
* Added tests for ve.dm.Transaction.newFromRemoval
* Fixed problems with ve.dm.Transaction.newFromInsertion
* Added ve.dm.Node.canBeMergedWith which is partially a port of ve.Node.getCommonAncestorPaths merged with canMerge from within ve.dm.DocumentNode.prepareRemoval from the old ve codebase
Change-Id: Ibbc3887d08286d8ab33fd6296487802d65b319fa
* This routine attempts to rewrite the DOM to maximize tag overlap
and thus minimize tag uses.
* This takes as input a set of tags which participate in the
minimization.
* Tested on the following example
<b><i><u><s>BIUS</s></u></i></b><b><i><s>BIS</s></i></b><b><u><s>BUS</s></u></b><u><i>UI</i></u>
with multiple combinations of the 2^4 possible variations of i,b,u,s
tags: [], ['i','b','u','s'], ['i'], ['b','s'], ['i','b','u']
- But, I am not fully sure if this implements the right behavior when
only a subset of inline tags are provided. Needs discussion and tweaking
as necessary.
* Also tested on a few others:
<b>B</b><b><i>BI</i></b><b><i><u>BIU</u></i></b><b><i><u><s>BIUS</s></u></i></b>
<s><i><b>SIB</s></i></b><s><i><u>SIU</u></i></s><i><u>IU</u></i><i>I</i>
* The previous pairwise tag rewriting version fails on several of these
examples, so this new version is a definite improvement.
* No change in parserTests run (203 passing before and after).
* Possible improvements that could/should be undertaken:
- get rid of useless/idempotent add/remove of nodes that don't change
the DOM.
- ensure that node attributes post-restructuring are correct.
Change-Id: Ib4a8b39583fa96a2be880a77021ca81cefa06484
Copy-pasting things like "text<IMAGE>moretext" failed spectacularly,
this commit fixes that.
* Check for content rather than structure in the inserted/removed data
* In the content case
** Run selectNodes() over the removal range, rather than just the cursor
*** i.e. no longer assume that content replacements only affect one node
** If there is structure involved, rebuild all affected nodes
Change-Id: I80e40b5b7c514a3fb105d57e4a17770d0fefaaea
Some of the replacement code was assuming that "does not contain
elements" and "is content" were the same. They're not any more, because
we have content nodes (like image) now, so I need a separate function
to distinguish between these cases.
Change-Id: I206ccdf082b7baddf99d382eb3cdd77ea34fb479
If the last element of the input data array was text, the resulting text
node would have length=0 rather than the expected length value.
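One general way a trailing text run can end up with the wrong length is
when the run is only finalized on encountering an element. The sketch
below is an assumption about the pattern, not the actual fix:
textRunLengths is a hypothetical helper that counts characters per text
run in a linear data array and flushes the last run after the loop.

```javascript
// Hypothetical: strings are characters, objects are element openings/closings.
function textRunLengths(data) {
  const lengths = [];
  let run = 0;
  for (const item of data) {
    if (typeof item === 'string') {
      run++;
    } else if (run > 0) {
      lengths.push(run); // an element ends the current text run
      run = 0;
    }
  }
  if (run > 0) {
    lengths.push(run); // flush a run that reaches the end of the data
  }
  return lengths;
}

console.log(textRunLengths([{ type: 'paragraph' }, 'a', 'b', { type: '/paragraph' }, 'c', 'd', 'e']));
```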
Change-Id: I3d089a80b8a447a12ba411b2e11c1b84f14f2959
To allow non-sysops to save via VE, refactored the ve save api
to use doEdit, which bypasses namespace protection.
Add Edit link in view nav for non-sysops so that they may edit
Add View source link in dropdown for non-sysops
Add Edit source link in dropdown for sysops
Cleaned up some of the integration core code
UI tweaks
Change-Id: Ib4249bc5fb7ffa6410e4f2d278aafbb871800981
WARNING: This is not as fast as the implementation of getNodeFromOffset in dm
Change-Id: I5fbe9b6edc66169b9caaa6751fde1b7b752814d1
NOTE: ve.ce.getNodeFromOffset and ve.dm.getNodeFromOffset should be renamed to getBranchNodeFromOffset to clarify that they only return branch nodes.