unicodejs.graphemebreak.js
* New file: singleton class with splitClusters method
* On load, builds graphemeBreakRegexp from unicodejs.graphemebreakproperties.js
unicodejs.js
* Remove old splitClusters method (was just a placeholder)
* Change "conjunction" -> "disjunction", for consistency and correctness
unicodejs.textstring.js
* Use new splitClusters method
modules/ve/ve.js
* Use new splitClusters method
unicodejs.wordbreak.text.js
* Add new splitClusters test
* Refactor charRangeArrayRegexp test to use splitClusters
PHP files
* add unicodejs.graphemebreak.js, unicodejs.graphemebreakproperties.js
.docs/categories.json
* add unicodeJS.wordbreak class
Change-Id: I8f512e2fc2c46eb4b5f00994a8dac88f3c8f7dd2
unicodejs.js:
* charRangeArrayRegexp to write surrogate-aware regexps
* private helper functions
unicodejs.wordbreak.test.js:
* test charRangeArrayRegexp
* corrected tests for non-BMP wordbreaks
unicodejs.wordbreak.js:
* use new surrogate-aware regexps
unicodejs.wordbreakproperties.js:
* generated from Unicode data
unicodejs.graphemebreakproperties.js:
* generated from Unicode data
unicodejs.wordbreak.groups.js:
* delete as no longer used
unicodejs-properties.py:
* generate unicodejs.wordbreakproperties.js from Unicode data
* generate unicodejs.graphemebreakproperties.js from Unicode data
index.php:
* update script tag links
/VisualEditor.php:
* update script tag links
/demos/ve/index.php:
* update script tag links
/maintenance/makeStaticLoader.php:
* update script tag links
Change-Id: I39c0386a85b0cf21d68d3385b84018a5d7648de5
unicodejs.js:
* add splitClusters(text) and splitCharacters(text) methods
unicodejs.textstring.js:
* change internal representation from a char string to a list of grapheme
clusters
unicodejs.wordbreak.js:
* change getGroup to work on the first character of a grapheme cluster
ve.js:
* Use new unicodejs.splitClusters function
Bug: 48975
Change-Id: I202b98199d2780534d1e02519b72579ba796f08f
Initially just with a Wordbreak module to implement Unicode standard
on 'Default Word Boundaries'. Due to it's standaloneability this has
been written as a separate library. Non-BMP characters are currently
not supported.
Bug: 44085
Change-Id: Ieafa070076f4c36855684f6bc179667e28af2c25