Commit graph

3 commits

Author SHA1 Message Date
Brad Jorsch fcef54e9d9 Fix ustring errors
* mw.ustring.sub( '', 1 ) errors in LuaStandalone
* Default value for ustring.maxStringLength and ustring.maxPatternLength
  should be infinity, not nil
* mw.ustring.find() returns one value instead of two in "plain" mode.

Change-Id: I5e65c4ec3a05f0e6930ce7ab7fd4ac72bea95e7f
2013-03-05 12:22:15 -05:00
Brad Jorsch 4dcac2fcd9 Fix mw.ustring.gmatch and patterns with '^'
The Lua manual says this:

 For this function, a '^' at the start of a pattern does not work as an
 anchor, as this would prevent the iteration.

I had interpreted that to mean that a pattern starting with '^' would
never match in gmatch. But further testing reveals that the '^' is just
treated as a literal character: string.gmatch( "foo ^bar baz", "^%a+" )
will match "^bar".

Change-Id: Id91d6ee2db753ce1d6a4f6ae27764691d9e9fdc4
2013-02-14 14:25:55 -05:00
Brad Jorsch 0a8757baba Lua ustring implementation
This is a reimplementation of Lua's string library with support for
UTF-8.

The entire ustring library is implemented in pure Lua. PHP callbacks are
also available for overrides: in LuaSandbox these are used for almost
all functions, while in LuaStandalone they are used only for the pattern
matching. Also, ustring.upper and ustring.lower are overridden using
mw.language's .uc and .lc if available.

It also includes a bunch of unit tests.

Note that if you download the normalization tests, they may fail under
LuaSandbox if you have PHP's intl extension installed and libicu on your
system is too old.

Change-Id: Ie76fdf8d3a85d0a3d2a41b0d3b7afe433f247af0
2013-02-12 14:26:29 -05:00