normalization-data.lua is updated to Unicode 6.3.0.
upper.lua and lower.lua are updated to match HHVM 3.12.1's mb_strtoupper
and mb_strtolower. I don't know what version of Unicode that might be,
but it seems old.
Bug: T86096
Change-Id: I1a0c8be2756f86db5f36dd67319a1f79aea98b3e
An empty pattern isn't "safe" since it could match in between the
bytes of a UTF-8 character.
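The byte-level effect can be reproduced in stock Lua, outside Scribunto. A minimal sketch, assuming the two-byte UTF-8 encoding of U+00E9 as sample input (the variable names are illustrative only):

```lua
-- "é" (U+00E9) is two bytes in UTF-8: 0xC3 0xA9.
local s = "\195\169"

-- The byte-oriented string library matches an empty pattern at every byte
-- boundary, including the one in the middle of the character.
local result, count = string.gsub(s, "", "-")

print(count)    -- number of empty matches
print(#result)  -- length of the result in bytes
```

A Unicode-aware gsub would only be expected to match before and after the character, so the two results differ and the pattern can't be delegated to the string library.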
Also, it turns out there's a bug in PHP <5.6.9 preg_replace() that we
need to work around too.
Change-Id: I282e5909e4663461d60c5386693db182de2fd44c
Provides a simple wrapper for PHP's hash() and
hash_algos() functions.
I will add docs to the Lua reference manual once
this is merged.
Bug: T142585
Change-Id: I6697463974a175e99f9b77428a1085247165ebc9
For the PHP implementation, return the codepoints as a table instead of
multiple return values that get table-ified in Lua, to avoid hitting
too-many-values stack limits.
For the pure-Lua version, inline most of ustring.codepoint instead of
calling it to avoid what's effectively "{ unpack( stuff ) }".
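The idea of building the table directly can be sketched in plain Lua. This is not the code from this change: the function name codepointsAsTable is hypothetical, and the decoder assumes the input is already valid UTF-8.

```lua
-- Decode a valid UTF-8 string into a table of codepoints, appending each
-- value directly instead of returning them all and wrapping the call in
-- "{ ... }", which has to push every value onto the stack first.
local function codepointsAsTable(s)
    local out = {}
    local i, n = 1, #s
    while i <= n do
        local b = string.byte(s, i)
        local cp, len
        if b < 0x80 then
            cp, len = b, 1          -- ASCII
        elseif b < 0xE0 then
            cp, len = b - 0xC0, 2   -- two-byte sequence
        elseif b < 0xF0 then
            cp, len = b - 0xE0, 3   -- three-byte sequence
        else
            cp, len = b - 0xF0, 4   -- four-byte sequence
        end
        -- Fold in the continuation bytes, six bits each.
        for j = i + 1, i + len - 1 do
            cp = cp * 64 + (string.byte(s, j) - 0x80)
        end
        out[#out + 1] = cp
        i = i + len
    end
    return out
end

local cps = codepointsAsTable("a\195\169\226\130\172")  -- "a", U+00E9, U+20AC
print(#cps)
```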
Bug: T118687
Change-Id: I105f388cc23ab55d4124739700ef89d5354b7dbc
mw.ustring is really, really slow. I've discovered that in a lot of modules
on enwiki, upwards of 2/3 of the total runtime gets used when mw.html
calls mw.ustring.gsub. This change checks whether any Unicode characters
are present, and if not, calls string.gsub instead.
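A minimal sketch of that check in stock Lua. The helper names hasNonAscii and pickGsub are hypothetical, and the Unicode-aware function is passed in as a parameter so the sketch runs without Scribunto; the actual check in mw.html may differ.

```lua
-- True if the string contains any byte outside the ASCII range.
local function hasNonAscii(s)
    return string.find(s, "[\128-\255]") ~= nil
end

-- Choose which gsub to call: the byte-oriented string.gsub is only
-- used when both the subject and the pattern are pure ASCII.
local function pickGsub(s, pattern, unicodeGsub)
    if hasNonAscii(s) or hasNonAscii(pattern) then
        return unicodeGsub
    end
    return string.gsub
end

-- Placeholder for mw.ustring.gsub (assumption: not available here).
local function placeholderUnicodeGsub() end

print(pickGsub("plain text", "%s+", placeholderUnicodeGsub) == string.gsub)
```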
Change-Id: Ia50061584be3901ae7428354c449236225c318db
Core strip markers were changed in T110143 to include characters that
are normally encoded in attributes. However, we want to pass them through
here so they can be unstripped correctly in the output wikitext.
This fix makes "Strip markers in CSS" parser test pass again.
Bug: T110143
Bug: T135961
Change-Id: I1353931a53c668d8a453dfa2300a99f59fdb01c5
The following continue to be ignored:
* Generic.Arrays.DisallowLongArraySyntax.Found, because I'm not sure
Scribunto is ready to abandon old version support in master.
* MediaWiki.ControlStructures.AssignmentInControlStructures.AssignmentInControlStructures,
because it's overly strict for its purpose.
Squiz.Classes.ValidClassName.NotCamelCaps isn't ignored globally; we
just ignore it explicitly in every place it's needed.
Change-Id: I307668da6ef7b3e23da19b1fd1e08914239b99b3
This also makes some updates to make-normalization-table.php to handle
the move of UtfNormal to a separate library.
Bug: T126427
Change-Id: Id4985c3ca441cf92f08ba1f1af85c762ba43d7d2
The new Scribunto_LuaTitleLibrary::redirectTarget() method is
used by mw.title objects as the read-only attribute 'redirectTarget'.
If the page does not exist or is not a redirect, the value
of the attribute is `false`; otherwise, it is the target of the
redirect page, as an mw.title object.
This is a proper alternative to parsing wikitext as it is done in:
https://en.wikipedia.org/wiki/Module:Redirect
Bug: T68974
Change-Id: Id4d9b0f8c1cd09ebc42c031d4d3fc0c33eea44aa
Lua actually treats a close-bracket at the start of a bracketed
character class as a literal, rather than using it to close the
character class. Probably unintended behavior, but it happens.
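The string library's behavior can be checked in stock Lua. A short sketch with made-up sample strings:

```lua
-- "[]]" is a class containing only "]".
local first = string.find("a]b", "[]]")

-- "[^]]" is its complement: anything except "]".
local second = string.find("]]x", "[^]]")

-- "[]a]" contains both "]" and "a".
local stripped = string.gsub("a]b", "[]a]", "")

print(first, second, stripped)
```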
Also, have the pure-Lua version throw our more informative errors on
error even when falling back to string.find and the like, and fix some
other weird edge cases that came up in testing.
Bug: T95958
Bug: T115686
Change-Id: Iab783d4a3e58b1514cc09729d4a71c2cb1242ee8
A string with a dot pattern is only "simple" if followed by +, - or *.
The end-of-string condition was not checked properly.
Change-Id: Ia10b9164caeabe464c76441cc82eef37a7013048
The result of the type function should be compared against the
string "table", not the global variable. This bug probably went
undetected until now, as "table" is also the global variable for the Lua
table library.
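The difference between the two comparisons, as a small sketch (variable names are illustrative):

```lua
local value = {}

-- Buggy form: compares the string returned by type() against the global
-- "table" library, which is itself a table. Never raises an error, and
-- is never true.
local buggy = (type(value) == table)

-- Fixed form: compares against the string "table".
local fixed = (type(value) == "table")

print(buggy, fixed)
```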
Change-Id: Ia28fa10388bfc587d95b522bfa8f3524b4a3ee5f
Add mw_loadData=true to metatables set by mw.loadData, so that modules can
distinguish them from other tables.
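A sketch of how a module might test for the marker. wrapData is a hypothetical stand-in for the read-only wrapper that mw.loadData builds, and isLoadedData is not part of any library; only the mw_loadData field name comes from this change.

```lua
-- Hypothetical stand-in for the wrapper created by mw.loadData.
local function wrapData(data)
    return setmetatable({}, {
        mw_loadData = true,
        __index = data,
        __newindex = function()
            error("table from mw.loadData is read-only", 2)
        end,
    })
end

-- Check the marker on the metatable.
local function isLoadedData(t)
    if type(t) ~= "table" then
        return false
    end
    local mt = getmetatable(t)
    return type(mt) == "table" and mt.mw_loadData == true
end

local data = wrapData({ answer = 42 })
print(isLoadedData(data), isLoadedData({}))
```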
Change-Id: I0795d738891c85600af2621908376474ae21b3fe
Replace numeric loops with iteration, don't unnecessarily check for nil
before table.insert (since it's a no-op in that case anyway), and similar
restructuring.
Change-Id: I155839a648f242a1b1de35f4081d8bcfa34f6933
This makes the interwiki map available to Lua modules. The code is
based on the API interwiki map code in core (the appendInterwikiMap
method of includes/api/ApiQuerySiteInfo.php). Everything that the
API includes is added, apart from iw_api and iw_wikiid, which I
couldn't think of a use for from Lua modules.
Accessing the interwiki map would be useful for modules like
enwiki's Module:InterwikiTable,[1] as it would stop module writers
having to duplicate the data.
[1] https://en.wikipedia.org/wiki/Module:InterwikiTable
Change-Id: Ie8ad2582aaf5e422824f7da51714a347bb4041d1
When os.date("*t") or ("!*t") is called, instead of just setting the TTL
to 1 second, create a metatable that sets TTLs as the values are looked
at.
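Roughly, the approach looks like the sketch below. The per-field lifetimes, the reduceTTL bookkeeping and the name trackedDate are all assumptions made for illustration; the values Scribunto actually reports to the parser may be computed differently.

```lua
local ttl = nil  -- smallest TTL requested so far, in seconds

local function reduceTTL(seconds)
    if ttl == nil or seconds < ttl then
        ttl = seconds
    end
end

-- Assumed lifetimes per field; anything else is treated as valid for a day.
local fieldTTL = { sec = 1, min = 60, hour = 3600 }

-- Return a proxy for os.date("*t") or os.date("!*t") that lowers the
-- TTL only when a field is actually read.
local function trackedDate(format)
    local real = os.date(format)
    return setmetatable({}, {
        __index = function(_, key)
            reduceTTL(fieldTTL[key] or 86400)
            return real[key]
        end,
    })
end

local d = trackedDate("!*t")
print(d.year, ttl)
```

A module that only reads the year no longer forces a one-second TTL on the page.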
Change-Id: Id1e2df731f182f21cf19708738f9907fa927185c
Currently, mw.title.new always results in a database query, which holds up
the parse until it finishes. This changes it to not require a database
query if it's not actually necessary.
Bug: T68328
Change-Id: I62f347d4cd9176bd0440215dcbe804c1dc3d4c99
Add more information to error messages in mw.html. This includes the
error level, the function name, and the position of the argument in the
argument list. Where possible, use the functions in libraryUtil.lua to
do this.
Some functions in mw.html accept multiple types, so add a checkTypeMulti
function to libraryUtil.lua to make these kinds of functions easy to check.
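A sketch of what such a helper could look like. The signature and the message format (modeled on Lua's own "bad argument" errors) are assumptions, not necessarily what was merged, and the wikitext caller is hypothetical.

```lua
-- Raise an error unless type(arg) is one of the names in expectTypes.
local function checkTypeMulti(name, argIdx, arg, expectTypes)
    local argType = type(arg)
    for _, expectType in ipairs(expectTypes) do
        if argType == expectType then
            return
        end
    end
    -- Build a list like "string, number or table".
    local n = #expectTypes
    local typeList
    if n > 1 then
        typeList = table.concat(expectTypes, ", ", 1, n - 1)
            .. " or " .. expectTypes[n]
    else
        typeList = expectTypes[1]
    end
    -- Level 3 blames the caller of the function being checked.
    error(string.format("bad argument #%d to '%s' (%s expected, got %s)",
        argIdx, name, typeList, argType), 3)
end

-- Hypothetical function that accepts either strings or numbers.
local function wikitext(value)
    checkTypeMulti("wikitext", 1, value, { "string", "number" })
    return tostring(value)
end

print(wikitext(5))
print(pcall(wikitext, {}))
```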
And while we're at it, add test cases for libraryUtil.lua as well.
Change-Id: If9cf9a52bd4b1bb42cc7f9f1f1096828710cbc52
Just like the other methods, we shouldn't be allowing passing of things
that aren't numbers or strings here.
For that matter, we should just abstract out the whole "arg key and
value validation" into a separate function instead of repeating it in
four places.
Bug: T76609
Change-Id: Id7e512a988ef9b7a5c5a110c8992dd5d649dcbf9
mw.text.unstrip is too broad: it's allowing for unstripping things that
cause problems when unstripped (e.g. bug 61268). Since the original
request was only for unstripping <nowiki>, let's add a function that
does only that.
We should also add an interface to StripState::killMarkers(), instead of
requiring everyone to roll their own work-alike.
Then, to fix the bug, we can make mw.text.unstrip be the combination of
the two. This is the most like the original behavior of mw.text.unstrip
(removes all strip markers, replacing them with text where applicable)
without causing issues.
Bug: 61268
Change-Id: I3a151fd678b365d629b71b4f1cb0d5d284b98555
Scribunto currently supports libraries with PHP callbacks that are
loaded on startup, and pure-Lua libraries that may be loaded from the
module with require().
This change allows for libraries with PHP callbacks to also be loaded
with require().
Change-Id: Ibdc1f4ef51b1c8644c3d4c98d57755b5c06447a5
It's not necessary, it makes the output bigger, and some pages have enough
elements with CSS that it does make an actual difference.
Change-Id: I80d471899c7e04a8a4876c205198a8c0d0b1f281
In Ia4d58f44, the code enabling __pairs to work no longer ran inside
MWServer.lua, so it hasn't worked right for serialization since then. This
restores the correct behavior.
Change-Id: Iea31ab363957f5f69838d6715527cf822c15fa94
Add a way to fetch cascading protection information from Lua without
needing to call the CASCADINGSOURCES parser function.
Change-Id: I1b3ac18af11d3066f78d27b31da8d6709a6a2631
The pure-Lua ustring pattern matching functions short-circuit to the
much faster string library when the pattern would match the same against
the raw bytes.
A pattern like "[^a-z]" can match a partial UTF-8 character when applied
bytewise, and so must be detected as unsafe.
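For illustration, in stock Lua with U+00E9 as sample input:

```lua
local s = "\195\169"  -- "é" (U+00E9), two bytes in UTF-8

-- Applied bytewise, the negated class matches the lead byte on its own,
-- which is only half of the character.
local match = string.match(s, "[^a-z]")

print(#match, #s)
```

A Unicode-aware match would return the whole two-byte character, so the short-circuit would change the result.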
Let's also directly test the pure-Lua module, instead of me having to
comment out lines in Scribunto_LuaUstringLibrary::register() whenever I
want to test them.
Change-Id: I91ed3374aadfea379b9db2e13b4248ab20df509e
Simplify the logic in mw.text.listToText so that we don't need to add or
remove anything from the original table we were passed.
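One way to do this without touching the input is to concatenate a range of the list. A sketch under the assumption that separator and conjunction are always passed explicitly (the real function has defaults that aren't reproduced here):

```lua
-- Join a list as "a, b and c" without inserting into or removing from it.
local function listToText(list, separator, conjunction)
    local n = #list
    if n == 0 then
        return ""
    elseif n == 1 then
        return tostring(list[1])
    end
    -- Join everything except the last item, then append the conjunction
    -- and the last item.
    return table.concat(list, separator, 1, n - 1) .. conjunction .. list[n]
end

local items = { "a", "b", "c" }
print(listToText(items, ", ", " and "))
```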
Change-Id: I3efcbba1b9adc9a9e32e366e355cb742376cd91b
The pattern used by cssEncode is unnecessarily complicated. Simplify it by
using a negating pattern.
Change-Id: I5dc7169efea63473e9e23a1450d2941e434a00d8
Add an mw.dumpObject() method, which converts an object in the same manner
as mw.logObject(), but returns it instead of adding it to the log buffer.
Change-Id: Ie9fbd24d9d8d13ee2ddf8052679010892f61e1e0