Backporting this so the LTS release has forwards compatibility with
Wikipedia templates.
mw.loadData() allows for optimizing the loading of Lua tables by requiring
only one parse and lookup. However, it's often easier for people to
write/maintain bulk data in JSON rather than Lua tables.
mw.loadJsonData() has roughly the same characteristics as mw.loadData()
and it can be used on JSON content model pages in any namespace.
As noted on the linked bug report, it's already possible to implement
this by writing a wrapper Lua module that loads and parses the JSON
content. But that requires a dummy module for each JSON page, which is
just annoying and inconvenient.
Test cases are copied from the mw.loadData() ones, with a few omissions
for syntax not supported in JSON (e.g. NaN, infinity, etc.).
Bug: T217500
Change-Id: I1b35ad27a37b94064707bb8c9b7108c7078ed4d1
(cherry picked from commit 1000d322e5)
This is being backported because many users copy Lua modules from
Wikipedia, and thus benefit from forwards-compatibility.
For the most part, it is a good idea to avoid global variables and use
`local` variables instead. Quoting from the ScopeTutorial[1], "The
general rule is to always use local variables, unless it's necessary for
every part of your program to be able to access the variable (which is
very rare)."
Wikimedia module authors have written "Module:No globals", which errors
on the use of any global variable. On the English Wikipedia, this is
used on 32% of pages (18 million). Wikidata[2] indicates that it's been
copied to 334 other wikis.
Lua itself distributes an extra named "strict.lua"[3], which is what
this is based on. Similar to bit32.lua, this is a pure-Lua library
that can be imported/enabled with `require( "strict" )` at the top of a
module.
The two changes I made from Lua's strict are to exempt the `arg` key,
which is used internally by Scribunto, and to remove `what()`, since we
don't enable access to `debug.getinfo()` for security reasons.
[1] https://lua-users.org/wiki/ScopeTutorial
[2] https://www.wikidata.org/wiki/Q16748603
[3] http://www.lua.org/extras/5.1/strict.lua
(Cherry-picked from 829c53ef05)
Bug: T209310
Change-Id: I46ee6f630ac6b26c68c31becd1f3b9d961bcab29
This reverts commit 62e1fb0b5f.
Reason for revert: caused several errors:
* unnamespaced HooksTest collides with core’s class of the same name
* Scribunto_LuaError renamed without class alias despite being used in Wikibase
Bug: T314464
Change-Id: I8b151327236bf86945e59823fba155497e4b3fc6
This reverts commit 602cef87e0.
Reason for revert: Production errors in 1.38.0-wmf.16
Bug: T298659
Change-Id: Ic6c0e31c8247f7d89824d20f28fb0aa56d6ed749
On the Wikimedia cluster, 1.6% of MediaWiki wall-clock time is burnt on
calls from Lua into Scribunto_LuaSandboxCallback::frameExists()[1]. We
can optimize away many of these calls by not calling into PHP to check
if 'empty' or 'current' exist: the engine always reports that the
'empty' frame exists, and 'current' is guaranteed to have been set up
(in LuaEngine::setupCurrentFrames) prior to calling into Lua.
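The short-circuit described above can be sketched as follows. This is a minimal model in Python, not the actual Lua or PHP code: `make_frame_exists`, `call_into_host`, and `fake_host_check` are hypothetical names, and the only behavior taken from this message is that 'empty' and 'current' are answered without the expensive call.

```python
from typing import Callable, List

def make_frame_exists(call_into_host: Callable[[str], bool]) -> Callable[[str], bool]:
    """Build a frame-existence check that avoids the host call for known frames."""
    # Per the commit message: 'empty' always exists, and 'current' is set up
    # before Lua code runs, so neither needs a round trip.
    always_present = {"empty", "current"}

    def frame_exists(frame_id: str) -> bool:
        if frame_id in always_present:
            return True
        # Any other frame ID still goes through the (expensive) host callback.
        return call_into_host(frame_id)

    return frame_exists

# Hypothetical stand-in for the PHP callback; it records each call it receives.
calls: List[str] = []

def fake_host_check(frame_id: str) -> bool:
    calls.append(frame_id)
    return frame_id == "parent"

frame_exists = make_frame_exists(fake_host_check)
```

With this shape, the cost of the callback is only paid for frame IDs whose existence is not already guaranteed.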
To help validate this, I added debug logging to the current production
branch of Scribunto[2] to see if there are any cases where
Scribunto_LuaSandboxCallback::frameExists('current') is false. As I
write this commit message, the logging code has been active for 24 hours and
there have not been any occurrences.
[1]: https://performance.wikimedia.org/arclamp/svgs/daily/2021-03-16.excimer-wall.all.reversed.svgz
[2]: https://gerrit.wikimedia.org/r/c/mediawiki/extensions/Scribunto/+/672836
Change-Id: I1902b711c9a442a5a42745a582a6a9ff988a355f
Additional changes:
* Removed phan-taint-check-plugin from extra, now inherited from mediawiki-phan-config.
Change-Id: I83fff3a5ff566790bc051d7bfffe7f3b124d3de7
This triggers a needed reparse when a new page is created using a module
that accesses the page ID.
Bug: T237746
Change-Id: I5564c2e896dd2a025c5a886ca478c377fac83e74
This is getting close to the point of "don't do that, just wrap the
built-in". But since it's a regression in a recent patch, let's restore
the old behavior here.
Bug: T236092
Change-Id: Ieddc23d942bc91fd0246ae14d8a4af7719e3834f
When an #invoke is passed as an argument to another #invoke,
mw.getCurrentFrame() at module scope will return the wrong frame.
On the PHP side, we need to always reset the frame when processing
an #invoke, not just when there's no frame already. I don't remember why
I82dde43e wasn't done that way, but changing it doesn't make any tests
fail and Scribunto tends to have good tests.
On the Lua side, we need to do the same. The logic with mw.getCurrentFrame()
using a global that gets stored, modified, and reset in several places
was getting confusing, so this patch reworks the logic to inject a
globalless mw.getCurrentFrame() into each #invoke's cloned environment
instead.
Bug: T234368
Change-Id: I8cb5bc4dc14c9b448c9f267e0539daa75e72af4c
Ideally we'd just have composer.json require UtfNormal so we'd know
where it is and have an autoloader to load it for us, but that seems to
not be done in the world of MediaWiki extensions.
Previously we had been taking paths to the two data files from UtfNormal
and loading them into a stub class, but phan has started complaining
about the definition of the stub class colliding with the real UtfNormal.
So let's try loading the real UtfNormal\Validator and its data files.
Hopefully this continues to not try to pull in any other files via the
nonexistent autoloader.
Change-Id: I93baf20f0eef1892685e272793b4f99236e8c905
RFC 3986 allows IPv6 literals (and future IP versions) by having the
"host" enclosed in brackets, like `http://[2001:db8::]`. mw.uri should
handle these appropriately.
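As an illustration of why bracketed hosts need special handling, here is a minimal Python sketch of splitting an authority into host and port. It is not the mw.uri implementation: `split_host_port` is a hypothetical helper, it assumes any userinfo has already been removed, and keeping the brackets as part of the host is a choice made for this sketch.

```python
from typing import Optional, Tuple

def split_host_port(authority: str) -> Tuple[str, Optional[int]]:
    """Split 'host[:port]', where host may be a bracketed IP literal."""
    if authority.startswith("["):
        # Colons inside the brackets belong to the address, not to the port.
        end = authority.find("]")
        if end == -1:
            raise ValueError("unterminated IP literal")
        host = authority[: end + 1]
        rest = authority[end + 1 :]
        if rest == "":
            return host, None
        if not rest.startswith(":"):
            raise ValueError("unexpected text after IP literal")
        port_text = rest[1:]
    else:
        # Without brackets, the first colon separates host and port.
        host, sep, port_text = authority.partition(":")
        if not sep:
            return host, None
    return host, int(port_text) if port_text else None
```

A naive split on the first colon would turn `[2001:db8::]` into the host `[2001` and a nonsensical port, which is the failure mode the bracket check avoids.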
Bug: T223267
Change-Id: I6f712b87bc376cf606c6c2ebbe80176037d6dddb
As documented, string.gsub( 'foo', '%a', '%1' ) should raise an invalid
capture index error because there is no capture with index 1 in the
pattern. But in fact it treats %1 as %0 in this situation. The ustring
library should match this behavior.
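The rule above can be modeled with a small Python sketch of replacement-string expansion. This is an illustration only, not the ustring code: `expand_replacement` is a hypothetical function, and rejecting a trailing '%' is a choice made for this sketch rather than documented behavior.

```python
from typing import List

def expand_replacement(template: str, whole_match: str, captures: List[str]) -> str:
    """Expand %0..%9 in a gsub-style replacement string for one match."""
    out = []
    i = 0
    while i < len(template):
        ch = template[i]
        if ch != "%":
            out.append(ch)
            i += 1
            continue
        if i + 1 >= len(template):
            # Sketch-only choice: refuse a dangling '%'.
            raise ValueError("trailing '%' in replacement string")
        nxt = template[i + 1]
        if nxt in "0123456789":
            index = int(nxt)
            if index == 0 or (index == 1 and not captures):
                # %0 is the whole match; %1 falls back to the whole match
                # when the pattern has no captures at all.
                out.append(whole_match)
            elif index <= len(captures):
                out.append(captures[index - 1])
            else:
                raise ValueError("invalid capture index")
        else:
            # '%' followed by a non-digit yields that character, so %% is '%'.
            out.append(nxt)
        i += 2
    return "".join(out)
```

Note that only %1 gets the fallback: with no captures, %2 is still an invalid capture index.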
This patch also adds some tests for the behavior of gsub with table and
function replacements when the pattern does have captures.
Bug: T207623
Change-Id: Ie3e6c2eafa4a05989815c62c7037167642581751
It's a command-line script, so echoing is not an XSS. It can
do malicious things if given a malicious command-line argument,
but that is by design.
The last remaining phan-taint-check warning is due to a bug
in the plugin.
Bug: T202380
Change-Id: I19a07f741980a7e4d5e8458395c67523d240d221
If the replacement table or function results in a value that isn't a
string or number (or nil), string.gsub raises an error. Have ustring
raise the same error.
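A minimal Python model of that check is shown below. It is a sketch, not the ustring code: `resolve_replacement` is a hypothetical name, Python's None and False stand in for Lua's nil and false (the Lua 5.1 manual says either means "no replacement"), and numbers are formatted with "%.14g" on the assumption that the stock Lua 5.1 number format applies.

```python
def resolve_replacement(value, whole_match: str) -> str:
    """Turn the result of a table lookup or function call into replacement text."""
    if value is None or value is False:
        # nil/false: keep the original match unchanged.
        return whole_match
    if isinstance(value, str):
        return value
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        # Numbers are converted to strings, e.g. 1.0 becomes "1".
        return format(value, ".14g")
    # Anything else (a table, a function, true, ...) is rejected.
    raise TypeError(f"invalid replacement value (a {type(value).__name__})")
```

The type name in the error here is Python's, so the message text will differ from what Lua reports for the same mistake.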
Bug: T195326
Change-Id: Ic36f9f5d7adc0c14e7a4a94d3747335107acd8b6
normalization-data.lua is updated to Unicode 8.0.0 (libicu57).
charsets.lua is updated to match the character classes used by PCRE 8.35,
which seems to be Unicode 6.3.0.
upper.lua and lower.lua are still based on whatever ancient version of
Unicode is used by mb_strtoupper and mb_strtolower in HHVM 3.18.6.
Bug: T177498
Change-Id: I00b471176e1fd21123c22d187ff222928819e459