mw.text.unstrip is too broad: it allows unstripping things that cause
problems when unstripped (e.g. bug 61268). Since the original request
was only for unstripping <nowiki>, let's add a function that does only
that.
We should also add an interface to StripState::killMarkers(), instead of
requiring everyone to roll their own work-alike.
Then, to fix the bug, we can make mw.text.unstrip be the combination of
the two. This is closest to the original behavior of mw.text.unstrip
(removes all strip markers, replacing them with text where applicable)
without causing issues.
Bug: 61268
Change-Id: I3a151fd678b365d629b71b4f1cb0d5d284b98555
There are like a billion things missing in the inline documentation
of this extension. Wow. This is what I can do for now.
Change-Id: I019c24d13cf5cb22dde4d710b86ef8f976e1ec96
Scribunto currently supports libraries with PHP callbacks that are
loaded on startup, and pure-Lua libraries that may be loaded from the
module with require().
This change allows for libraries with PHP callbacks to also be loaded
with require().
Change-Id: Ibdc1f4ef51b1c8644c3d4c98d57755b5c06447a5
Use $title->exists() to see if a title exists, and use
$parser->fetchLatestRevisionOfTitle() when available, so that
TemplateSandbox works with title.exists and title.getContent().
Bug: 70495
Change-Id: I732da9daccdc35b11d726818c3a7c81f5e810a32
If LuaSandboxFunction::call returns false, it's an error on PHP's part.
Throw a "real" exception so that we can see what's causing it in server
logs.
Bug: 71045
Change-Id: I7185e186d3e0af6e467b73ea1ef13417ca96b088
Cache patterns and the regexes they become, avoid revalidating the same
pattern multiple times, and don't bother checking if something is a string
when we just made it one.
Change-Id: I1a61dd0a36eb449c8acdc8c1be68aae793f172d3
It's not necessary, it makes the output bigger, and some pages have enough
elements with CSS that it does make an actual difference.
Change-Id: I80d471899c7e04a8a4876c205198a8c0d0b1f281
When calling getContent() on the page currently being viewed, set
vary-revision on the parser output, as is done when a page transcludes
itself.
Change-Id: I908f095935067dc24dd561192b0699c602cb605f
As of Ia4d58f44, the code enabling __pairs to work no longer runs inside
MWServer.lua, so __pairs hasn't worked right for serialization since then.
This restores the correct behavior.
Change-Id: Iea31ab363957f5f69838d6715527cf822c15fa94
Add a way to fetch cascading protection information from Lua without
needing to call the CASCADINGSOURCES parser function.
Change-Id: I1b3ac18af11d3066f78d27b31da8d6709a6a2631
The pure-Lua ustring pattern matching functions short-circuit to the
much faster string library when the pattern would match the same against
the raw bytes.
A pattern like "[^a-z]" can match a partial UTF-8 character when applied
bytewise, and so must be detected as unsafe.
Let's also directly test the pure-Lua module, instead of me having to
comment out lines in Scribunto_LuaUstringLibrary::register() whenever I
want to test them.
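The partial-character hazard with a pattern like "[^a-z]" can be sketched as
follows. This is Python for illustration only, not the ustring code itself;
it compares matching on characters with matching on the raw UTF-8 bytes:

```python
import re

text = "é"                  # one character, two bytes in UTF-8 (C3 A9)
raw = text.encode("utf-8")

# Character-wise: the negated class consumes the whole character.
char_result = re.search(r"[^a-z]", text).group(0)

# Byte-wise: the same class consumes only the first byte of the character.
byte_result = re.search(rb"[^a-z]", raw).group(0)
```

Here char_result is the full character while byte_result is only its lead
byte, which is why such a pattern cannot take the byte-wise shortcut.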
Change-Id: I91ed3374aadfea379b9db2e13b4248ab20df509e
Simplify the logic in mw.text.listToText so that we don't need to add or
remove anything from the original table we were passed.
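One way to build the text without touching the caller's list, sketched in
Python for illustration. The function name and the default separator and
conjunction are assumptions made for this example, not the values the
library uses:

```python
def list_to_text(items, separator=", ", conjunction=" and "):
    """Join items as 'a, b and c' without modifying the list passed in."""
    count = len(items)
    if count == 0:
        return ""
    if count == 1:
        return str(items[0])
    # Slicing copies the leading items, so the caller's list is untouched.
    head = separator.join(str(item) for item in items[:-1])
    return head + conjunction + str(items[-1])
```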
Change-Id: I3efcbba1b9adc9a9e32e366e355cb742376cd91b
The pattern used by cssEncode is unnecessarily complicated. Simplify it by
using a negating pattern.
Change-Id: I5dc7169efea63473e9e23a1450d2941e434a00d8
Add an mw.dumpObject() method, which converts an object in the same manner
as mw.logObject(), but returns it instead of adding it to the log buffer.
Change-Id: Ie9fbd24d9d8d13ee2ddf8052679010892f61e1e0
People have been complaining that they can't find the log data anywhere.
The new parser limit report seems a good place to show this information.
Change-Id: If2abf27f7779d92ff7c7a1f32b2a54a5de521678
Clean up trailing whitespace from all of our code, and add comments
indicating that apparently unused variables are ScopedCallbacks.
Change-Id: I8e5997797cc7b1c64c5351ec112a18f30edc8fef
Two similar bugs are handled here:
* mw.getCurrentFrame() doesn't work when the module is loaded (only when
  a function is called), which breaks os.date and os.time at module
  scope since I59ad364d.
* mw.getCurrentFrame() gives access to frame args from inside
  mw.loadData, which allows for data leakage between #invokes.
Bug: 67498
Bug: 65687
Change-Id: I82dde43e2601b59c03c6ed4b9365829c40a953a5
Some functions in mw.html accept numbers as arguments, but later fail when
constructing the string. This disallows numbers in attribute names, since
they aren't valid anyway, and fixes the remainder of the cases to properly
build the string.
Bug: 67201
Change-Id: Ie7bcbb9d8df580dd8793681f78a8b0719d8a287a
Lua's string functions tend to auto-convert numbers to strings. We
should do the same in mw.ustring.
Bug: 67201
Change-Id: Icd3c5e93bac19dafd78d737ec9b315daba9f1729
Certain wikitext, such as that containing Cite.php <ref> or <references>
or the #tag versions of the same, should not be cached. This uses the
isVolatile method added to PPFrame in I95b3cf87 to avoid caching the
preprocessed output of such wikitext from frame:preprocess and similar
methods.
Bug: 46815
Change-Id: I1084f87fd863eb22f2f3f3d3ff308b24e20a08ef
When os.date, os.time, or mw.language:formatDate are called, set the
appropriate TTL on the output. This needs I412febf3 in core to function at
all, and I3f5a80aa in core to function with formatDate.
Change-Id: I59ad364d502fc247500d94c5606516ad9f98a24d
Rather than calling error() when nils get passed to mw.html methods,
either remove whatever it was that the nil would go to (if that makes
sense), or just do nothing. The seemingly inconsistent use of "not x" and
"x ~= nil" is to allow any falsey value where it wouldn't be ambiguous
(such as class names), but not where it could be (such as attribute values).
Bug: 62982
Change-Id: I76773abbb4394aa9bb8c8a08445e019cade3b2bf
If someone goes and adds aliases for namespaces that don't actually
exist (as was done in I94c34799, for example), Scribunto will run into
issues when trying to create its mw.site.namespace objects.
Let's ignore those bogus aliases so we don't go breaking everything just
because someone did something stupid.
Change-Id: I16acd97f587de320cf61becb829cc66794cbb119
When tables are passed from Lua to PHP, their metatables are lost. Because
of this, they need to be kept inside of Lua to allow the __index
metamethod to return a method to be called by #invoke.
Bug: 64141
Change-Id: I0840bc12b25dee72828ec97d2b205812e4929f2b
LuaStandalone only uses 2 functions from mw.lua, so move them to their own
file to avoid running the whole thing twice.
Change-Id: Ia4d58f44be17f7a71666dbe750e66d9d90cb5c2f
Creating and calling an anonymous function to create a scope is prone to
breakage, and only works because the last token before it is a numeric
literal. Do...end is designed for this purpose, so use it instead.
Change-Id: Ic33321086d5469bf97301b434c5a660f04120662
From wikitext, $parser->callParserFunction() will always get an array of
strings with at least an element [0]. Let's match this from Scribunto:
stringify numbers, and require that element [0] be present (although in
Lua it'll be [1]).
Also fix an old broken unit test.
Bug: 63597
Change-Id: Ie7ac34ae4bce70cec455d90c3f02a658644f6866
Use modname instead of the nonexistent name in the error message if
require() is passed the wrong type of parameter.
Change-Id: I2e96d283e34a16e4675141ce8ccddbcc045ef2a1
When displaying a nosuchfunction or nosuchmodule error, include the name
of the nonexistent function or module.
Change-Id: I17fc2c68dc8267302a82eee3cb2c5df9b5a3c46c
This commit fixes an error with using a mw.title object referring to a
mainspace page as the title argument to frame:expandTemplate(), by
adding a leading colon to prevent the function from searching in the
Template namespace.
Bug: 47601
Change-Id: I4cdc05571598bf7998f4cf0f2691bf86188c3c5d
It's possible to pass information between multiple #invokes on a page by
having the first call math.randomseed with one of a set of known seeds
and then having the second examine the output from math.random to
determine which of those known seeds was used.
Prevent that by calling math.randomseed( 1 ) when invoking (see the bug
for details on why that seed). But avoid doing so if e.g. a
frame:expandTemplate() call results in a recursive invoke.
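The channel and the fix can be sketched as below. This is Python standing in
for Lua's math.randomseed and math.random; the seed values and function
names are hypothetical:

```python
import random

# Hypothetical seeds agreed on in advance by the two cooperating modules.
KNOWN_SEEDS = {"yes": 1001, "no": 2002}


def first_invoke(message):
    """Sender: encode a message by seeding the shared generator."""
    random.seed(KNOWN_SEEDS[message])


def second_invoke():
    """Receiver: recover the message from the next random number."""
    observed = random.random()
    for message, seed in KNOWN_SEEDS.items():
        if random.Random(seed).random() == observed:
            return message
    return None


def start_invoke():
    """Mitigation: reset the generator to a fixed seed before each invoke."""
    random.seed(1)
```

Without start_invoke() the receiver recovers whatever the sender chose. With
it called before the second invoke, the receiver sees the same sequence
regardless of what ran earlier.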
Bug: 62291
Change-Id: Id01cb63eca52ced29bf4efebc38beb9f159b7b0e
Remove the 6000 character per page limit of mw.language:formatDate(). It
only exists because ParserFunctions has something like it for performance
reasons. Since Lua has a maximum execution time, there's no reason that it
needs this as well.
Change-Id: I42ae4f51295135007c6e2edc66ec36b7d96e3be3
Since the code related to titles in messages was removed from
mw.message.lua, remove it from here as well. Titles have no effect since
only the plain format is available.
Change-Id: I0c96a4e831abe61100b48cb6a898ad8dbffd8a72
Include the text of the title being complained about when returning an
invalid title error from expandTemplate.
Change-Id: I2261f9574557c3ae514c39cea71f9777f8f9f431
Various methods are throwing exceptions when passed invalid language
codes. Those need to be caught.
And we should really add unit tests for the mw.language library, too.
Doing so exposed another bug (in lang:gender), which is also fixed here.
Bug: 62242
Change-Id: Ib7d257cbb1ce179c510273526910d6ac5f3cac5d
The LuaStandalone interpreter needs to keep a mapping from integers
returned to PHP to the corresponding function. But if it never releases
these functions when PHP no longer has any reference to them, it can
result in Lua running out of memory if a module with a large number of
functions is invoked many times in one page.
The fix here is to track which function ids are referenced from PHP, and
periodically send the list to Lua so it can remove any that are no
longer used from its cache.
This also takes care of another issue where having multiple interpreter
instances and passing function objects from one into another could call
the wrong function in Lua.
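A rough sketch of the cleanup scheme, in Python for illustration. The class
plays the role of the Lua-side cache and the caller plays the role of PHP;
all names are hypothetical and the real protocol between the two processes
is not shown:

```python
class FunctionRegistry:
    """Maps integer ids handed out to the caller to the functions they name."""

    def __init__(self):
        self._next_id = 1
        self._functions = {}

    def register(self, func):
        """Store func and return the id that the other side will hold."""
        func_id = self._next_id
        self._next_id += 1
        self._functions[func_id] = func
        return func_id

    def call(self, func_id, *args):
        """Invoke the function registered under func_id."""
        return self._functions[func_id](*args)

    def cleanup(self, live_ids):
        """Drop every function whose id is not in the list of live ids."""
        live = set(live_ids)
        for func_id in list(self._functions):
            if func_id not in live:
                del self._functions[func_id]

    def __len__(self):
        return len(self._functions)
```

The caller periodically passes the ids it still references to cleanup(), so
the cache stays bounded by what is actually in use.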
Bug: 51886
Change-Id: I4f15841051f7748d1d6df24080949e5cbd88f217
Change Ie065c7b5 added an option to show profiling data at the bottom of
preview pages, and with it new hooks to gather and format this data in a
more structured way than is possible with ParserLimitReport. This change
adds support for the new hooks.
Depends-On: I7799616a602d90e1b8d3f0ece35811ca387bade7
Change-Id: Idffd2d78f9a0217c99c07cbbfc844d6daf0172f7
Message formats other than plain should never have been exposed in this
way, as they allow link tables, etc. to be bypassed and serve no useful
purpose. This removes them, and also removes title, as it serves no
purpose without them.
Bug: 60758
Change-Id: I96284ffbe986a9cd92d2bde1ffdb746029bad989
If we don't do this, then the section edit links point to the wrong page
if we expand a template that contains section headings.
Bug: 55525
Change-Id: I00bda935be3e8b9c0f86fd0f131814207fbb34a7
Include a protectionLevels variable in the output of the mw.title.*
functions, containing the contents of the title's mRestrictions
array (i.e., its protection levels).
Change-Id: I79c9fed64bacfc90aee1d411a3e1b47e44c99755
PHP can't handle having arrays/objects or functions as keys in its
arrays, so make sure we don't try to pass them from Lua. Booleans aren't
really well-handled either, so let's disallow them too.
Also, add tests for proper stringification of floats and infinities when
those are used as keys.
Note this behavior change is needed to match the change in LuaSandbox
for fixing bug 54527, but isn't itself a security issue.
Change-Id: I1e2951bbe8cb78358650ad377bf7119fcac4485d
A module for building complex HTML from Lua using a
fluent interface. The module is originally from enwiki,
but the authors allowed us to reuse it under GPLv2+
(as stated in the file).
The module will be loaded by default and comes with
unit tests.
As discussed on wikitech-l:
http://lists.wikimedia.org/pipermail/wikitech-l/2013-December/073320.html
Change-Id: I7c8d4378091c13d5ace0dd1fcbb4e27163e8c896
Apparently this is useful on Commons, where they would like to iterate
over all language names in some of their templates.
Bug: 47833
Change-Id: I6e3291bedc72da6630c485ea9bf381d8d2f5453a
This field already exists in PHP with exactly the content requested in
bug 47089, so we may as well expose it on the frame object.
Bug: 47089
Change-Id: I672820589f6ebc7c4daad29b5eb156733a5bc5cc
It's already possible to detect whether the current template is being
substituted via ParserFunctions (see [[en:Template:Ifsubst]]), and a
similar trick works with frame:preprocess. So we may as well provide the
flag directly.
Bug: 47828
Change-Id: Id06d27c6283ee589a8830b78c04e56978e0ac6da
Specifically:
* String conversion in non-URL contexts (e.g. .prefixedText) uses spaces
  instead of underscores.
* Setting .fragment now applies the same transformations that are done
  (in PHP) by mw.title.new.
Bug: 56217
Change-Id: I12e354636bcde3327864088175fb4de61aecc81a
The PHP call that makes mw.site.namespaces work case-insensitively
doesn't handle non-standard spaces/underscores. So standardize them
before the call.
Bug: 56216
Change-Id: I4758478b126858fb581614f64eb15472f42fef51
The following are now correctly escaped:
* Blank lines (including those with only tabs)
* ---- at the start of a line
Bug: 53658
Change-Id: Ib000ff4f21f76c310741de89de0e0b66f6450344
The following are now correctly escaped:
* Space at the start of a line
* Start-of-line characters after \r
* Magic links such as "RFC 123" with non-space whitespace
* URIs that don't use "://", such as "urn:foo"
* Double-underscore magic words
Bug: 53658
Change-Id: I824417e2937dd27cd1e69bd4e74ab7d21a978c75
Current logic is to display the function name if Lua provides us with
one, "main" if it's at the main level, or "?" if it's a C function or a
tail call. But we don't handle the case where it's a Lua function for
which Lua can't guess a name; use "?" for that too.
Change-Id: I938b5e5ca55cf4990dbcbb0db8dd8fc93b03bf15
The binaries currently provided were compiled against glibc 2.11+, so
people using CentOS 5 (which has glibc 2.5) are not able to use them.
The binaries in this patch were compiled in VMs installed with CentOS
5.9, and so should work for more people; at a glance, it looks like
glibc 2.3 or later will probably work now.
Bug: 51333
Change-Id: Iac1f2373bbc0bbca8783c82c09eff51ffd5e3761
People have requested a method to log a table as something more detailed
than just "table", to be able to inspect values while debugging.
Bug: 48173
Change-Id: Ia58cab834e87842927a2a13d153ee32473f74086
If the user is on a webhost that has proc_open listed in PHP's
disable_functions directive, we should give a better error message.
Until we no longer support PHP below 5.4, we should do the same for
safe_mode. And since we're doing that, we may as well report any other
warnings if proc_open fails, too.
In addition, this cleans up error handling in
Scribunto_LuaEngine::load() so it doesn't pretend the interpreter is
loaded if getInterpreter() throws an exception. Otherwise other code
winds up with PHP fatal errors trying to access a null value.
Bug: 50706
Change-Id: I2887b722e089fd7a526aa7dcab9c80deb343d8ac
If the parser function returns 'isChildObj', we need to create a child
frame to expand the wikitext returned by the parser function. And when
we pass the arguments to the new frame, we need to pass them through the
preprocessor's newPartNodeArray() first.
Bug: 50863
Change-Id: Ieb7cc7007288de1f7d2cd2458f068affe695e8af
Users seem to expect that mw.language's parseFormattedNumber will act
like tonumber when given nil or other non-string values, returning nil
instead of raising an error. There's no reason not to, so we may as
well.
Change-Id: Ie0ff19efc84ca738e115bbd524bfd92fccf26127
A few edge cases were being incorrectly handled:
* mw.ustring.sub( 'abc', 1, 0 ) returned 'a', not ''.
* mw.ustring.codepoint( 'abc', 1, 0 ) returned 97 instead of no results.
* mw.ustring.codepoint( 'abc', 4, 4 ) returned 99 instead of no results.
* mw.ustring.gcodepoint had the same issues as mw.ustring.codepoint.
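For reference, the expected results above follow the range rules of Lua's
string.sub. A minimal Python sketch of those rules, operating on code
points; lua_sub is a hypothetical helper name, not part of Scribunto:

```python
def lua_sub(s, i=1, j=-1):
    """Substring with Lua-style 1-based, inclusive, clamped indices."""
    length = len(s)
    # Negative indices count from the end of the string.
    if i < 0:
        i = length + i + 1
    if j < 0:
        j = length + j + 1
    # Clamp to the valid range.
    if i < 1:
        i = 1
    if j > length:
        j = length
    # An empty or inverted range yields the empty string.
    if i > j:
        return ""
    return s[i - 1:j]
```

With these rules a range such as (1, 0) or (4, 4) on a three-character
string is empty, which is the behavior the fixed functions now share.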
Change-Id: Ib8c0ef5a8073106eb7d90d0aa0513be4525dca08
Negative values for 'i' in mw.ustring.byteoffset are supposed to count
from the end of the string. But in LuaSandbox, it was actually counting
from two bytes before the end of the string due to a typo.
Fix that, and add some tests for it.
Bug: 50176
Change-Id: Iceee1022a55abd7a08df1ea7843e1277eb02798b
If the interpreter exits before the end of the page, then the call to
Scribunto_LuaStandaloneEngine::getLimitReport() throws an uncaught
exception when it tries to access the interpreter. Catch it.
Change-Id: I7ce4f09b1b2206f13ab0f422de35e0b69a3b24d5
The "%f[set]" frontier pattern has been in Lua 5.1 since the beginning,
but was undocumented until Lua 5.2. And the code is even unchanged from
5.1.0 to 5.2.1. So there's no reason not to implement it in ustring too.
Note the changes to UstringLibrary.php are somewhat large, because it
splits the "convert a Lua bracketed charset to PCRE" code into a
separate function and it changes the handling of mw.ustring.find's and
mw.ustring.match's 'init' parameter from "substring, match from 0, then
add back on $init" to "use preg_match's $offset and use \G instead of ^
where this matters". Both of these are necessary to properly support
%f.
This also fixes a bug in the pure-Lua code (not used in Scribunto)
exposed by the unit tests for %f, where %z was matching '\1' rather than
'\0', and %Z was matching everything except '\1' instead of everything
except '\0'.
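Why the 'init' handling matters for %f can be sketched as follows. This is
Python for illustration only: the lookaround regex is an assumed translation
of %f[%l], and the pos argument of Python's search stands in for
preg_match's $offset:

```python
import re

# Assumed translation of the frontier pattern %f[%l]: a position where the
# previous character is not in the set and the next character is.
frontier = re.compile(r"(?<![a-z])(?=[a-z])")

subject = "abc"
init = 1  # 0-based offset of "b"

# Old approach: cut the string at init, match from 0, then add init back on.
# The "a" before the cut is invisible, so a frontier is wrongly found at "b".
sliced = frontier.search(subject[init:])
sliced_offset = None if sliced is None else sliced.start() + init

# New approach: keep the whole string and start matching at the offset, so
# the lookbehind still sees the "a" and no frontier is reported.
offset_match = frontier.search(subject, init)
```

The substring approach reports a match at offset 1, while matching at an
offset in the full string correctly finds none.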
Bug: 48331
Change-Id: Ie0b95ef5b734db53d6adc9de5dae4874f8944c08
The following errors are fixed:
* PHP warning and wrong return value with an empty pattern and plain set
* Incorrect offsets returned when init is larger than the string length
* Incorrect captured offsets returned when init is excessively negative
Bug: 47365
Change-Id: I9741418287dc727747326d6a19678370ce155a2b
Two related issues:
* The package module was inheriting the loaders from the outer sandbox,
  so loaded modules were being loaded into the outer sandbox's
  environment.
* mw.loadData was using the outer sandbox's require(), so again loaded
  modules were being loaded into the outer sandbox's environment.
Bug: 47300
Change-Id: I48d8dd4784c9a890e3abb6389f96f38e1420dbbb
The documentation, and the expectation of users, is that
lang:parseFormattedNumber() should actually return a number, not a
string.
Bug: 47268
Change-Id: Ieabddd0d9192f1fd8ef7e890d5d6268be9636f38