275 commits

Author SHA1 Message Date
jenkins-bot c3d13130b7 Merge "Speed up PHP mw.ustring.gcodepoint" 2017-03-09 23:37:54 +00:00
Brad Jorsch 7f94d88733 LuaStandalone: Fix signal handling
I252ec046 noticeably broke things by adding a dependency on the pcntl
functions, which tend not to be present under Apache.

It also subtly broke exit handling by using proc_close()'s return value,
which PHP mangles in such a way that we can't tell the difference
between an actual XCPU kill and exit( SIGXCPU ). This one wasn't noticed
because the pcntl functions interpret everything proc_close() is going
to return as a signal kill and we didn't test the 'exited' code path.

I'm not sure what was going on in I57cdf8aa since it provides no details
about what it was trying to fix, but that would have broken signal
handling in the other way: Ibf5f4656 worked because proc_open() on Linux
executes the command by passing it to /bin/sh -c, and that shell is
going to turn any signal that kills Lua (e.g. the SIGXCPU) into an exit
status of 128+signum.

To avoid proc_close()'s broken return value while also avoiding the
race, we can loop on proc_get_status() until $status['running'] is
false.

To have signals that kill Lua actually be interpreted as signals, we
have two options: add an "exec" in front of the command so proc_open()'s
/bin/sh -c is execed away, or detect shell-style signal reporting and
convert it. We may as well do both.

Bug: T128048
Change-Id: I8a62e1660fe1694e9ba5de77d01960c1ab4580aa
2017-03-09 23:16:28 +00:00
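
The "detect shell-style signal reporting and convert it" step in the commit above can be sketched as follows. This is a minimal Python illustration, not Scribunto's PHP code: the function name and the constant are hypothetical, and it assumes the usual POSIX shell convention of reporting a signal kill as exit status 128+signum.

```python
# Assumption: SIGXCPU is 24 on common Linux platforms; the number differs
# on some other systems, so treat this constant as illustrative only.
SIGXCPU_LINUX = 24


def interpret_exit_code(code):
    """Classify an exit code reported through a shell.

    Hypothetical helper: assumes the shell reports "killed by signal N"
    as exit status 128 + N, as described in the commit message.
    """
    if code > 128:
        return ("signal", code - 128)
    return ("exit", code)


# A shell-reported SIGXCPU kill versus an ordinary failing exit.
print(interpret_exit_code(128 + SIGXCPU_LINUX))
print(interpret_exit_code(1))
```

A process that itself exits with a status above 128 would be misread as a signal kill by this conversion, which is presumably one reason the commit also adds "exec" so that the shell is not in the way.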
Brad Jorsch 5e28f67e88 Speed up PHP mw.ustring.gcodepoint
It seems to be over 200 times faster to iterate over the array instead
of shifting off the front.

Change-Id: Id29a4739ae2bd5dac4197e110ea73f74794e6d9f
2017-03-06 12:53:25 -05:00
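
The two approaches compared in the commit above can be sketched in Python as an analogy (the actual change is in PHP, and both function names here are hypothetical). Removing the first element of a Python list moves every remaining element, much as shifting off the front of a PHP array reindexes it, whereas plain iteration touches each element once.

```python
def codepoints_by_shifting(values):
    """Consume a list by repeatedly removing its first element."""
    remaining = list(values)
    while remaining:
        # pop(0) moves every remaining element one slot to the left.
        yield remaining.pop(0)


def codepoints_by_iterating(values):
    """Walk the list in place without modifying it."""
    for value in values:
        yield value


points = [ord(ch) for ch in "héllo"]
# Both generators produce the same sequence; only the cost differs.
print(list(codepoints_by_shifting(points)) == list(codepoints_by_iterating(points)))
```

The speedup in Python will not match the figure quoted for PHP; the sketch only shows why the shifting version does more work per element.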
WMDE-Fisch 7e4997758e Replace deprecated suppress warning methods
Change-Id: If633b8007890e0bfd790b506feaf72c9fd271708
2017-02-15 14:52:38 +01:00
Brad Jorsch fe094e7bae Update ustring data tables
normalization-data.lua is updated to Unicode 6.3.0.

upper.lua and lower.lua are updated to match HHVM 3.12.1's mb_strtoupper
and mb_strtolower. I don't know what version of Unicode that might be,
but it seems old.

Bug: T86096
Change-Id: I1a0c8be2756f86db5f36dd67319a1f79aea98b3e
2017-01-21 03:26:27 +00:00
jenkins-bot ae677fbc0d Merge "Ustring: Let gcodepoint work with moderately long strings" 2016-12-16 00:42:02 +00:00
Brad Jorsch db07787390 Cleanup backwards-compatibility code
https://www.mediawiki.org/wiki/Extension:Scribunto says that master
requires 1.25+, so let's remove checks for stuff that was added before
that.

* PPFrame::getTTL() was in 1.24.
* PPFrame::setTTL() was in 1.24.
* PPFrame::isVolatile() was in 1.24.
* Parser::fetchCurrentRevisionOfTitle() was in 1.24.
* ObjectCache::getLocalServerInstance() was added in 1.27, so restore the call to ObjectCache::newAccelerator() as BC.

This also removes BC with the php-luasandbox extension older than 1.6, which
was released before MediaWiki 1.22.

Bug: T148012
Change-Id: I36e37f3b65d0f167e1d28b00e0842d9721feee31
2016-10-13 11:07:44 -04:00
Aaron Schulz 3660ec17ba Clean up ObjectCache calls
Change-Id: I95b2d4d0f94a2e7f42372615ea9c612845502b30
2016-10-11 14:06:38 -07:00
Brad Jorsch 629f11d0dd Fix pure-Lua ustring and empty patterns
An empty pattern isn't "safe" since it could match in between the
bytes of a UTF-8 character.

Also, it turns out there's a bug in PHP <5.6.9 preg_replace() that we
need to work around too.

Change-Id: I282e5909e4663461d60c5386693db182de2fd44c
2016-10-05 14:32:27 -04:00
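
Why an empty pattern is not "safe" for byte-oriented matching can be seen in a small Python analogy (Python's re module stands in for the byte-level Lua string functions; the helper name is hypothetical). An empty pattern matches at every position, including the one between the two bytes of a UTF-8 encoded character.

```python
import re

encoded = "é".encode("utf-8")  # two bytes: 0xC3 0xA9

# Byte-oriented matching: the empty pattern also matches between the bytes.
byte_result = re.sub(b"", b"-", encoded)

# Character-oriented matching: the empty pattern matches only around the character.
char_result = re.sub("", "-", "é")


def is_valid_utf8(data):
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(byte_result, is_valid_utf8(byte_result))
print(char_result)
```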
jenkins-bot c48bda0698 Merge "Add handling for PCRE errors in ustringGsub" 2016-10-05 18:15:10 +00:00
Marius Hoch 0f4db74148 Add mw.hash to Scribunto
Provides a simple wrapper for PHP's hash() and
hash_algos() functions.

I will add docs to the Lua reference manual once
this is merged.

Bug: T142585
Change-Id: I6697463974a175e99f9b77428a1085247165ebc9
2016-08-18 04:39:04 +02:00
Brad Jorsch ba19a82c06 Add handling for PCRE errors in ustringGsub
Bug: T130823
Change-Id: I6fab71c82ddab92daf6b369cb9857d9892f2d246
2016-07-15 15:43:58 -04:00
Brad Jorsch d643f40de9 Ustring: Let gcodepoint work with moderately long strings
For the PHP implementation, return the codepoints as a table instead of
multiple return values that get table-ified in Lua, to avoid hitting
too-many-values stack limits.

For the pure-Lua version, inline most of ustring.codepoint instead of
calling it to avoid what's effectively "{ unpack( stuff ) }".

Bug: T118687
Change-Id: I105f388cc23ab55d4124739700ef89d5354b7dbc
2016-07-15 19:35:58 +00:00
Kunal Mehta 9275cc14fb Expose ParserOutput::addWarning() to modules
Bug: T137900
Change-Id: Ibdd2506f4ab27f531ae49187bc57ba0d5c56b7cc
2016-06-16 15:48:53 -07:00
Jackmcbarn f4501ccd22 Only use mw.ustring when necessary
mw.ustring is really really slow. I've discovered that in a lot of modules
on enwiki, upwards of 2/3 of the total runtime gets used when mw.html
calls mw.ustring.gsub. This change checks whether any Unicode characters
are present, and if not, calls string.gsub instead.

Change-Id: Ia50061584be3901ae7428354c449236225c318db
2016-05-30 18:38:32 +00:00
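
The dispatch described in the commit above (check for Unicode characters, otherwise take the cheaper byte-oriented path) can be sketched in Python. This is only an analogy: the function name, the returned path label, and the use of Python's re module are all assumptions for illustration, and Python itself gains nothing from this split.

```python
import re


def gsub(text, pattern, replacement):
    """Replace matches, taking a byte-oriented path for pure-ASCII input.

    Returns (result, path_used); the path label exists only so the
    dispatch is visible in this illustration.
    """
    if all(part.isascii() for part in (text, pattern, replacement)):
        # No multi-byte characters anywhere, so byte-wise matching is safe.
        data = re.sub(
            pattern.encode("ascii"),
            replacement.encode("ascii"),
            text.encode("ascii"),
        )
        return data.decode("ascii"), "bytes"
    # Fall back to the Unicode-aware implementation.
    return re.sub(pattern, replacement, text), "unicode"


print(gsub("a&b", "&", "&amp;"))
print(gsub("é&b", "&", "+"))
```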
Brad Jorsch c9de00aeff SECURITY: Don't escape strip markers when escaping attributes in mw.html
Core strip markers were changed in T110143 to include characters that
are normally encoded in attributes. However, we want to pass them through
here so they can be unstripped correctly in the output wikitext.

This fix makes the "Strip markers in CSS" parser test pass again.

Bug: T110143
Bug: T135961
Change-Id: I1353931a53c668d8a453dfa2300a99f59fdb01c5
2016-05-22 21:40:32 -04:00
Brad Jorsch aa4d72e3ff Fix uncontroversial phpcs errors
The following continue to be ignored:
* Generic.Arrays.DisallowLongArraySyntax.Found, because I'm not sure
  Scribunto is ready to abandon old version support in master.
* MediaWiki.ControlStructures.AssignmentInControlStructures.AssignmentInControlStructures,
  because it's overly strict for its purpose.

Squiz.Classes.ValidClassName.NotCamelCaps isn't ignored globally; we
just ignore it explicitly every place it's needed.

Change-Id: I307668da6ef7b3e23da19b1fd1e08914239b99b3
2016-05-18 16:31:28 -04:00
jenkins-bot c753698eaa Merge "Provide a standard way to get the target of a redirect page" 2016-05-12 19:32:17 +00:00
Brad Jorsch 507827aaf5 Avoid fataling Special:Version if LuaSandbox is enabled without the PHP extension
Such a configuration is completely broken, but it's easy enough to
detect and avoid here.

Bug: T131910
Change-Id: I0bf108ec191a59f5506c0cdab00f3e5e68158ed5
2016-04-06 11:20:20 -04:00
Brad Jorsch b3da8a698d Add toNFKC and toNFKD to mw.ustring
This also makes some updates to make-normalization-table.php to handle
the move of UtfNormal to a separate library.

Bug: T126427
Change-Id: Id4985c3ca441cf92f08ba1f1af85c762ba43d7d2
2016-04-02 15:22:42 +00:00
Jackmcbarn b82ed4aa7d Restrict cached results to their original frame
When caching results from frame:preprocess and frame:expandTemplate,
restrict the scope of the cache to the frame object that was used. This
allows the integrity of the empty-frame expansion cache to be maintained
while also allowing parent frame access. This change is the equivalent of
I621e9075 in core.

Change-Id: Iae4c00e7e19ba12cfdaac135be16c991d9d0cea1
2016-03-09 11:27:23 -05:00
Ricordisamoa 1573bee81a Provide a standard way to get the target of a redirect page
The new Scribunto_LuaTitleLibrary::redirectTarget() method is
used by mw.title objects as read-only attribute 'redirectTarget'.

If the page does not exist or it is not a redirect, the value
of the attribute is `false`; otherwise, it is the target of the
redirect page, as mw.title object.

This is a proper alternative to parsing wikitext as it is done in:
https://en.wikipedia.org/wiki/Module:Redirect

Bug: T68974
Change-Id: Id4d9b0f8c1cd09ebc42c031d4d3fc0c33eea44aa
2016-03-01 14:30:22 +01:00
Brad Jorsch 31dd4d535f Pass language to SpecialVersion::getVersion()
The language used should be $parser->getTargetLanguage(), not the user
language.

Soft-depends on Id14733aaef3e52a2e315bffe74baeb926d46e238.

Bug: T127233
Change-Id: I712e048367d9d65fd223cb085dbf9e5fceca286c
2016-02-24 00:11:17 +00:00
Darian Anthony Patrick 00ed2a567b Update lua binaries to patch CVE-2014-5461
These binaries were compiled from a manually patched lua-5.1.5 source tree.

Linux binaries were built by Anomie. Mac OS X and Windows by dpatrick.

Bug: T72541
Change-Id: I6af0f042c491785cce26afc186a148c83c4f3414
2016-02-22 09:38:22 -08:00
Jackmcbarn dc9446b84d Remove loadedLibraries
Nothing actually uses this, so I'm not sure why we ever kept track of it.

Change-Id: I60480b96a83731c7b25aed55099886a86efc08b1
2016-01-19 02:25:25 +00:00
Brad Jorsch 29266a9a0f Use correct variable in ustring.lua
Change-Id: Ic576b8c31c487c106593050538f9f2cc5b722b62
2016-01-02 10:49:48 -05:00
jenkins-bot b8830a3e57 Merge "ustring: Handle "empty" charset like Lua does (part 2)" 2015-10-30 16:34:54 +00:00
Ori Livneh b5df651e1e Scribunto_LuaSandboxEngine::getResourceUsage(): call load()
This is required for ensuring $this->interpreter is available. See
::getLimitReportData(), which does the same thing.

Change-Id: I275b093dd7d5f4873ec4b912823322e6e533cae1
2015-10-29 16:52:21 -07:00
Ori Livneh 7e63874c5c Move getResourceUsage to Scribunto_LuaSandboxEngine
Fix-up for I6a4ed03c126.

Change-Id: I69e9218c6a3da6ca2a6f13e5911fee1c78a8f4a0
2015-10-29 16:29:00 -07:00
Ori Livneh 930421d242 Add ScribuntoEngineBase::getResourceUsage()
Introduce a method, ScribuntoEngineBase::getResourceUsage(), which may be
overridden by script engine implementations to provide CPU and memory usage
data.

Change-Id: I6a4ed03c1261f43a7ce7de6f274c32c450e66abb
2015-10-29 03:59:07 +00:00
Brad Jorsch cd618c7a92 ustring: Handle "empty" charset like Lua does (part 2)
Lua actually treats a close-bracket at the start of a bracketed
character class as a literal, rather than using it to close the
character class. Probably unintended behavior, but it happens.

Also, have the pure-lua version throw our more informative errors on
error even when falling back to string.find and the like, and fix some
other weird edge cases that came up in testing.

Bug: T95958
Bug: T115686
Change-Id: Iab783d4a3e58b1514cc09729d4a71c2cb1242ee8
2015-10-16 09:26:55 -04:00
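
The bracket behavior described in the commit above has a close parallel in Python's re module, which is used here purely as an analogy (it is not Lua's pattern engine, and the helper name is hypothetical). A close-bracket placed first inside a bracketed class is taken literally, and a class with nothing in it fails to compile.

```python
import re

# A ']' placed first inside a bracketed class is a literal, not a terminator.
matches = re.findall(r"[]a]", "a]b")


def compiles(pattern):
    """Return True if the pattern is accepted by the regex compiler."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


print(matches)
print(compiles("[]"), compiles("[^]"))
```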
Jan Berkel fb20934b16 Fix a problem with simple pattern detection
A string with a dot pattern is only "simple" if
followed by +, - or *. The end of string condition was not checked
properly.

Change-Id: Ia10b9164caeabe464c76441cc82eef37a7013048
2015-10-07 10:27:45 -04:00
Jan Berkel 7c5454b36c Fix off-by-one error in gsub
Change-Id: I49c0386970e007271d23087fd112580af7b21c9c
2015-09-23 17:41:15 +01:00
Ori Livneh eec31286bc Fix-up for I32bad5fd9
Don't return nonexistent variable $content, and don't bypass loadString / callFunction.

Change-Id: Iae493606d0167853c3c79536e35eeb23a54bb6d1
2015-08-25 17:36:26 -07:00
Ori Livneh 7bd4959b55 Cache Lua code files in APC
Cache Lua libraries in APC (if available) for up to 5 minutes. Always check the
file's mtime to avoid serving a stale copy.

This code path is hot enough that using APC makes a difference.

Change-Id: I32bad5fd9443c1759fe6dc91f8df2ac2f120d75b
2015-08-25 16:28:36 -07:00
Jackmcbarn 828c6cf513 Prevent leaking title fragments across invokes
Bug: T106951
Change-Id: Iace5d75deac3d8ffde6f3dec6a4f910dcb77d1e2
2015-07-27 10:46:23 -04:00
Jackmcbarn bd5e46b941 Check content model instead of title
Make Scribunto compatible with storing content model in the database, by
checking for it directly instead of guessing it based on the title.

Change-Id: I94ae07bc47273fbf65d64b2909e5895c1c3fd7e9
2015-07-19 22:16:21 -04:00
Mr. Stradivarius d59d852290 Fix accidental global in mw.uri.parseQueryString
The result of the type function should be compared against the
string "table", not the global variable. This bug probably went
undetected until now, as "table" is also the global variable for the Lua
table library.

Change-Id: Ia28fa10388bfc587d95b522bfa8f3524b4a3ee5f
2015-07-15 23:07:37 +09:00
jenkins-bot 7cf15f43e5 Merge "Display backtraces in the Scribunto console" 2015-07-01 17:01:36 +00:00
Jackmcbarn 52d4915201 Display backtraces in the Scribunto console
When the Scribunto console produces an error, display a full backtrace
instead of just the error message.

Bug: T74462
Change-Id: I305438284eae8e19a51a70b1e83d54e4831de396
2015-07-01 12:21:24 -04:00
jenkins-bot c582834a09 Merge "Mark metatables from mw.loadData" 2015-06-30 20:31:53 +00:00
Jackmcbarn ca7a84b5b2 Fix some PHPCS issues
Change-Id: I5a44d07553d45bc01db070c99856b35a3d275bd1
2015-06-30 13:14:58 -04:00
Jackmcbarn a4cb7efd0d Mark metatables from mw.loadData
Add mw_loadData=true to metatables set by mw.loadData, so that modules can
distinguish them from other tables.

Change-Id: I0795d738891c85600af2621908376474ae21b3fe
2015-06-27 22:38:23 -04:00
Ori Livneh d426627c9b lint: 'if(' => 'if ('
Change-Id: I056ff6bbc5f992bddfd7e3bd82803de107651b80
2015-06-20 21:38:56 -07:00
Brad Jorsch 58d722bcdf Allow nil in mw.text.jsonEncode
If it somehow gets in there (e.g. via a crafty __pairs), let it through.

Change-Id: I9f79dbb1a09cd62b2a8f4b6beb84a3e2f1c85560
2015-06-16 16:36:30 +00:00
Tim Starling e7f5aae520 Fix race condition in SIGXCPU handling
Marius found a race condition in the handling of SIGXCPU: the pipes
would close, causing the read/write to complete, before the status of
the process changed, so the status would randomly be "running" for a few
milliseconds after proc_get_status() was called.

So: terminate the process unconditionally after an I/O error. Get the
exit status from proc_close(), since that's the only way to get the
status of a terminated process while simultaneously waiting for it to
exit. Also fix signal identification as in unmerged patch I57cdf8aa.

Change-Id: I252ec046e82063a868c1094e81705cb5e847db92
2015-05-25 16:40:31 +00:00
Brad Jorsch 4669e43135 ustring: Handle empty charset like Lua does
Both '[]' and '[^]' give a rather odd error, but it's probably best to
follow suit.

Bug: T95958
Change-Id: I3310da55f655537c9082fc9039003f6b2d31eff4
2015-04-13 18:20:33 -04:00
Jackmcbarn 6ffde66c77 SECURITY: Sanitize the content of Lua backtraces
Bug: T85113
Change-Id: Iede661a34f4ec2f384bd0407e2fb8f271ff54a77
2015-04-01 10:02:19 -07:00
Kunal Mehta 3f5f3e247f Use full <?php instead of short <? in ustring generation scripts
Change-Id: Ida6bc4ee1803763b284fdaa7c63769a146fec6ad
2015-03-17 18:16:20 -07:00
Brad Jorsch 3d51662881 Rewrite error handling to avoid OutputPage::addInlineScript
This is apparently unofficially deprecated, and we can do things a bit
more straightforwardly by using ParserOutput::addJsConfigVars() to
communicate the error messages to the JS.

This also takes the opportunity to move "ext.scribunto", which is mostly
about errors, to "ext.scribunto.errors".

Bug: T75618
Change-Id: I1577dab2dab1bd79cb127879de141fdbb8963aeb
2015-03-16 16:08:44 -04:00