The PHPDoc/JSDoc updates are mostly about generic "array" types that
can be made more specific.
In PHP we can remove documentation when it is 100% identical to the
type declarations in the code.
A few mistakes are fixed as well, e.g. a missing "null".
This patch also made a major mistake visible. It looks like the
$geshi2pygments compatibility map has been broken since 2018. The array was
changed from values to keys via I7a852dd and some usages updated, but
one was forgotten.
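As a rough illustration of this class of bug (a Python sketch with hypothetical names and sample data, not the extension's PHP code): once names move from values to keys, a call site that still searches the values silently stops matching.

```python
# Hypothetical data: lexer names now live in the keys; the values are flags.
LEXERS = {"html": True, "bat": True}


def is_supported_stale(name):
    # Forgotten call site: still searches the values, which are now flags.
    return name in LEXERS.values()


def is_supported_fixed(name):
    # Updated call site: checks the keys.
    return name in LEXERS
```

The stale check raises no error, it just never matches, which is why a mistake like this can go unnoticed for a long time.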
Change-Id: I480999d21f2f69cba84166bb877aa75882778966
Prior to the shellbox migration, if during the parsing of a page,
pygmentize failed (i.e. non-zero exit from its local shell command,
pretty much the only way a php shell exec could fail), then
SyntaxHighlight would fall back to outputting a preformatted plain
`<pre>`.
The logic still exists in the code, and is still triggered for cases
where the command reached shellbox and its result was "successfully"
communicated to MediaWiki (HTTP 200), with the boxed result reporting
the non-zero exit code on the shellbox server.
However, the more likely scenario in the new setup is that the command
times out or never reaches the server in the first place, in which
case we don't get any shell exit code. Instead, we get a Shellbox
exception since the result is unknowable.
Rather than failing the entire pageview with an uncaught PHP exception
and an HTTP 500 from MW, use the same graceful fallback.
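The intended behaviour can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the extension's PHP code: `HighlightServiceError`, `plain_pre`, `highlight_or_fallback` and the `run_highlighter` callable are hypothetical names.

```python
import html


class HighlightServiceError(Exception):
    """Hypothetical stand-in for a transport failure (timeout, unreachable server)."""


def plain_pre(code):
    # Escape the source so it renders literally inside <pre>.
    return "<pre>" + html.escape(code) + "</pre>"


def highlight_or_fallback(code, run_highlighter):
    """run_highlighter is a hypothetical callable returning (exit_code, output_html)."""
    try:
        exit_code, output = run_highlighter(code)
    except HighlightServiceError:
        # No exit code is available; treat this like a failed command.
        return plain_pre(code)
    if exit_code != 0:
        # The command ran but reported failure: same fallback.
        return plain_pre(code)
    return output
```

Both failure modes end in the same plain `<pre>` output, so the reader still sees the code, only without highlighting.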
Bug: T292663
Change-Id: Icaa8c34ff97ad8a99d044beab529ef943071269c
Use the '--json' flag to get Pygments to output its list of supported
lexers in a machine-readable format. Support for this flag was added (at
our request) to Pygments and included in the 2.11 release[1].
Tested by running updateLexerList.php and confirming empty diff.
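For illustration, a minimal Python sketch of consuming such a listing. It assumes the JSON has a top-level "lexers" object whose entries carry an "aliases" list; `lexer_aliases` and `SAMPLE` are hypothetical, and the sample is not real pygmentize output.

```python
import json


def lexer_aliases(json_text):
    """Return a sorted, de-duplicated list of lexer aliases from the JSON listing."""
    data = json.loads(json_text)
    aliases = set()
    for info in data.get("lexers", {}).values():
        aliases.update(info.get("aliases", []))
    return sorted(aliases)


# Illustrative sample in the assumed shape.
SAMPLE = (
    '{"lexers": {"Python": {"aliases": ["python", "py"],'
    ' "filenames": ["*.py"], "mimetypes": []}}}'
)
```

In practice the text would presumably come from an invocation along the lines of `pygmentize -L lexer --json`; the exact arguments used by updateLexerList.php are not shown here.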
[1]: https://github.com/pygments/pygments/issues/1437
Change-Id: I0f1d7fceca9034e6034bafa6a8dd312b99d379d1
When using a non-bundled Pygments (which is required on Windows, as the
bundled version is an ELF binary), we call into the Pygments executable
to generate the list of supported languages (lexers). This list seems to
occasionally include carriage returns, causing some languages to not be
processed correctly. Trim those CRs out so the language list is
accurate.
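The fix amounts to something like the following Python sketch (the actual change is in PHP; `parse_lexer_lines` is a hypothetical name):

```python
def parse_lexer_lines(raw_output):
    """Split command output into names, dropping CRs and blank entries."""
    names = []
    for line in raw_output.split("\n"):
        # Remove stray carriage returns left over from CRLF line endings.
        name = line.replace("\r", "").strip()
        if name:
            names.append(name)
    return names
```

Without the trim, a name such as "php\r" would not compare equal to "php", so the language would look unsupported.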
Change-Id: If8b1f145dd10e2c4707d6d32927e85d1d2459f15
Python on Windows requires the SystemRoot environment variable in order
to initialize its internal RNG, so make sure that is passed along to the
subprocess.
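A minimal Python sketch of the idea, assuming the child otherwise gets a restricted environment; `build_child_env` is a hypothetical helper, not part of the extension.

```python
def build_child_env(parent_env, extra=None):
    """Return the environment for the subprocess, forwarding SystemRoot if set."""
    env = dict(extra or {})
    if "SystemRoot" in parent_env:
        # Forward the variable unchanged so the child Python can start up.
        env["SystemRoot"] = parent_env["SystemRoot"]
    return env
```

The result could then be passed as the `env` argument of `subprocess.run`, with `os.environ` as `parent_env`.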
Bug: T300223
Change-Id: I170ce627a3f00c023f4b1f11613f4fe2cb17bd31
Follow up to ae07430. The method needs to be public so that
WANObjectCache can call it from a callback, but we don't expect any
external callers.
Follows-Up: I424926d071e1cfd454a0c2d45a83693f41bdea55
Change-Id: Ia96d3132782435c693d2eaa77fd551fe9590b113
* Document the rationale for keeping each cache key in Memc vs APCu.
* Extend pygmentize-lexers from 1 day to 1 week. It rarely changes
and already varies by version. Few entries survive a day in practice,
but there seems to be no reason to explicitly expire it sooner.
* Add a layer of Memc to the pygments-version APCu cache given that
it has a short expiry and thus a relatively high miss rate.
The main rationale is noise in the mwdebug logs: this is currently the
only thing logged by default in Logstash at prod severity (exec INFO)
on every pageview after a php-fpm restart, since the restart clears
APCu. With Memc as a second layer, the value is revived from Memcached
instead of being recomputed, so the logs stay empty "by default" at
prod severity after a restart and nobody gets used to ignoring the
noise.
Unlike the other cache keys and hooks, getVersion is the only
thing that gets called widely regardless of whether syntaxhighlight
is in use on the given page.
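The layering can be sketched in Python as below. This is an illustration only: `TtlCache` stands in for both tiers, and the key name and TTL values are assumptions rather than the extension's actual settings.

```python
import time


class TtlCache:
    """Tiny in-memory TTL cache standing in for one cache tier."""

    def __init__(self, clock=time.monotonic):
        self._data = {}
        self._clock = clock

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires = entry
        if self._clock() >= expires:
            del self._data[key]
            return None
        return value

    def set(self, key, value, ttl):
        self._data[key] = (value, self._clock() + ttl)

    def clear(self):
        self._data.clear()


def get_version(local, shared, compute, local_ttl=3600, shared_ttl=3600):
    """Check the per-server tier, then the shared tier, then compute."""
    key = "pygments-version"  # assumed key name
    value = local.get(key)
    if value is not None:
        return value
    value = shared.get(key)
    if value is None:
        value = compute()  # the expensive, logged shell call
        shared.set(key, value, shared_ttl)
    local.set(key, value, local_ttl)
    return value
```

Clearing the local tier, as a php-fpm restart does for APCu, then costs only a shared-cache read instead of another shell call and log line.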
Change-Id: I424926d071e1cfd454a0c2d45a83693f41bdea55
All of the interactions with `pygmentize` have been refactored into a
new class, conveniently called Pygmentize. It is responsible for getting
* pygments version (cached in APCu for 1 hour)
* generated CSS (cached in WAN by version for 1 week)
* lexer list (cached in APCu by version for 1 day)
and actually highlighting stuff! Most code paths differentiate whether
we're using a bundled version of pygments or one that has been
explicitly configured. If using the bundled one, we take shortcuts since
we already know the lexer list, have the CSS generated, etc.
ResourceLoaderPygmentsModule is added to switch between loading
generated CSS from the bundled file or Shellboxing out to get it from
pygments.
Bug: T289227
Change-Id: I2e82e5aa2a71604b87ffb4936204201d06678341