Commit graph

83 commits

Author SHA1 Message Date
Bryan Davis ee1a6c20d7 Pygmentize: report stderr when exit code != 0 and stdout is empty
Most python error messages are reported to stderr rather than stdout. In
the event of a catastrophic failure executing the Pygments binary we are
likely to need to report stderr so that folks can debug the problem with
the executable.

Bug: T364249
Change-Id: Id5e5dbc515fdcdeb6eec61aacbbb9cbeddc79fab
2024-05-17 11:01:28 -06:00
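
The reporting rule described in ee1a6c20d7 can be sketched roughly as follows. This is a Python illustration only, not the extension's PHP code; the helper name `describe_failure` and the message wording are hypothetical.

```python
def describe_failure(exit_code, stdout, stderr):
    """Pick the text to report for a finished command (hypothetical helper)."""
    if exit_code == 0:
        # The command succeeded, so there is nothing to report.
        return ""
    # Python tracebacks usually go to stderr, so fall back to stderr
    # whenever stdout is empty.
    detail = stdout or stderr
    return f"exit code {exit_code}: {detail.strip()}"
```

For a crash that printed only a traceback, the helper returns the stderr text instead of an empty message.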
Umherirrender c0703d33ec build: Upgrade mediawiki/mediawiki-codesniffer to v43.0.0
Change-Id: If2acd67c6275e74e487a2e0ce8d34277a70782ca
2024-03-12 20:48:18 +01:00
Ed Sanders f5dd12e83b Allow linelinks prefix to be any character(s)
Old HTML IDs had to start with Latin letters, but
in HTML5 IDs can use any characters.

Bug: T359214
Change-Id: I6b6733eb07267faca1990bb7445a967405f9327e
2024-03-06 21:32:48 +00:00
thiemowmde 040f45302b Fix GeSHi support, update PHP/JSDocs, use modern PHP
The PHPDoc/JSDoc updates are mostly about generic "array" types that
can be made more specific.

In PHP we can remove documentation when it is 100% identical to the
type declarations in the code.

A few mistakes are fixed as well, e.g. a missing "null".

This patch also made a major mistake visible. It looks like the
$geshi2pygments compatibility map was broken since 2018. The array was
changed from values to keys via I7a852dd and some usages updated, but
one was forgotten.

Change-Id: I480999d21f2f69cba84166bb877aa75882778966
2024-01-22 20:10:04 +01:00
jenkins-bot 73e1073108 Merge "Add a few missing type declarations to properties and methods" 2024-01-22 14:21:55 +00:00
thiemowmde 006455ec37 Add a few missing type declarations to properties and methods
Most of the code in this codebase already uses language-level types.

Change-Id: I1df9439c69eec5ad0b2a9a608729975027a04172
2024-01-22 09:39:30 +01:00
thiemowmde 7a4d59accc Replace preg_replace() with simpler trim()
This does the same as before. Only newlines are trimmed from the
left, but all whitespace from the right.

Change-Id: I6b7c860d8a2fc2a1f28428447ee8f18ab4bbe46c
2024-01-22 09:37:01 +01:00
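
The asymmetric trimming described in 7a4d59accc (newlines only on the left, all whitespace on the right) might look like this in Python. The function name is hypothetical, and Python's `rstrip()` strips a slightly different character set than PHP's `rtrim()`, so treat it as a sketch of the idea rather than an exact equivalent.

```python
def trim_code(text):
    """Trim leading newlines only, but all trailing whitespace (hypothetical)."""
    # Leading spaces and tabs are kept because they may be indentation
    # that belongs to the first line of the snippet.
    return text.lstrip("\n").rstrip()
```

Keeping the leading spaces matters for snippets whose first line is indented.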
Umherirrender 890a260032 Use namespaced classes
This requires MediaWiki 1.42 for some of the new names.

Change-Id: I3821d2bca4aa8e7c0fce2a730c070b597c526247
2024-01-05 19:29:59 +01:00
Isabelle Hurbain-Palatin e4e8f8e076 Set hasWikitextInput flag to false
The content of the SyntaxHighlight extension is not wikitext and
annotations should be stripped from it before rendering.

Bug: T341009
Depends-On: I4e9a7a8bec3cb9532ef8a729fd2c6c4acca5d8a0
Change-Id: Ibada54d517830b1112b59513b090dc4bbdc7c917
2023-10-03 20:52:29 +00:00
gerritbot fc431bcd8d Replace some moved Title class uses, now MediaWiki\Title\Title
Bug: T321681
Change-Id: I7a26d8653e2ed41ab65e8b199c748d1a9e6a80d6
2023-08-19 12:26:03 +00:00
Umherirrender 7087e991d5 Use HookHandlers for core hooks
The use of "HookHandlers" attribute in extension.json makes it possible
to inject services into hook handler classes in a future patch.

Bug: T271029
Change-Id: I6df44cf4a160e618a6546fb9eec36070bf4b868e
2023-08-14 20:23:40 +02:00
Daimona Eaytoy 52ac696e25 Replace deprecated MWException
Bug: T328220
Change-Id: Iaf4a9bb4aafc741395d5ccc5a42c6a72b5d42b99
2023-06-08 11:14:08 +00:00
jenkins-bot 81f673bc6d Merge "Pygmentize: Treat Shellbox network loss like non-zero exit code" 2023-06-06 23:55:12 +00:00
Ed Sanders 274cc4ab77 Always use the strict equality flag when using in_array
Change-Id: Iedd51f31db2bc4e5257d211719f8bdcf1abb09dd
2023-06-06 13:35:40 +01:00
Timo Tijhof 682fe922f9 Pygmentize: Treat Shellbox network loss like non-zero exit code
Prior to the shellbox migration, if during the parsing of a page,
pygmentize failed (i.e. non-zero exit from its local shell command,
pretty much the only way a php shell exec could fail), then
SyntaxHighlight would fall back to outputting a preformatted plain
`<pre>`.

The logic still exists in the code, and is still triggered for cases
where the command reached shellbox and its result was "successfully"
communicated to MediaWiki (HTTP 200), with the boxed result reporting
the non-zero exit code on the shellbox server.

However, the more likely scenario in the new setup is that the command
times out or never reaches the server in the first place, in which
case we don't get any shell exit code. Instead, we get a Shellbox
exception since the result is unknowable.

Instead of fatalling the entire pageview with a PHP exception and
HTTP 500 from MW, use the same graceful fallback.

Bug: T292663
Change-Id: Icaa8c34ff97ad8a99d044beab529ef943071269c
2023-06-03 14:25:50 +01:00
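
The fallback described in 682fe922f9 can be sketched as below: a transport failure with no exit code is handled the same way as a non-zero exit code. This is a Python illustration under stated assumptions, not the extension's code; `TransportError`, `render`, and the `run_highlighter` callback are hypothetical names.

```python
import html


class TransportError(Exception):
    """Stand-in for a failure where no exit code is available (hypothetical)."""


def render(code, run_highlighter):
    """Return highlighted HTML, or a plain <pre> block if highlighting fails."""
    try:
        # run_highlighter is assumed to return (exit_code, html_output).
        exit_code, output = run_highlighter(code)
    except TransportError:
        # Timeout or unreachable server: no exit code exists, so treat it
        # like a failed command instead of letting the exception propagate.
        exit_code, output = None, ""
    if exit_code != 0:
        return "<pre>" + html.escape(code) + "</pre>"
    return output
```

Both failure modes end in the same escaped `<pre>` output, so the page still renders.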
Tim Starling 54b02b02e1 Migrate ResourceLoaderSyntaxHighlightVisualEditorModule to a virtual file callback
Depends-On: I97d61b5793159cea365740e0563f7b733e0f16de
Bug: T47514
Change-Id: I10fceeee808e4d08f7ed63afb13b4d87129365c7
2023-05-08 17:15:48 +10:00
Ed Sanders e39f530bfb Document the linelinks attribute and load JS when used
Change-Id: Iaf6e2ef58e85ac92e5fcf9dd3449baae927feb9c
2023-03-09 12:48:55 +00:00
jenkins-bot 14cb8415da Merge "fetchLexers: Pass '--json' to Pygmentize" 2022-12-21 01:00:30 +00:00
Ori Livneh 508e926b5d fetchLexers: Pass '--json' to Pygmentize
Use the '--json' flag to get Pygments to output its list of supported
lexers in a machine-readable format. Support for this flag was added (at
our request) to Pygments and included in the 2.11 release[1].

Tested by running updateLexerList.php and confirming empty diff.

  [1]: https://github.com/pygments/pygments/issues/1437

Change-Id: I0f1d7fceca9034e6034bafa6a8dd312b99d379d1
2022-12-20 23:18:55 +00:00
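
Consuming the machine-readable lexer list from 508e926b5d could look roughly like this. The sketch parses a hand-written sample instead of invoking Pygments; the payload shape (a top-level "lexers" object whose values carry an "aliases" list) is an assumption about the `--json` output, and the function name is hypothetical.

```python
import json

# Hand-written sample; the real output shape is assumed, not verified here.
SAMPLE = (
    '{"lexers": {'
    '"Python": {"aliases": ["python", "py"], "filenames": ["*.py"], "mimetypes": []}, '
    '"PHP": {"aliases": ["php"], "filenames": ["*.php"], "mimetypes": []}'
    '}}'
)


def lexer_aliases(payload):
    """Collect every lexer alias from a JSON lexer listing, sorted."""
    data = json.loads(payload)
    aliases = set()
    for info in data.get("lexers", {}).values():
        aliases.update(alias.lower() for alias in info.get("aliases", []))
    return sorted(aliases)
```

Parsing JSON avoids depending on the layout of the human-readable listing.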
jenkins-bot 4fead6c331 Merge "Fix lexer list parsing on Windows" 2022-12-20 22:40:50 +00:00
jenkins-bot 362ca84426 Merge "Make the code size limit for highlighting configurable" 2022-11-26 12:54:41 +00:00
Ryan Schmidt 2ae82c7fb7 Fix lexer list parsing on Windows
When using a non-bundled Pygments (which is required on Windows, as the
bundled version is an ELF binary), we call into the Pygments executable
to generate the list of supported languages (lexers). This list seems to
occasionally include carriage returns, causing some languages to not be
processed correctly. Trim those CRs out so the language list is
accurate.

Change-Id: If8b1f145dd10e2c4707d6d32927e85d1d2459f15
2022-11-20 22:46:55 -07:00
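
The carriage-return cleanup in 2ae82c7fb7 amounts to something like the following Python sketch. The function name is hypothetical and the input is assumed to be a plain newline-separated listing.

```python
def clean_lines(raw):
    """Split a listing on newlines and drop stray carriage returns."""
    # Windows line endings leave a trailing "\r" on every line after
    # splitting on "\n"; blank lines are skipped entirely.
    return [line.rstrip("\r") for line in raw.split("\n") if line.strip()]
```

Without the `rstrip`, a name such as "python\r" would not match "python" in later lookups.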
libraryupgrader 27ca45e8cb build: Updating dependencies
composer:
* mediawiki/mediawiki-codesniffer: 39.0.0 → 40.0.1

npm:
* stylelint-config-wikimedia: 0.13.0 → 0.13.1

Change-Id: Ifd15d37271ad474c948e9187b7b6bccc1f336489
2022-11-16 23:22:01 +00:00
jenkins-bot 1351b2e427 Merge "Count only real highlighting as expensive parser tag hooks" 2022-11-11 09:18:52 +00:00
alex4401 90ee9a9774 Make the code size limit for highlighting configurable
Replace the HIGHLIGHT_MAX_LINES and HIGHLIGHT_MAX_BYTES constants with
$wgSyntaxHighlightMaxLines and $wgSyntaxHighlightMaxBytes respectively, so
sysadmins can adjust the limits to their needs if performance is not a
concern for them.

Bug: T322293
Bug: T104109
Change-Id: I80768d3cb45ac01c004fc812832878c83ca4ecdb
2022-11-03 12:04:35 +01:00
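
A configurable size check in the spirit of 90ee9a9774 might be sketched as below. The parameters stand in for $wgSyntaxHighlightMaxLines and $wgSyntaxHighlightMaxBytes; the function name and the exact way lines and bytes are counted are assumptions, not the extension's logic.

```python
def within_limits(code, max_lines, max_bytes):
    """Return True if a snippet is small enough to highlight (hypothetical)."""
    # Count bytes rather than characters so multi-byte text is measured
    # by its encoded size.
    size_in_bytes = len(code.encode("utf-8"))
    line_count = code.count("\n") + 1
    return line_count <= max_lines and size_in_bytes <= max_bytes
```

Passing the limits as arguments instead of reading constants is what lets an administrator raise or lower them.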
Ryan Schmidt d7a6038211 Fix pygments on Windows
Python on Windows requires the SystemRoot environment variable in order
to initialize its internal RNG, so make sure that is passed along to the
subprocess.

Bug: T300223
Change-Id: I170ce627a3f00c023f4b1f11613f4fe2cb17bd31
2022-11-02 19:02:59 -07:00
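
Forwarding SystemRoot to a subprocess with an otherwise minimal environment, as d7a6038211 describes, can be sketched like this in Python. The function name and the choice to forward only this one variable are assumptions made for illustration.

```python
import os


def subprocess_env(base_env=None):
    """Build a minimal child environment that keeps SystemRoot if present."""
    source = os.environ if base_env is None else base_env
    env = {}
    # Python on Windows needs SystemRoot to initialize its random number
    # generator; on other platforms the variable is simply absent.
    if "SystemRoot" in source:
        env["SystemRoot"] = source["SystemRoot"]
    return env
```

The returned dictionary could then be passed as the `env` argument of `subprocess.run`.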
Reedy d94fec0141 Remove global class alias
Change-Id: I334a6ce9040385b11f30bb5d25e45bb124e0acce
2022-10-31 23:07:39 +00:00
Umherirrender fe6415836d Count only real highlighting as expensive parser tag hooks
Skip the expensive-function check when no highlighting is wanted,
for example because there is no lexer.

Also, all validation of the tag now happens before counting, so
invalid tags are not counted either.

Bug: T316858
Change-Id: Ifad9a9a14fae92463c345fb12defb41f14c2e1f3
2022-10-09 18:22:02 +02:00
Umherirrender 0cb5db91b8 Track syntaxhighlighting as expensive parser tag hook
The shell out to get styled text is expensive.
Call Parser::incrementExpensiveFunctionCount to limit the number of
highlighted text snippets on a page and avoid reaching a timeout.

This counts each tag. It does not deduplicate identical text snippets,
and it does not distinguish whether pygmentize has to be called or the
result is already in the cache.
This also does not affect Parsoid; it is not clear whether the concept
of expensive parser functions exists there.

Bug: T316858
Change-Id: I8afe61e9be4a34e5f0725a9b65ef43c345e1be5f
2022-09-07 21:41:17 +02:00
Sébastien Beyou b08c0a7cb9 Fix the case of empty <syntaxhighlight /> tags
Bug: T315740
Change-Id: I685806d4e8992a54f17d29a9187807bb30e31ef8
2022-08-21 13:46:17 +02:00
Subramanya Sastry 0eef7add67 Add Parsoid support for syntaxhighlight
* Added Parsoid config, and refactored code slightly to
  add native Parsoid handlers for parser tags exposed
  by this extension.
* Enabled parsoid mode testing on the test file.
* Added html/parsoid sections on a few tests.
* Marked rest of tests as wt2html and wt2wt only since
  html2wt and html2html will fail without a html/parsoid section
  and there is no real benefit to adding them to all tests.
* Added a couple tests to the known failures list:
  - One is because of T299103.
  - The other is because Parsoid always emits attributes in the
    form <tag .. foo="bar"..> instead of just <tag ... foo ..>
    Since Parsoid needs to accept this format that is present on
    wikis, I added a html/parsoid section for this test and
    added the failures to the known failures list.

Bug: T272939
Change-Id: Ie30aa6b082d4fc43c73296ff2ed6cb8c3873f48f
2022-08-08 20:07:46 -04:00
Bryan Davis 3bee59df01 fix: Mark Pygmentize::fetchVersion as public, but @internal
Follow up to ae07430. The method needs to be public so that
WANObjectCache can call it from a callback, but we don't expect any
external callers.

Follows-Up: I424926d071e1cfd454a0c2d45a83693f41bdea55
Change-Id: Ia96d3132782435c693d2eaa77fd551fe9590b113
2022-07-15 19:13:52 -06:00
Timo Tijhof ae074306e8 Pygmentize: Cache pygments-version in memc (in addition to APCU)
* Add rationale for each cache key's strategy being in Memc vs APCU.

* Extend pygmentize-lexers from 1 day to 1 week. It rarely changes
  and already varies by version. Few things survive the day, but
  there's not a reason to explicitly expire it sooner I think.

* Add a layer of Memc to the pygments-version APCU cache given that
  it has a short expiry and thus relatively high miss rate.

  The main rationale for this is noise in mwdebug logs since this
  is currently the only thing we log by default in Logstash with prod
  severity (exec INFO) during every pageview (after a php-fpm restart
  which clears APCU). By adding Memc here we lose less of the cache
  churn by reviving it via Memcached, and we keep the sense of there
  being nothing in the logs "by default" at prod severity after restart,
  e.g. don't get used to any fatigue.

  Unlike the other cache keys and hooks, getVersion is the only
  thing that gets called widely regardless of whether syntaxhighlight
  is in use on the given page.

Change-Id: I424926d071e1cfd454a0c2d45a83693f41bdea55
2022-07-12 05:56:16 +00:00
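
The layered lookup described in ae074306e8 (a per-server cache backed by a shared cache, with the expensive call last) can be sketched as follows. The two `TTLCache` instances stand in for APCu and Memcached; the class, the key name, and the TTL values are hypothetical and chosen only for illustration.

```python
import time


class TTLCache:
    """Tiny in-memory cache with per-entry expiry (stand-in for APCu/Memcached)."""

    def __init__(self):
        self._items = {}

    def get(self, key):
        entry = self._items.get(key)
        if entry is None or entry[1] <= time.monotonic():
            return None
        return entry[0]

    def set(self, key, value, ttl):
        self._items[key] = (value, time.monotonic() + ttl)


def get_version(local, shared, fetch):
    """Check the local cache, then the shared cache, then call fetch()."""
    key = "pygments-version"
    version = local.get(key)
    if version is None:
        version = shared.get(key)
        if version is None:
            # Only reached when both layers miss; this is the costly call.
            version = fetch()
            shared.set(key, version, ttl=3600)
        local.set(key, version, ttl=3600)
    return version
```

After a restart empties the local layer, the shared layer still answers, so the costly call (and its log entry) is skipped.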
Tim Starling 956aa8ecd7 Use new ResourceLoader namespace
Extensions using Phan need to be updated simultaneously with core due
to T308443.

Bug: T308718
Depends-On: Id08a220e1d6085e2b33f3f6c9d0e3935a4204659
Change-Id: Ie1356c582baf9a66b868f7349cc71c26f8f1ead3
2022-05-27 03:42:55 +00:00
Reedy 39b4f0c7c1 Namespace rest of the extension
Global alias of SyntaxHighlight left behind for migration

Change-Id: I35b2caa42ac91454abe359949e360d1601748121
2022-03-18 01:42:11 +00:00
C. Scott Ananian e0cece9bc6 Passing a string to ParserOutput::addModules()/addModuleStyles() is deprecated
Bug: T296123
Change-Id: If14866f76703aa62d33e197bb18a5eacde7a55c0
2022-01-11 17:01:22 -05:00
Derk-Jan Hartman 34c936a8d3 Include generated styles before MediaWiki overrides
The order of style inclusion matters, some of our overrides were no
longer in effect.

Follow-up to: I2e82e5aa2a71604b87ffb4936204201d06678341
Bug: T292736
Change-Id: If202c26d2c29994cb3680eb76a86bb7efacc3ff9
2021-10-10 22:30:13 +02:00
Kunal Mehta b8a5dd08ee Expose Pygments version on Special:Version
Change-Id: Ia4eccc4f16873b16e106c8196d7582ca5b27b365
2021-09-10 11:47:28 -07:00
Kunal Mehta af6654e5f9 Port to BoxedCommand
All of the interactions with `pygmentize` have been refactored into a
new class, conveniently called Pygmentize. It is responsible for getting

* pygments version (cached in APCu for 1 hour)
* generated CSS (cached in WAN by version for 1 week)
* lexer list (cached in APCu by version for 1 day)

and actually highlighting stuff! Most code paths differentiate whether
we're using a bundled version of pygments or one that has been
explicitly configured. If using the bundled one, we take shortcuts since
we already know the lexer list, have the CSS generated, etc.

ResourceLoaderPygmentsModule is added to switch between loading
generated CSS from the bundled file or Shellboxing out to get it from
pygments.

Bug: T289227
Change-Id: I2e82e5aa2a71604b87ffb4936204201d06678341
2021-09-10 11:47:28 -07:00
Kunal Mehta c8bd606cab Remove dead Shell::isDisabled() check
With "ability-shell" set in extension.json's requirements as of
commit b5a904e2ec this extension will refuse to load if shelling
out is disabled.

Change-Id: Ie8f446fbb33e585ffcc7d0adda1894a5497f2dad
2021-09-02 21:42:13 +00:00
libraryupgrader d6176eb862 build: Updating dependencies
composer:
* mediawiki/mediawiki-codesniffer: 36.0.0 → 37.0.0

npm:
* postcss: 7.0.35 → 7.0.36
  * https://npmjs.com/advisories/1693 (CVE-2021-23368)
* glob-parent: 5.1.0 → 5.1.2
  * https://npmjs.com/advisories/1751 (CVE-2020-28469)
* trim-newlines: 3.0.0 → 3.0.1
  * https://npmjs.com/advisories/1753 (CVE-2021-33623)

Change-Id: I63d2cc4c790d9b7d89187b72fed8ed11e89712a2
2021-07-24 02:37:26 +00:00
Alexander Vorwerk 47a18809ef Avoid using ContentHandler::getContentText()
ContentHandler::getContentText() is deprecated and should be
replaced with Content::getText() for TextContent instances.

Change-Id: I8767a925148c31b3a64761f1173a2a85bd28dfe0
2021-05-20 01:01:43 +00:00
C. Scott Ananian 0e26a6b3bf Replace use of Parser::$mStripState, deprecated in 1.35
The replacement, Parser::getStripState(), was added to MediaWiki in
1.34.  This extension already requires MediaWiki >= 1.34.

Bug: T275160
Change-Id: I7806068e1cd6e4da66adfe7bb75095d4bfb5d6bc
2021-02-19 17:08:41 -05:00
Timo Tijhof fa20f69cf4 SyntaxHighlightVisualEditorModule: Use Context::encodeJson() instead
Also fixes the Phan warning about Xml::encodeJsCall/FormatJson
needing boolean where int inDebugMode() is passed.

Change-Id: Id8de16ab683948eae096b43462118ea837f53038
2021-02-18 05:07:32 +00:00
Kunal Mehta 9bdf728fc1 Move default CSS/JS model mapping out of extension.json
These are for MediaWiki core models, so it seems reasonable to specify
them in the PHP source.

Change-Id: Iab6f9969d2bf72122b2661e139aa21a3475a92a8
2021-02-10 23:54:21 -08:00
Ed Sanders 2d3af74c39 Add mw-content-ltr/rtl classes to inline snippets
We already add the dir=ltr/rtl HTML attribute so this
should be a no-op and makes it consistent with block
snippets.

Change-Id: I53e9204cc3bd54ba167f6f91e718a9d35b5bdfd0
2021-01-15 17:33:15 +00:00
Ed Sanders 16af4f3c6b Register VE module unconditionally
Change-Id: Ifccc0223c2b57c0de6f6c14355850213b090f8fb
2021-01-03 00:25:21 +00:00
Ed Sanders 583e3b3db8 Add support for line anchors on code pages
Bug: T29531
Change-Id: Ic09086c19d37bdff8bb7e68bbb0f676ef87896fe
2021-01-03 00:19:13 +00:00
Ed Sanders e8add72d66 Move all HTML wrapping into #highlight
This means all callers to #highlight get code wrapped
in the correct HTML.

This was done outside of #highlight before as the transformation
depended on $parser, so optionally pass in a $parser object if
the contents are going to go through the parser.

Change-Id: Ic5d5c341687e965804cb33da07dda23913718ff5
2021-01-01 18:27:29 +00:00
Ed Sanders 10ec5067c5 Enable line numbers on code content pages
Bug: T32773
Change-Id: I2eb8dcfe4d7bf751f998e1b2dd26a23cce69bf34
2020-12-30 21:35:05 +00:00