5 commits

Author SHA1 Message Date
Ori Livneh 508e926b5d fetchLexers: Pass '--json' to Pygmentize
Use the '--json' flag to get Pygments to output its list of supported
lexers in a machine-readable format. Support for this flag was added (at
our request) to Pygments and included in the 2.11 release[1].

Tested by running updateLexerList.php and confirming empty diff.

  [1]: https://github.com/pygments/pygments/issues/1437

Change-Id: I0f1d7fceca9034e6034bafa6a8dd312b99d379d1
2022-12-20 23:18:55 +00:00
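
To illustrate the fetchLexers change above, here is a minimal Python sketch of consuming a machine-readable lexer listing. The JSON shape in `SAMPLE` (a top-level "lexers" object mapping each lexer name to "aliases", "filenames" and "mimetypes") is an assumption about what `pygmentize -L lexer --json` prints. `lexer_aliases` is a hypothetical helper, not code from the extension.

```python
import json

# Assumed shape of the JSON listing; check it against real pygmentize output.
SAMPLE = """
{
  "lexers": {
    "Python": {"aliases": ["python", "py"], "filenames": ["*.py"], "mimetypes": []},
    "JSON": {"aliases": ["json"], "filenames": ["*.json"], "mimetypes": []}
  }
}
"""


def lexer_aliases(listing_json: str) -> list[str]:
    """Return every lexer alias in the listing, sorted and de-duplicated."""
    data = json.loads(listing_json)
    aliases = set()
    # Each entry is keyed by the lexer's full name; only the aliases are used.
    for info in data.get("lexers", {}).values():
        aliases.update(info.get("aliases", []))
    return sorted(aliases)


if __name__ == "__main__":
    print(lexer_aliases(SAMPLE))
```

Parsing JSON avoids scraping the human-readable listing, which is presumably why an empty diff after regenerating the lexer list was the relevant check.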
Ryan Schmidt d7a6038211 Fix pygments on Windows
Python on Windows requires the SystemRoot environment variable in order
to initialize its internal RNG, so make sure that is passed along to the
subprocess.

Bug: T300223
Change-Id: I170ce627a3f00c023f4b1f11613f4fe2cb17bd31
2022-11-02 19:02:59 -07:00
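
The Windows fix above amounts to forwarding one variable into an otherwise restricted subprocess environment. A minimal Python sketch of that idea follows; `build_subprocess_env` and its parameters are hypothetical names for illustration and do not mirror the extension's PHP code.

```python
import os


def build_subprocess_env(base_env, extra=None, platform=os.name):
    """Build a minimal environment for a child process.

    On Windows ("nt"), SystemRoot is copied from base_env so that the
    child Python interpreter can initialize. Nothing else is inherited.
    """
    env = dict(extra or {})
    if platform == "nt":
        # Windows variable names are case-insensitive, so match loosely.
        for key, value in base_env.items():
            if key.lower() == "systemroot":
                env.setdefault("SystemRoot", value)
                break
    return env


if __name__ == "__main__":
    # The result could be passed as subprocess.run(..., env=...).
    print(build_subprocess_env(os.environ))
```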
Bryan Davis 3bee59df01 fix: Mark Pygmentize::fetchVersion as public, but @internal
Follow up to ae07430. The method needs to be public so that
WANObjectCache can call it from a callback, but we don't expect any
external callers.

Follows-Up: I424926d071e1cfd454a0c2d45a83693f41bdea55
Change-Id: Ia96d3132782435c693d2eaa77fd551fe9590b113
2022-07-15 19:13:52 -06:00
Timo Tijhof ae074306e8 Pygmentize: Cache pygments-version in memc (in addition to APCU)
* Add rationale for each cache key's strategy being in Memc vs APCU.

* Extend pygmentize-lexers from 1 day to 1 week. It rarely changes
  and already varies by version. Few things survive the day anyway,
  but I don't think there is a reason to explicitly expire it sooner.

* Add a layer of Memc to the pygments-version APCU cache given that
  it has a short expiry and thus relatively high miss rate.

  The main rationale for this is noise in mwdebug logs, since this
  is currently the only thing we log by default in Logstash at prod
  severity (exec INFO) during every pageview (after a php-fpm restart,
  which clears APCU). By adding Memc here, less of the cache is lost to
  churn because it can be revived via Memcached, and we keep the sense
  of there being nothing in the logs "by default" at prod severity
  after a restart, i.e. we don't become desensitized to log noise.

  Unlike the other cache keys and hooks, getVersion is the only
  thing that gets called widely regardless of whether syntaxhighlight
  is in use on the given page.

Change-Id: I424926d071e1cfd454a0c2d45a83693f41bdea55
2022-07-12 05:56:16 +00:00
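
The layering described in the commit above (a short-lived per-server cache backed by a shared cache) can be sketched in Python as follows. `TtlCache` and `get_version` are hypothetical stand-ins for APCu, Memcached and the extension's version lookup. The one-hour local TTL matches the figure given elsewhere in this log; the shared TTL is an assumption.

```python
import time


class TtlCache:
    """Tiny in-memory cache with per-entry expiry (stand-in for APCu or Memcached)."""

    def __init__(self, clock=time.monotonic):
        self._store = {}
        self._clock = clock

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._store[key]
            return None
        return value

    def set(self, key, value, ttl):
        self._store[key] = (value, self._clock() + ttl)

    def clear(self):
        # Models a php-fpm restart wiping the local cache.
        self._store.clear()


def get_version(local, shared, fetch, local_ttl=3600, shared_ttl=86400):
    """Look up the version: local cache first, then shared cache, then fetch()."""
    key = "pygments-version"
    version = local.get(key)
    if version is not None:
        return version
    version = shared.get(key)
    if version is None:
        # Only a miss in both tiers pays for (and logs) the expensive call.
        version = fetch()
        shared.set(key, version, shared_ttl)
    local.set(key, version, local_ttl)
    return version
```

With this arrangement, clearing the local tier does not trigger a new `fetch()` as long as the shared tier still holds the value, which is the log-noise reduction the commit is after.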
Kunal Mehta af6654e5f9 Port to BoxedCommand
All of the interactions with `pygmentize` have been refactored into a
new class, conveniently called Pygmentize. It is responsible for getting

* pygments version (cached in APCu for 1 hour)
* generated CSS (cached in WAN by version for 1 week)
* lexer list (cached in APCu by version for 1 day)

and actually highlighting stuff! Most code paths differentiate whether
we're using a bundled version of pygments or one that has been
explicitly configured. If using the bundled one, we take shortcuts since
we already know the lexer list, have the CSS generated, etc.

ResourceLoaderPygmentsModule is added to switch between loading
generated CSS from the bundled file or Shellboxing out to get it from
pygments.

Bug: T289227
Change-Id: I2e82e5aa2a71604b87ffb4936204201d06678341
2021-09-10 11:47:28 -07:00