Include Pygments 2.0.2 as an executable zip bundle. Also include a script to
automate the process of creating such bundles and to make it reproducible and
verifiable.
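One way to make such a bundle reproducible is to pin everything that can vary between runs: entry order, timestamps, and permissions. The sketch below illustrates that idea only and is not the script included in this change. `build_bundle` and `digest` are hypothetical names, and the archive is built in memory from a dict rather than from a Pygments source tree.

```python
import hashlib
import io
import zipfile

# Fixed timestamp so the archive bytes do not depend on when it was built.
FIXED_DATE = (1980, 1, 1, 0, 0, 0)


def build_bundle(files):
    """Return the bytes of a zip archive built deterministically.

    `files` maps archive paths to bytes. Hypothetical helper for illustration.
    """
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name in sorted(files):  # stable entry order
            info = zipfile.ZipInfo(name, date_time=FIXED_DATE)
            info.compress_type = zipfile.ZIP_DEFLATED
            info.external_attr = 0o644 << 16  # fixed permissions
            zf.writestr(info, files[name])
    return buf.getvalue()


def digest(bundle):
    """SHA-256 of the bundle, for comparing against a published value."""
    return hashlib.sha256(bundle).hexdigest()
```

Two builds from the same inputs should then yield the same digest, which is what makes the bundle verifiable.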
Change-Id: I67e6f804e493f065311164c610dc541a5779654e
If Pygments ever adds a dedicated lexer for 'cadlisp', for example, we'd want the
extension to use that, rather than use the compatibility map.
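The lookup order described above can be sketched as follows. This is a Python illustration, not the extension's code: `resolve_lexer`, the lexer set, and the 'cadlisp' to 'lisp' mapping are assumptions made for the example.

```python
def resolve_lexer(lang, native_lexers, compat_map):
    """Pick a lexer name, preferring a native lexer over the compatibility map."""
    lang = lang.lower()
    if lang in native_lexers:
        return lang  # a dedicated lexer always wins
    mapped = compat_map.get(lang)
    if mapped in native_lexers:
        return mapped  # fall back to the closest known equivalent
    return None  # unknown language


# Assumed data, for illustration only.
compat_map = {"cadlisp": "lisp"}
```

With only 'lisp' available, 'cadlisp' resolves through the map. Once a 'cadlisp' lexer exists, it is picked directly and the map entry is ignored.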
Change-Id: Icc610695ac2826bb526f7c69e867576c660ba6ef
GeSHi is unmaintained, lacks support for many popular modern languages, and
suffers from deep architectural flaws, chief among them the inconsistent
tokenization of different languages, each of which requires a custom
stylesheet.
Pygments is a well-maintained alternative. It is, by my count, the most popular
syntax highlighting library around. It is BSD-licensed and is widely used in
PHP projects.
To keep this easy to review, this change does not update the l10n files, and
it does not delete the geshi/ directory. I will do those in a separate patch.
The chief change between this and the previous implementation is that errors
result in the code block not being highlighted, as opposed to not being printed
at all, having been replaced by an angry red error message. I think that is the
right user experience. If you go to StackOverflow or GitHub and try to mark up
your code block as being written in some language that their highlighter
doesn't know about, you don't get an error message -- the code simply doesn't
get highlighted.
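A minimal sketch of that fallback behaviour, in Python for illustration. `render_code` and the `highlighters` mapping are hypothetical stand-ins for the real highlighter call, not the extension's API.

```python
import html


def render_code(code, lang, highlighters):
    """Return HTML for a code block, falling back to plain output on failure.

    `highlighters` maps a language name to a callable returning HTML; it is a
    stand-in for a real highlighting library.
    """
    plain = "<pre>" + html.escape(code) + "</pre>"
    highlight = highlighters.get(lang)
    if highlight is None:
        return plain  # unknown language: show the code unhighlighted
    try:
        return highlight(code)
    except Exception:
        return plain  # highlighter failed: still show the code
```

Either way the reader sees the code, never an error message in its place.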
Because we don't recursively load dependencies for extensions, to test this,
you will need to create a composer.local.json in $IP and add:
    {
        "extra": {
            "merge-plugin": {
                "include": [
                    "extensions/SyntaxHighlight_GeSHi/composer.json"
                ]
            }
        }
    }
Then run `composer update`.
Bug: T85794
Change-Id: I07446ec9893fae3d1e394f435d3d95cf8be6bc33
'Fancy' line numbers are a fairly useless feature, not seen
in any other code highlighter. As the extension doesn't let you
choose a line number mode, default to normal.
Bug: T101602
Change-Id: Iccbd3ba6c91c58b0ea0f0c09832f1422936cd475
Geshi::error() is a method that returns an HTML string representing
the error.
Geshi::$error is an integer error code.
This code clearly means to compare against the integer code given that
GESHI_ERROR_NO_SUCH_LANG is an integer.
Verified by using eval.php to instantiate "new Geshi( '', 'js' );"
and $geshi->error:
> 2
but $geshi->error():
> "<br /><strong>GeSHi Error:</strong> GeSHi could not find the language"
Change-Id: I1ca77733d4b6b5481c5db6aba9f6b7dda6803099
Style modules currently added through addModuleStyles default
to being in the head ("top" position). This is an unhealthy default,
since only critical styles that are needed at pageload should be
in the head. In order to be able to switch the default to "bottom",
existing module positions have to be defined explicitly.
Bug: T97410
Change-Id: Ie120a781ac1950abd7963d6f722aa316b5542b51
Try #2. Our last attempt loaded $wgGeSHiSupportedLanguages late, and
would override anything if it was already set. We still load it late, but
only if it is not already set.
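The difference between the two attempts comes down to one guard. A sketch in Python, where the `supported_languages` key and the `load_defaults` callable are hypothetical stand-ins for the global and the language file:

```python
def load_supported_languages(config, load_defaults):
    """Fill in the language list late, but only if nothing set it earlier."""
    if config.get("supported_languages") is None:
        config["supported_languages"] = load_defaults()
    return config
```

The earlier attempt corresponds to assigning unconditionally, which discards a list the site had already configured.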
This reverts commit 033ca20746.
Bug: T88063
Change-Id: Iae0806e06a95b2d8932b3d9e078e6135dd6750a3
The API has wrapped its pretty-printed output since Id9cdf102. Apply an
appropriate class to preserve this now that GeSHi is handling it.
Unfortunately GeSHi itself doesn't support adding more than one arbitrary class
to the <pre> (and we're already using that), so we have to add it in a
post-processing step.
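The post-processing step could look roughly like this. It is a Python sketch rather than the extension's PHP: `add_pre_class` is a hypothetical helper, the class name in the usage note is assumed, and the <pre> is assumed to carry a class attribute already.

```python
import re


def add_pre_class(html_out, extra_class):
    """Append `extra_class` to the class attribute of the first <pre> tag."""
    def repl(match):
        # group 1: '<pre ... class="', group 2: existing classes, group 3: '"'
        return "{}{} {}{}".format(
            match.group(1), match.group(2), extra_class, match.group(3)
        )

    return re.sub(r'(<pre\b[^>]*\bclass=")([^"]*)(")', repl, html_out, count=1)
```

For example, `add_pre_class('<pre class="a">x</pre>', 'api-pretty-content')` yields a <pre> with both classes. Input without a matching <pre> is returned unchanged.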
Bug: T88742
Change-Id: I38e41db5c341fe06ff825c82d5a9cd4810b7cc24
The version was set from the ExtensionTypes hook (which runs only on
Special:Version). WikiApiary and other API consumers were unable to
detect the version.
This is an amended resubmission of the reverted d69ae1f3ac, in which
the constant was declared twice.
Change include_once to require_once for langs file (follows-up 168e1296).
Bug: T75666
Change-Id: I836e0df942a066d80255c1b68472e7ee58124357
Store the list of supported languages in SyntaxHighlight_GeSHi.langs.php, which
is auto-generated via a maintenance script, updateLanguageList.php.
Change-Id: Ie0be7c42fa6716555c3e03e3f28734d7e0302664
Core change I04b1a3842 adds a hook to allow extensions to
syntax-highlight the pretty-printed output from the API.
Change-Id: If0413a1d922ff8a47afc355e0a2cc276cf54b400
We could do this using TextContent::fillParserOutput(), but alas it is
'protected', so we have to duplicate a tiny bit of code from there.
Bug: 68757
Change-Id: I7d98fa0f97fb195d23caa3d7448a15c3bbe430ca
Follow-up to I7bbdcc0a, see it for details.
Also cleared up the comment describing this here.
Bug: 26204
Change-Id: I103a6d5c3e1f91cf74e244756c2ad318e429a78e
We want to be able to track what styles were added to be able to deliver
this information to MediaWiki's live preview functionality (in order
to solve bug 24134).
This required moving some code in SyntaxHighlight_GeSHi class around.
The old way still works and is used for MediaWiki 1.20 and lower.
Bug: 24134
Change-Id: Iafd91de8922be55688fedef4e43a8e7f54d4e1cc
HTML Tidy doesn't care for tabs, and likes to always output spaces. This
can break the syntax-highlighted output, since Tidy converts tabs to
spaces on the source-code level instead of on the rendered HTML level,
even inside <pre> tags where it really shouldn't (this is probably a bug
in Tidy).
r97300 fixed the bad indenting by converting tabs to spaces before
highlighting, which works around Tidy's bug but breaks highlighting of
languages where tabs are significant (e.g. Whitespace).
It turns out that Tidy's tab mangling occurs while it's reading the
source file, before the conversion of entities such as &#9; to tab. So
GeSHi can armor its <pre>-wrapped output against Tidy's bug by encoding
all tabs as &#9;.
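The armoring itself is a single substitution on the highlighted output. A Python sketch, with `armor_tabs` as a hypothetical name:

```python
def armor_tabs(highlighted_html):
    """Replace literal tabs with the numeric entity so Tidy leaves them alone."""
    return highlighted_html.replace("\t", "&#9;")
```

Because &#9; is decoded back to a tab only after Tidy has read the markup, the rendered output should still contain real tabs.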
Bug: 30930
Bug: 57826
Change-Id: Id541e2712bd3f94446442ccf2e1e2f214e2801ba
Currently, the SyntaxHighlight GeSHi extension includes geshi/geshi.php.
This creates a conflict when an up-to-date GeSHi library exists in PHP's include path.
This change makes the SyntaxHighlight GeSHi extension include our bundled version instead.
Change-Id: Ie8f9aa6182a38508201d639723e876c156d8b0d1
Using ContentGetParserOutput instead of ShowRawCssJs allows highlighting to be applied
for other kinds of scripts as well (e.g. Lua). It also allows more special case code
for CSS and JS to be phased out.
NOTE: this requires Ibfb2cbef to be merged in core!
Change-Id: Ie260c22680ec9a31e505c685d70e17efe8a7bf44
(...modules of unusual size.)
Enabling full syntax highlighting for very long Lua modules can produce DOMs
that have hundreds of thousands of elements and cause browsers to lock up.
I took a count of spans by class (which amounts to a count of tokens by type)
of https://en.wiktionary.org/wiki/Module:languages and came up with:
sy0: 62545 (symbols)
br0: 61952 (brackets)
st0: 39291 (strings)
kw3: 7746 (keywords)
kw1: 3
kw2: 2
co2: 2
co1: 2
nu0: 1
------ ------
Total: 171544
GeSHi allows you to disable highlighting for a particular token type (see
<http://qbnz.com/highlighter/geshi-doc.html#disabling-lexics>), which seems like a
good way of handling this issue.
Disabling symbols (set_symbols_highlighting(false)) removes both sy0 and br0
elements from the DOM (about 124k elements in the case of Module:languages),
with about 47k elements remaining on the page. This is enough to make Chromium
responsive on my laptop (2.3 GHz i5, 8 GB RAM), but it's still noticeably
sluggish. Adding 'set_string_highlighting(false);' removes another 40k elements
from the rendered output, and the resulting DOM is quite zippy at 8k elements.
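The element counts quoted above follow from the table. A quick check in Python:

```python
# Span counts by class, as listed in the table above.
counts = {
    "sy0": 62545, "br0": 61952, "st0": 39291, "kw3": 7746,
    "kw1": 3, "kw2": 2, "co2": 2, "co1": 2, "nu0": 1,
}

total = sum(counts.values())                             # 171544
without_symbols = total - counts["sy0"] - counts["br0"]  # 47047, about 47k
without_strings = without_symbols - counts["st0"]        # 7756, about 8k
```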
Proposed solution: disable symbols highlighting when >100 kB; disable strings
highlighting too when >200 kB.
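The proposed thresholds can be sketched as a small decision function. This is Python for illustration: `highlighting_options` is a hypothetical name, and treating kB as 1024 bytes is an assumption the message does not settle.

```python
KB = 1024  # assumption: the message does not say whether kB means 1000 or 1024 bytes


def highlighting_options(source_size_bytes):
    """Decide which token types to highlight for a source of the given size."""
    return {
        "symbols": source_size_bytes <= 100 * KB,  # off above 100 kB
        "strings": source_size_bytes <= 200 * KB,  # off above 200 kB
    }
```

The two flags would then be passed to set_symbols_highlighting() and set_string_highlighting() respectively.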
Change-Id: I90c645f9d03bbdc135058a3717a463dec40aa77d
The GeSHi-rendered output has a font-size of 10px where we would expect
13px, just like for <pre>.
Bug 33496 against MediaWiki core already dealt with this issue: in some
(all?) browsers 'monospace' has a size of 13px, whereas the default is
16px. With a font-size of 0.8em, monospace is scaled down to 10px,
which is too small. When a second font is appended to the declaration, the
browser treats monospace as a default font and thus scales it starting
from 16px instead of 13px.
This patch appends a style to GeSHi which sets the font-family to
"monospace, monospace", thus tricking the browser into considering
monospace a regular font.
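The sizes quoted above come from applying the 0.8em scale to two different base sizes. A quick check in Python; rounding to whole pixels is a simplification, since browsers may keep fractional sizes:

```python
FONT_SCALE = 0.8  # the 0.8em rule applied to the highlighted output

MONOSPACE_BASE_PX = 13  # base size browsers use for a lone `monospace`
DEFAULT_BASE_PX = 16    # base size used for regular fonts

before = round(FONT_SCALE * MONOSPACE_BASE_PX)  # 10.4 rounds to 10
after = round(FONT_SCALE * DEFAULT_BASE_PX)     # 12.8 rounds to 13
```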
Change-Id: I7bbdcc0a21010513473a7ca9d784df77e9920b5b
This can happen if the TitleIsCssOrJsPage hook causes a page without a
.css or .js extension to be considered a CSS/JS page.
Change-Id: I875a7f89f683336f18e70358fe589cef706fd5d1
Remove usage of mw-code-inline:
* That class ended up not being merged into core, so it does nothing.
Only mw-code (for the <pre> wrap) is needed.
7c9b2273c9cbdae90c9f4e3890a13619f769c5d0 (mediawiki/core) had both
in an earlier patch version, but only mw-code was merged.
Follows-up:
* mediawiki/extensions/SyntaxHighlight_GeSHi:
dc147a5ef1
Use .mw-code and .mw-code-inline
* mediawiki/core
7c9b2273c9cbdae90c9f4e3890a13619f769c5d0
Add .mw-code styles in core
Change-Id: I793c05c3e103209cf966d9e35ab37c05528cdbb8
Also make sure mw-content-ltr/rtl is the same as the dir attribute value. Restrict that value to either ltr or rtl (not sure if rtl is really needed; source code is always ltr). Also remove text-align:left; as it can/should be set manually in the <source> tag.
Tidy always converts tabs to spaces on input; on a big <pre> section this is ok but it tends to fail on syntax-highlighted output, where the spacing should depend on the *output* not the *input markup*.
As a workaround, when $wgUseTidy is enabled we now apply our own tab-to-space conversion preemptively on the input before feeding it into GeSHi for highlighting; this keeps the right spacing through to output.
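A sketch of the preemptive conversion, in Python for illustration. `pretab` is a hypothetical name and the tab width of 8 is an assumption:

```python
def pretab(source, use_tidy, tab_width=8):
    """Expand tabs to spaces before highlighting when Tidy is enabled."""
    if not use_tidy:
        return source  # leave tabs alone when Tidy is not in play
    # expandtabs pads to the next tab stop and resets the column at newlines
    return source.expandtabs(tab_width)
```

Expanding before highlighting means the column positions are computed from the source text, not from the markup GeSHi wraps around it.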