ApiVisualEditorEdit: the 'html' parameter should be raw to avoid normalization

Although wikitext is (expected to be) in Unicode Normalization Form C,
the output HTML may not be, due to the presence of explicit entities in
the wikitext representing non-NFC codepoints.
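
As a rough illustration of the problem (not code from this change), the sketch
below decodes an explicit entity and shows that NFC normalization alters the
result. The entity chosen, &#x2001; (EM QUAD), is an assumed example, and the
intl extension's Normalizer class stands in for whatever normalization the API
layer applies to 'text' parameters.

```php
<?php
// Illustrative sketch only. Requires PHP's intl extension for Normalizer.

// An explicit entity in wikitext survives as-is, because the wikitext
// source itself contains only ASCII characters here.
$entity = '&#x2001;'; // assumed example: U+2001 EM QUAD

// The rendered HTML contains the decoded code point.
$decoded = html_entity_decode( $entity, ENT_QUOTES | ENT_HTML5, 'UTF-8' );

// NFC replaces U+2001 with its canonical equivalent, U+2003 EM SPACE.
$normalized = Normalizer::normalize( $decoded, Normalizer::FORM_C );

echo bin2hex( $decoded ), "\n";    // e28081 (UTF-8 for U+2001)
echo bin2hex( $normalized ), "\n"; // e28083 (UTF-8 for U+2003)
echo $decoded === $normalized ? "unchanged\n" : "changed by NFC\n";
```

If the 'html' parameter were normalized this way on submission, the HTML
received by the server would no longer match what the client sent.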

Bug: T266140
Depends-On: I2e78e660ba1867744e34eda7d00ea527ec016b71
Change-Id: I0d34c9a01f1132c2616ed3392ea40d8b73e15325
C. Scott Ananian 2020-12-15 18:58:48 -05:00
parent 094a9aa044
commit aceea5b623

@@ -480,7 +480,14 @@ class ApiVisualEditorEdit extends ApiBase {
 'minor' => null,
 'watchlist' => null,
 'html' => [
-    ParamValidator::PARAM_TYPE => 'text',
+    // Use the 'raw' type to avoid Unicode NFC normalization.
+    // This makes the parameter binary safe, so that (a) if
+    // we use client-side compression it is not mangled, and/or
+    // (b) deprecated Unicode sequences explicitly encoded in
+    // wikitext as character entities are not mangled. Wikitext is
+    // in Unicode Normalization Form C, but because of explicit
+    // entities the output HTML is not guaranteed to be.
+    ParamValidator::PARAM_TYPE => 'raw',
     ParamValidator::PARAM_DEFAULT => null,
 ],
 'etag' => null,