This moves the actual parsing down to be done much later in the
process. This won't make any difference in production but makes it
easier to refactor the code further.
Note I tried to use a StatusValue object but couldn't because it
merges seemingly identical messages, while the plain array is fine
with containing duplicates. There is one parser test that covers
this. While we could change this it needs discussion and most
probably a PM decision.
Change-Id: I7390b688a33dace95753470a927bbe4de43ea03a
The "parser marker" placeholders are case-sensitive, e.g. for a tag
that's written like <rEf> the placeholder will also say …-rEf-…. This
was really just a mistake.
The error is as old as this code is. Added in commit 75004e33 in
2009.
Note we shouldn't use /i at the end because the marker itself should
not be case-insensitive. Only the tag name.
Instead of adding more (slow) test cases I update two that are
exactly about this part of Cite (nested tags) anyway.
Bug: T64335
Change-Id: I44c7a42a0da682a1082952fd1af817bf7d45378c
Two problems:
1. Manipulating globals directly affects all following tests. They
are not independent from each other. This problem can be seen in
CiteTest.
2. Some test cases in testValidateRef don't test what you think.
For example, the test for a conflicting "extends" + "follow" was not
failing because of the conflict but because "extends" was disabled
and disallowed.
Change-Id: Iaa4e1f3f3222155d59984e577cba3f0b8dec40c3
This error message really always meant nothing but "there is an
unknown parameter in your <ref> tag". It's unnecessarily confusing
only for historical reasons. See T299280#9384546 for a long
explanation.
Bug: T299280
Change-Id: Ic224d5828f7b7ac0928c44f526c61654ccf3425e
Note how this currently behaves. The user input is
<ref name="… …">
But what we get in the end is
<li id="… …">
This implies that the is decoded and re-encoded with a
slightly different entity encoding. (Note that and  
and   are all the same character.)
Also note how there is only an underscore in the href="…", but the
non-breaking space is gone. This is identical to what happens in
links and headlines. Try for example [[a _a]]. Multiple
underscores, non-breaking spaces, and normal spaces will be
normalized. We just do the same in the id="…" attributes.
Note this fixes only one of the issues listed in T298278.
Bug: T298278
Change-Id: Ia01f2fdd3b3e9ee6aaa9da60ca3386dcd5d6b1a0
This patch makes only sense together with I5a64ac4 where it is split
from. See I5a64ac4 for details.
The idea is that this patch just re-arranges the code without making
any changes to how the code behaves. This leaves a minimal change
behind that's much easier to revert, if needed.
Bug: T298278
Change-Id: Ie78313b7f3ac1ec7bce5ac7512e60a3bb011480a
This patch does two things:
1. The "normalization" function was never only doing normalization,
but also all the necessary HTML encoding. This is now more visible
and split into two separate functions.
2. To make this easier we change the order slightly. Because of this
the normalization step must now consider spaces. Before spaces have
been converted to underscores by escapeIdForLink.
The results are all the exact same as before.
This is split from I5a64ac4 to make that easier to review.
Bug: T298278
Change-Id: I9435a2ddaa21559e29587c58b7523103141467f7
User-options related classes are being moved to
the MediaWiki\User\Options namespace in MediaWiki Core;
reflect that change here.
Bug: T352284
Depends-On: I42653491c19dde5de99e0661770e2c81df5d7e84
Change-Id: I22ff2effcf9b7f2162f5d57608d8ec3651b48dd7
This parser test is a bit obscure, in my opinion. We added it in
I8c4de96 to make sure we don't get thousand separators in most
places.
We continued reworking the code since then. By now it's effectively
impossible to "accidentally" get thousand separators. The
problematic methods from the Language class are not even accessible
any more from this code.
To make the tests more robust we now use createNoOpMock (done via
the previous patch) where it matters, specifically for all Language
and Parser mocks. This proves the problematic Language methods are
never called.
Bug: T253743
Bug: T238187
Change-Id: I9bfe1f4decfaf699996da63e19473c2c0d581d9d
Both Language and Parser are extremely complex classes with hundreds
of public methods. We really want to make sure we are not depending
on anything unexpected from these classes. If calls are made into
these classes we want to know exactly what is called.
Doing this also showed that some mocked methods are not even needed.
Change-Id: Icdfff6c07be78a47bf7cadb1813a72581a51272a
This reverts a very tiny part of Ib3fdc89 from 2 weeks ago. The
reasons are explained in Ib3fdc89. Short version:
* The ->parse() calls have drastic performance implications.
* Allowing wikitext and HTML in this message also makes T321217
worse.
The new message "cite_reference_backlink_symbol" is kept and still
used in the UI. Just not in these two messages any more. This is a
minor redundancy we want to get rid of at some point. But it's not
critical for the moment. This will be done as part of T321217.
Nothing will break on the wikis. Some wikis have customizations for
"cite_references_link_one" and "cite_references_link_many" in place.
This will continue to work as before Ib3fdc89.
Bug: T339973
Change-Id: I933771e3ad67cd530bcf5ee8469cef35ea1070d2
This is a mistake that exists in this codebase for who knows how
long.
Cite mis-uses the messaging system a lot for internal things we still
want to customize somehow, but are not labels that will ever be shown
on the screen. The prefix/suffix messages in this patch are meant to be
part of the HTML in id="…" attributes. Prefix/suffix must be a static
plain text strings. Using e.g. {{GENDER}} or {{PLURAL}} in these
messages is not even possible because there is no $1 parameter to use.
Note how all other similar messages already use ->plain().
A few wikis override these messages, but stick to the plain-text
convention, as they should:
https://global-search.toolforge.org/?q=.®ex=1&namespaces=8&title=Cite.*reference.*fix
This will continue to work.
This has minor performance implications. Fetching these messages is
faster if we can skip transformations.
Bug: T321217
Change-Id: I7969c255fe4ce897e904897081da5f52678721aa
The WikiEditor extension has a button and some help text that
is only applicable if the Cite extension is enabled. Move
that (with some modifications) to the Cite extension instead.
Bug: T339973
Depends-On: I8256660f9c6886d6764b45735284e00308fc56e5
Change-Id: Ib3fdc897dd3330f69c5832003d4c3cb1e6dba2f3
This is mostly because recent IDEs can understand createMock() quite
good. We usually don't add such hints every time we use createMock().
We would have a million of them. ;-)
Change-Id: If9e37807a6945c4408d374fc97664cd636020ffd
html/php sections are added since otherwise it complains that the
"Test lacks html or metadata section on lines"
Change-Id: Ib1c47be09bdbe1e84b595373ad71772f2a983fc9
WebdriverIO has dropped support of sync mode, hence changed to async.
Update npm packages: @wdio/*, wdio-mediawiki
because async mode needs at least @wdio v7.9.
Remove npm packages: @wdio/dot-reporter and @wdio/sync.
Bug: T300196
Change-Id: I8a2ba7f87496b19cc22c347088d52e56741cac71
* Add a file-level comment in the cite tests file.
* Document the CSS rule that hides the Parsoid HTML.
Change-Id: I27dc6d5f6ab09b67e28ce88a2e13bf2d1a13e9c0
* The failing tests added to known failures are the tests
known to fail as documented in T307741.
Bug: T307741
Change-Id: I5e5163a4bd093768d1364516ed79fb2d225ee656
IDEs like my PHPStorm trim spaces from the end of the line. It looks
like they are not relevant for the test and can as well be removed.
Change-Id: I54cb4fdf74dd7174450dcc552b077d388dbac749
The best practice for message keys is to use dashes, not underscores.
This codebase is quite old and traditionally uses underscores. I
think we can make it flexible enough to work with both.
Required for Ie64f4ab.
Change-Id: I6f0584299a4f279ed929784927392eb0f72cbc80
Since parser test requirements are per-file, move the smoke test which
requires `{{#ifeq}}` (from [[mw:Extension:ParserFunctions]]) into its
own file and define the requirement properly in the file header.
That avoids spurious parser test failures if developers don't have
the ParserFunctions extension installed locally.
Change-Id: Ia5ffbe0896d5033fe2da526e42bf111edbc56adf
Extensions using Phan need to be updated simultaneously with core due
to T308443.
Bug: T308718
Depends-On: Id08a220e1d6085e2b33f3f6c9d0e3935a4204659
Change-Id: Iebc5768a3125ce2b173e9b55fc3ea20616553824
If a ResourceLoaderFileModule is constructed with no arguments, it
accesses global variables, so this is not allowed from a unit test.
(This is probably a bug in ResourceLoaderFileModule, but one thing at a
time.)
This blocks If005958c76bbfabba74def4215c48fe94f297797.
Change-Id: I84056024b0d3a9dcddb1ab4dc8596118bb3fe8ea
Note how only the HTML5 mode behavior changes, but nothing in
legacy mode.
Also note this does not 100% fix the issue. The esample with a
non-breaking space is still broken. But it's already much better
than before.
Bug: T298278
Change-Id: Idf50dad4219ff4c594a0cc15f63cb10fdac5ffb7
This is only to document the current status quo and make later patches
smaller and easier to review.
Bug: T298278
Change-Id: I6c78f4d3ee32de596f2b5ee081d56eaffb1cc7bd
* Remove the overhead of serializing and then re-parsing client-side,
instead assign it directly as native object literal.
* Move code for array slicing to PHP.
Change-Id: Iedcc8d57d3bddd3fa32a78b4e7ecc25615d94277
Rather than depending on a separate module with one line of generated
JS code, generate it as a prepended statement to the same module.
Should be a no-op.
Depends-On: I809951d34feb2dbd01b7ae0f4bd98dac7c3f6fe2
Change-Id: I5886bf9f82025048976b7750e8cb751681021fb4
There is the patch(I4297aea3489bb66c98c664da2332584c27793bfa) which will
add DeprecationHelper trait to Parser class in order to deprecate public
Parser::mUser. DeprecationHelper trait has appropriate magic methods
which help to use dynamic properties. In order not to mock them via
createMock(), so getMockBuilder() and onlyMethods() was used.
onlyMethods() method helps to specify methods which need to be mocked.
Now we can use dynamic properties in Parser related tests of Cite
extension.
Bug: T285713
Change-Id: Ie75c9cd66d296ce7cf15432e2093817e18004443
This now aligns with Parsoid commit 88d4620278988d121761fb440952d1d66a70ce99
Required some newline fixes to resync after "Refactor newline logic"
(change I6691c70f8e3fa3f21e2d11035bed9cdc2dc87093 /
commit 6389459b1e) was merged
this morning.
Change-Id: I64fba6cc9330a55d4e1eeb5371164b3eb4efa508
The structure of this class changed a lot in If2fe5f5
(T239572). This was never reflected in the @covers tags. I
suggest to go with a trivial @covers for the entire class
and let PHPUnit figure it out. The alternative is to kind
of repeat many private implementation details, and this
doesn't feel right.
Change-Id: Ie414876489133ab9aca934c19a5e403cd339abf1
This is done to make the discussion in If3dcfd7 easier.
When we introduced this code we actually used it to format
entire numbers. We had to change this later to *not* localize
digits, but only separators. Language::formatNum is and always
was able to do this, so we just continued to use it.
This is discussed now.
It turns out there is only a single place left where we use
formatNum, and it does nothing but localizing the decimal
point. There is another way to do the same.
Bug: T237467
Change-Id: I89b17a9e11b3afc6c653ba7ccc6ff84c37863b66
We broke this feature in December 2019 because it was never covered
by any tests. Full explanation in T245376.
All the features we care about are covered by tests. If all existing
tests succeed, that should be proof enough that this patch does not
introduce any new regression.
Bug: T245376
Change-Id: I1a447884bdc507ac762d212466496b4591c18090
This patch also adds a test case that was missing before. If a
follow="…" is followed by another, normal <ref>, the internal key
(a.k.a. $this->refSequence) is not incremented. This was the case
before, just not covered by any test.
Change-Id: I102d1e67a6918017acc7e4a4663b08c828d101a6
CI already ensures that VisualEditor is loaded alongside Cite, so
the defensive check in the code isn't needed; ext.cite.visualEditor is
defined statically, it's just injected into the page dynamically in the
VisualEditor code handling VisualEditorPluginModules.
Bug: T232875
Change-Id: Ie5e096feca92f9c3ef13c732f3f1ae491e2b7d03
This change does have two effects:
1. Instead of prepending a newline individually in every possible
code path, we do it one time at the end. But only if there is
something in the output. This does not change anything, as proven by
the unchanged parser tests.
2. I removed the newline between the <h2> and the generated
<references> element. Note that both these elements are created in
the same method, next to each other. So there is no way this can
influence other wikitext. Unfortunately this code path is executed
only when using the *preview* function, and impossible to be covered
by parser tests because of this. However, it's covered by unit tests.
This refactoring is motivated by, but not required for T148701.
Bug: T148701
Change-Id: I6691c70f8e3fa3f21e2d11035bed9cdc2dc87093
Previously the reflist was added at the end of the last line of text,
which messes up paragraph wrapping (as seen in many test cases), and
generated invalid HTML when the last line was a list item (T148701).
(second try, previously reverted in 8c933d03c5)
Note this affects only pages where the <references /> tag is missing,
and the references section is auto-generated at the very end of the page.
Bug: T148701
Change-Id: Ib2101346434a4e317b5fc7379215b60c7020cb2b
The most common cleanup required by switching to tidy output was adding
missing <p>-wrappers to the last item before <references/>.
Bug: T246285
Change-Id: I7c8a08c4e6eff7caf4539a26fae475a4133f9a0c
While working on the patch I4303642 I was worried about the line
array_pop( $this->refCallStack )
in the rollback code. Since the patch changed the position of follow
elements in the stack, an array_pop() would pop different elements.
It turns out this is impossible. Rollbacks are only done for <ref>
elements inside a <references> tag, immediatelly after reaching the
closing </references>. It's impossible to use follow="…" inside
<references>. It will not be added to the stack, and therefore not
rolled back.
Even if the edge case would be possible, the *old* code that placed
follow elements on the *other* side of the stack would have been
wrong then.
The test cases in this patch try to hit this edge case, and are
expected to not be able to do so.
Change-Id: I4380bf443db17c6214dbfa2cbda62b46db04258a
Previously the reflist was added at the end of the last line of text,
which messes up paragraph wrapping (as seen in many test cases), and
generated invalid HTML when the last line was a list item (T148701).
Bug: T148701
Change-Id: Ifc873fc913e717026d80d54b570c594d1073fb42
This removes a few tiny pieces of code, and a large chunk related to
incomplete follow="…" attributes (see T240858). It turns out we don't
need to insert elements at the top of the ReferenceStack::$refs
array, because this array is reordered anyway in
ReferencesFormatter::formatRefsList()!
Incomplete follow refs don't have a number, and are ordered to the top
because of this, as before. This doesn't change with this patch.
Change-Id: I43036420be22feb8f0f287d9ccee2afd317df2a9
The isModuleRegistered() method was introduced a few years ago,
when the load order in ResourceLoader was undergoing a change.
It used to be that hooks like were run first to register modules, and then
wgResourceModules was registered afterwards. This was reversed to disallow
mutating the config at run-time from foreign modules and to allow better
caching and error detection.
It's been several years since then, so this redundant check is no longer
needed. ServiceWiring.php in MW core for ResourceLoader always processes
config and extension.json first before this hook is called.
Bug: T247265
Change-Id: I466f1fa70b8f0e9fe5e8e8df90bb0001b3329b87
Html5 fragment mode now bans whitespace per html5 spec
See Ie2b7c9429691e2c491c3359d5b400d8f078aa789
Change-Id: Ie6fa40798f06a358f6082110b4d8cc0028c80321
This reverts parts of the revert I3bee35f, which reverted a3d312c8.
I believe it's helpful to keep these test cases just to document how
the code currently behaves. I removed all TODO because we don't know
if and how we want to touch this again.
Bug: T240858
Change-Id: Ib91acfcb7292e5c03ce9cc4d7be782085e10aa27
This patch is mostly moving code around without changing the behavior.
Exceptions:
* The ErrorReporter creates a <span> container. This was previously
parsed. The only benefit might be error checking and escaping. Rather
pointless. The code just created this HTML. With this patch, it is not
parsed any more. The unit test reflects this change. The output in
production will not change, as the parser tests show.
* Parsing of the message key (to detect it's type and id) is simplified
a lot, using explode. With this the code can, in theory, support more
types.
Bug: T239572
Change-Id: If2fe5f55db46dfc7e0ce445348608bef00bec64e
Perform the validation in validateRef, and display a new error message for
broken "follow" refs. This changes existing behavior, where broken folow
ref content is arbitrarily displayed at the top of the references list and
no error is rendered.
Thanks to weasely wording, the new error can later be reused for "extends"
errors.
Bug: T240858
Change-Id: I506e4dcd1151671f5302ecd99581145d979d8124
This exception was introduced very late in the patch I38c9929. It
already caused trouble. This here is essentially a revert. It restores
the previous behavior where this edge-case was silently ignored. The
worst thing that can happen is that appendText() creates an incomplete
entry in the $this->refs array, which will be rendered at the end. The
user can see it then.
As of now we are not aware of a code path where this would even be
possible. Still this does make the code *more* robust by not making it
explode, but give the user something they can work with.
Bug: T243221
Change-Id: I2e2d29bbd557090981903fcc2ece8796fafa4aa4
These create bogus output, depending on the surrounding wikitext the
<ref> tag is used in. For example, this example wikitext:
* Example.<ref name="1">a</ref> More text.
… will be rendered with the "More text" sentence wrapped on the next
line, outside of the list. However, this does *not* happen in many of
the localizations, e.g. German, because many Tanslatewiki translators
did not copied the bogus \n. Why should they.
TL;DR: These newline characters either do nothing, or destroy the output.
In both cases the proper fix is to replace them with spaces.
Some of the test cases touched in this patch demonstrate the issue.
Change-Id: I395a40637a5293eda1f477963d252ce1a215f8b2
This resolves another TODO. Since this is an intentional limitation in
the design of the feature, I find it pretty signigicant to give it it's
own error message.
Note that the text does not need to be perfect, just good enough for now.
We will review all error messages later via T238188.
Bug: T242141
Change-Id: Id9c863061e855350320131e81f6702c8810736f4
… if possible. In most cases it's possible to use the real object, and
reach into it's private parts via TestingAccessWrapper. This is almost
the same as using a mock, but I feel it's much more "light-weight".
The main change is that there is no strict assertion any more for the
number of ReferenceStack::pushInvalidRef() calls. Before this was mixed
into the same array as the valid references, as elements set to "false".
I think the test is as valueable as before without this extra check. If
the rollback stack works or not is already covered by other tests.
Change-Id: I90213557b164b3e43233a3dc393ee3f3d3d556a9
"Conflicting" here includes the case where one of two <ref> with the
same name does not have an extends attribute. The first occurence of
a name specifies if a <ref> is a top-level or a sub-reference. This can
not be changed later.
This patch changes multiple existing test cases. I checked all of them
in detail and confirmed the behavior is fine. The error reporting is
better or at least equally good in all cases.
Bug: T242141
Change-Id: Iaec306eefe5b168d496990105e297ca044a5e721
Allow a ref with `name=""` for backwards-compatibility.
Partially reverts I07738cce2641026dfaa92ba263ed6f9834be0944
Bug: T242437
Change-Id: Iaed2d1c41be377a4961aff39838b0965f6c00616
The difference between the two is that isOK() only reports "fatals",
while isGood() also reports "warnings" and "errors". I believe we
*want* to report all of these the same way.
Change-Id: I3be832c5db7aba3c03bd2ad8cfbba42362c093fd