This moves the actual parsing down to be done much later in the
process. This won't make any difference in production but makes it
easier to refactor the code further.
Note I tried to use a StatusValue object but couldn't because it
merges seemingly identical messages, while the plain array is fine
with containing duplicates. There is one parser test that covers
this. While we could change this it needs discussion and most
probably a PM decision.
Change-Id: I7390b688a33dace95753470a927bbe4de43ea03a
The "parser marker" placeholders are case-sensitive, e.g. for a tag
that's written like <rEf> the placeholder will also say …-rEf-…. This
was really just a mistake.
The error is as old as this code is. Added in commit 75004e33 in
2009.
Note we shouldn't use /i at the end because the marker itself should
not be case-insensitive. Only the tag name.
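
To illustrate the scoped flag, here is a minimal Python sketch (the real
code is PHP). The UNIQ-…-QINU marker format below is a simplified stand-in
and an assumption, as are the names MARKER and contains_marker. Only the
tag name sits inside a case-insensitive group; the rest of the pattern
stays case-sensitive.

```python
import re

# Hypothetical, simplified marker format; the real parser marker differs.
# (?i:...) makes only the tag name case-insensitive, not the whole pattern.
MARKER = re.compile(r"UNIQ-(?i:ref|references)-[0-9a-f]+-QINU")


def contains_marker(text: str) -> bool:
    """Return True if the text contains a marker for a ref/references tag."""
    return MARKER.search(text) is not None
```

With this pattern a marker written for <rEf> is found, while a lower-cased
marker prefix is still rejected.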
Instead of adding more (slow) test cases I update two that are
exactly about this part of Cite (nested tags) anyway.
Bug: T64335
Change-Id: I44c7a42a0da682a1082952fd1af817bf7d45378c
Two problems:
1. Manipulating globals directly affects all following tests. They
are not independent from each other. This problem can be seen in
CiteTest.
2. Some test cases in testValidateRef don't test what you think.
For example, the test for a conflicting "extends" + "follow" was not
failing because of the conflict but because "extends" was disabled
and disallowed.
Change-Id: Iaa4e1f3f3222155d59984e577cba3f0b8dec40c3
This error message really always meant nothing but "there is an
unknown parameter in your <ref> tag". It's unnecessarily confusing
only for historical reasons. See T299280#9384546 for a long
explanation.
Bug: T299280
Change-Id: Ic224d5828f7b7ac0928c44f526c61654ccf3425e
Note how this currently behaves. The user input is
<ref name="… …">
But what we get in the end is
<li id="… …">
This implies that the non-breaking space is decoded and re-encoded with a
slightly different entity encoding. (Note that &nbsp;, &#160;, and
&#xA0; are all the same character.)
Also note how there is only an underscore in the href="…", but the
non-breaking space is gone. This is identical to what happens in
links and headlines. Try for example [[a _a]]. Multiple
underscores, non-breaking spaces, and normal spaces will be
normalized. We just do the same in the id="…" attributes.
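
A rough Python sketch of this kind of normalization, for illustration
only. It assumes that any run of spaces, non-breaking spaces, and
underscores collapses into a single underscore; the real PHP code may
differ in details. The name normalize_key is hypothetical.

```python
import re

NBSP = "\u00a0"  # non-breaking space


def normalize_key(key: str) -> str:
    """Collapse runs of spaces, non-breaking spaces and underscores.

    Assumption: every such run becomes exactly one underscore.
    """
    return re.sub(f"[ _{NBSP}]+", "_", key)
```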
Note this fixes only one of the issues listed in T298278.
Bug: T298278
Change-Id: Ia01f2fdd3b3e9ee6aaa9da60ca3386dcd5d6b1a0
This patch only makes sense together with I5a64ac4, from which it was
split. See I5a64ac4 for details.
The idea is that this patch just re-arranges the code without making
any changes to how the code behaves. This leaves a minimal change
behind that's much easier to revert, if needed.
Bug: T298278
Change-Id: Ie78313b7f3ac1ec7bce5ac7512e60a3bb011480a
This patch does two things:
1. The "normalization" function was never only doing normalization,
but also all the necessary HTML encoding. This is now more visible
and split into two separate functions.
2. To make this easier we change the order slightly. Because of this
the normalization step must now consider spaces. Previously, spaces
were converted to underscores by escapeIdForLink.
The results are all the exact same as before.
This is split from I5a64ac4 to make that easier to review.
Bug: T298278
Change-Id: I9435a2ddaa21559e29587c58b7523103141467f7
User-options related classes are being moved to
the MediaWiki\User\Options namespace in MediaWiki Core;
reflect that change here.
Bug: T352284
Depends-On: I42653491c19dde5de99e0661770e2c81df5d7e84
Change-Id: I22ff2effcf9b7f2162f5d57608d8ec3651b48dd7
This parser test is a bit obscure, in my opinion. We added it in
I8c4de96 to make sure we don't get thousand separators in most
places.
We continued reworking the code since then. By now it's effectively
impossible to "accidentally" get thousand separators. The
problematic methods from the Language class are not even accessible
any more from this code.
To make the tests more robust we now use createNoOpMock (done via
the previous patch) where it matters, specifically for all Language
and Parser mocks. This proves the problematic Language methods are
never called.
Bug: T253743
Bug: T238187
Change-Id: I9bfe1f4decfaf699996da63e19473c2c0d581d9d
Both Language and Parser are extremely complex classes with hundreds
of public methods. We really want to make sure we are not depending
on anything unexpected from these classes. If calls are made into
these classes we want to know exactly what is called.
Doing this also showed that some mocked methods are not even needed.
Change-Id: Icdfff6c07be78a47bf7cadb1813a72581a51272a
This reverts a very tiny part of Ib3fdc89 from 2 weeks ago. The
reasons are explained in Ib3fdc89. Short version:
* The ->parse() calls have drastic performance implications.
* Allowing wikitext and HTML in this message also makes T321217
worse.
The new message "cite_reference_backlink_symbol" is kept and still
used in the UI. Just not in these two messages any more. This is a
minor redundancy we want to get rid of at some point. But it's not
critical for the moment. This will be done as part of T321217.
Nothing will break on the wikis. Some wikis have customizations for
"cite_references_link_one" and "cite_references_link_many" in place.
This will continue to work as before Ib3fdc89.
Bug: T339973
Change-Id: I933771e3ad67cd530bcf5ee8469cef35ea1070d2
This is a mistake that has existed in this codebase for who knows how
long.
Cite mis-uses the messaging system a lot for internal things we still
want to customize somehow, but are not labels that will ever be shown
on the screen. The prefix/suffix messages in this patch are meant to be
part of the HTML in id="…" attributes. Prefix/suffix must be static
plain-text strings. Using e.g. {{GENDER}} or {{PLURAL}} in these
messages is not even possible because there is no $1 parameter to use.
Note how all other similar messages already use ->plain().
A few wikis override these messages, but stick to the plain-text
convention, as they should:
https://global-search.toolforge.org/?q=.&regex=1&namespaces=8&title=Cite.*reference.*fix
This will continue to work.
This has minor performance implications. Fetching these messages is
faster if we can skip transformations.
Bug: T321217
Change-Id: I7969c255fe4ce897e904897081da5f52678721aa
The WikiEditor extension has a button and some help text that
is only applicable if the Cite extension is enabled. Move
that (with some modifications) to the Cite extension instead.
Bug: T339973
Depends-On: I8256660f9c6886d6764b45735284e00308fc56e5
Change-Id: Ib3fdc897dd3330f69c5832003d4c3cb1e6dba2f3
This is mostly because recent IDEs can understand createMock() quite
well. We usually don't add such hints every time we use createMock().
We would have a million of them. ;-)
Change-Id: If9e37807a6945c4408d374fc97664cd636020ffd
html/php sections are added since otherwise it complains that the
"Test lacks html or metadata section on lines"
Change-Id: Ib1c47be09bdbe1e84b595373ad71772f2a983fc9
WebdriverIO has dropped support for sync mode, so the tests now use async.
Update npm packages: @wdio/*, wdio-mediawiki
because async mode needs at least @wdio v7.9.
Remove npm packages: @wdio/dot-reporter and @wdio/sync.
Bug: T300196
Change-Id: I8a2ba7f87496b19cc22c347088d52e56741cac71
* Add a file-level comment in the cite tests file.
* Document the CSS rule that hides the Parsoid HTML.
Change-Id: I27dc6d5f6ab09b67e28ce88a2e13bf2d1a13e9c0
* The failing tests added to known failures are the tests
known to fail as documented in T307741.
Bug: T307741
Change-Id: I5e5163a4bd093768d1364516ed79fb2d225ee656
IDEs like my PHPStorm trim spaces from the end of the line. It looks
like they are not relevant for the test and can as well be removed.
Change-Id: I54cb4fdf74dd7174450dcc552b077d388dbac749
The best practice for message keys is to use dashes, not underscores.
This codebase is quite old and traditionally uses underscores. I
think we can make it flexible enough to work with both.
Required for Ie64f4ab.
Change-Id: I6f0584299a4f279ed929784927392eb0f72cbc80
Since parser test requirements are per-file, move the smoke test which
requires `{{#ifeq}}` (from [[mw:Extension:ParserFunctions]]) into its
own file and define the requirement properly in the file header.
That avoids spurious parser test failures if developers don't have
the ParserFunctions extension installed locally.
Change-Id: Ia5ffbe0896d5033fe2da526e42bf111edbc56adf
Extensions using Phan need to be updated simultaneously with core due
to T308443.
Bug: T308718
Depends-On: Id08a220e1d6085e2b33f3f6c9d0e3935a4204659
Change-Id: Iebc5768a3125ce2b173e9b55fc3ea20616553824
If a ResourceLoaderFileModule is constructed with no arguments, it
accesses global variables, so this is not allowed from a unit test.
(This is probably a bug in ResourceLoaderFileModule, but one thing at a
time.)
This blocks If005958c76bbfabba74def4215c48fe94f297797.
Change-Id: I84056024b0d3a9dcddb1ab4dc8596118bb3fe8ea
Note how only the HTML5 mode behavior changes, but nothing in
legacy mode.
Also note this does not 100% fix the issue. The example with a
non-breaking space is still broken. But it's already much better
than before.
Bug: T298278
Change-Id: Idf50dad4219ff4c594a0cc15f63cb10fdac5ffb7
This is only to document the current status quo and make later patches
smaller and easier to review.
Bug: T298278
Change-Id: I6c78f4d3ee32de596f2b5ee081d56eaffb1cc7bd
* Remove the overhead of serializing and then re-parsing client-side,
instead assign it directly as native object literal.
* Move code for array slicing to PHP.
Change-Id: Iedcc8d57d3bddd3fa32a78b4e7ecc25615d94277
Rather than depending on a separate module with one line of generated
JS code, generate it as a prepended statement to the same module.
Should be a no-op.
Depends-On: I809951d34feb2dbd01b7ae0f4bd98dac7c3f6fe2
Change-Id: I5886bf9f82025048976b7750e8cb751681021fb4
The patch I4297aea3489bb66c98c664da2332584c27793bfa will add the
DeprecationHelper trait to the Parser class in order to deprecate the
public Parser::mUser. The trait provides magic methods that make dynamic
properties work. To avoid mocking these magic methods, as createMock()
would, getMockBuilder() and onlyMethods() are used instead.
onlyMethods() specifies exactly which methods are mocked.
Now we can use dynamic properties in Parser-related tests of the Cite
extension.
Bug: T285713
Change-Id: Ie75c9cd66d296ce7cf15432e2093817e18004443
This now aligns with Parsoid commit 88d4620278988d121761fb440952d1d66a70ce99
Required some newline fixes to resync after "Refactor newline logic"
(change I6691c70f8e3fa3f21e2d11035bed9cdc2dc87093 /
commit 6389459b1e) was merged
this morning.
Change-Id: I64fba6cc9330a55d4e1eeb5371164b3eb4efa508
The structure of this class changed a lot in If2fe5f5
(T239572). This was never reflected in the @covers tags. I
suggest going with a trivial @covers for the entire class
and let PHPUnit figure it out. The alternative is to kind
of repeat many private implementation details, and this
doesn't feel right.
Change-Id: Ie414876489133ab9aca934c19a5e403cd339abf1
This is done to make the discussion in If3dcfd7 easier.
When we introduced this code we actually used it to format
entire numbers. We had to change this later to *not* localize
digits, but only separators. Language::formatNum is and always
was able to do this, so we just continued to use it.
This is discussed now.
It turns out there is only a single place left where we use
formatNum, and it does nothing but localizing the decimal
point. There is another way to do the same.
Bug: T237467
Change-Id: I89b17a9e11b3afc6c653ba7ccc6ff84c37863b66
We broke this feature in December 2019 because it was never covered
by any tests. Full explanation in T245376.
All the features we care about are covered by tests. If all existing
tests succeed, that should be proof enough that this patch does not
introduce any new regression.
Bug: T245376
Change-Id: I1a447884bdc507ac762d212466496b4591c18090
This patch also adds a test case that was missing before. If a
follow="…" is followed by another, normal <ref>, the internal key
(a.k.a. $this->refSequence) is not incremented. This was the case
before, just not covered by any test.
Change-Id: I102d1e67a6918017acc7e4a4663b08c828d101a6
CI already ensures that VisualEditor is loaded alongside Cite, so
the defensive check in the code isn't needed; ext.cite.visualEditor is
defined statically, it's just injected into the page dynamically in the
VisualEditor code handling VisualEditorPluginModules.
Bug: T232875
Change-Id: Ie5e096feca92f9c3ef13c732f3f1ae491e2b7d03
This change does have two effects:
1. Instead of prepending a newline individually in every possible
code path, we do it one time at the end. But only if there is
something in the output. This does not change anything, as proven by
the unchanged parser tests.
2. I removed the newline between the <h2> and the generated
<references> element. Note that both these elements are created in
the same method, next to each other. So there is no way this can
influence other wikitext. Unfortunately this code path is executed
only when using the *preview* function, and impossible to be covered
by parser tests because of this. However, it's covered by unit tests.
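
Point 1 can be pictured with this small Python sketch. It is only an
illustration of the pattern (prepend once, and only for non-empty
output); the function name finalize_output is hypothetical and the real
code is PHP.

```python
def finalize_output(parts):
    """Join the rendered parts and prepend one newline if anything was rendered."""
    body = "".join(parts)
    # An empty result stays empty; no stray newline is emitted.
    return "\n" + body if body else ""
```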
This refactoring is motivated by, but not required for T148701.
Bug: T148701
Change-Id: I6691c70f8e3fa3f21e2d11035bed9cdc2dc87093
Previously the reflist was added at the end of the last line of text,
which messes up paragraph wrapping (as seen in many test cases), and
generated invalid HTML when the last line was a list item (T148701).
(second try, previously reverted in 8c933d03c5)
Note this affects only pages where the <references /> tag is missing,
and the references section is auto-generated at the very end of the page.
Bug: T148701
Change-Id: Ib2101346434a4e317b5fc7379215b60c7020cb2b
The most common cleanup required by switching to tidy output was adding
missing <p>-wrappers to the last item before <references/>.
Bug: T246285
Change-Id: I7c8a08c4e6eff7caf4539a26fae475a4133f9a0c
While working on the patch I4303642 I was worried about the line
array_pop( $this->refCallStack )
in the rollback code. Since the patch changed the position of follow
elements in the stack, an array_pop() would pop different elements.
It turns out this is impossible. Rollbacks are only done for <ref>
elements inside a <references> tag, immediately after reaching the
closing </references>. It's impossible to use follow="…" inside
<references>. It will not be added to the stack, and therefore not
rolled back.
Even if the edge case were possible, the *old* code that placed
follow elements on the *other* side of the stack would have been
wrong then.
The test cases in this patch try to hit this edge case, and are
expected to not be able to do so.
Change-Id: I4380bf443db17c6214dbfa2cbda62b46db04258a
Previously the reflist was added at the end of the last line of text,
which messes up paragraph wrapping (as seen in many test cases), and
generated invalid HTML when the last line was a list item (T148701).
Bug: T148701
Change-Id: Ifc873fc913e717026d80d54b570c594d1073fb42
This removes a few tiny pieces of code, and a large chunk related to
incomplete follow="…" attributes (see T240858). It turns out we don't
need to insert elements at the top of the ReferenceStack::$refs
array, because this array is reordered anyway in
ReferencesFormatter::formatRefsList()!
Incomplete follow refs don't have a number, and are ordered to the top
because of this, as before. This doesn't change with this patch.
Change-Id: I43036420be22feb8f0f287d9ccee2afd317df2a9
The isModuleRegistered() method was introduced a few years ago,
when the load order in ResourceLoader was undergoing a change.
It used to be that hooks like were run first to register modules, and then
wgResourceModules was registered afterwards. This was reversed to disallow
mutating the config at run-time from foreign modules and to allow better
caching and error detection.
It's been several years since then, so this redundant check is no longer
needed. ServiceWiring.php in MW core for ResourceLoader always processes
config and extension.json first before this hook is called.
Bug: T247265
Change-Id: I466f1fa70b8f0e9fe5e8e8df90bb0001b3329b87
Html5 fragment mode now bans whitespace per html5 spec
See Ie2b7c9429691e2c491c3359d5b400d8f078aa789
Change-Id: Ie6fa40798f06a358f6082110b4d8cc0028c80321
This reverts parts of the revert I3bee35f, which reverted a3d312c8.
I believe it's helpful to keep these test cases just to document how
the code currently behaves. I removed all TODO because we don't know
if and how we want to touch this again.
Bug: T240858
Change-Id: Ib91acfcb7292e5c03ce9cc4d7be782085e10aa27
This patch is mostly moving code around without changing the behavior.
Exceptions:
* The ErrorReporter creates a <span> container. This was previously
parsed. The only benefit might be error checking and escaping. Rather
pointless. The code just created this HTML. With this patch, it is not
parsed any more. The unit test reflects this change. The output in
production will not change, as the parser tests show.
* Parsing of the message key (to detect its type and id) is simplified
a lot, using explode. With this the code can, in theory, support more
types.
Bug: T239572
Change-Id: If2fe5f55db46dfc7e0ce445348608bef00bec64e
Perform the validation in validateRef, and display a new error message for
broken "follow" refs. This changes existing behavior, where broken follow
ref content is arbitrarily displayed at the top of the references list and
no error is rendered.
Thanks to weasely wording, the new error can later be reused for "extends"
errors.
Bug: T240858
Change-Id: I506e4dcd1151671f5302ecd99581145d979d8124
This exception was introduced very late in the patch I38c9929. It
already caused trouble. This here is essentially a revert. It restores
the previous behavior where this edge-case was silently ignored. The
worst thing that can happen is that appendText() creates an incomplete
entry in the $this->refs array, which will be rendered at the end. The
user can see it then.
As of now we are not aware of a code path where this would even be
possible. Still this does make the code *more* robust by not making it
explode, but give the user something they can work with.
Bug: T243221
Change-Id: I2e2d29bbd557090981903fcc2ece8796fafa4aa4
These create bogus output, depending on the surrounding wikitext the
<ref> tag is used in. For example, this example wikitext:
* Example.<ref name="1">a</ref> More text.
… will be rendered with the "More text" sentence wrapped on the next
line, outside of the list. However, this does *not* happen in many of
the localizations, e.g. German, because many Translatewiki translators
did not copy the bogus \n. Why should they?
TL;DR: These newline characters either do nothing, or destroy the output.
In both cases the proper fix is to replace them with spaces.
Some of the test cases touched in this patch demonstrate the issue.
Change-Id: I395a40637a5293eda1f477963d252ce1a215f8b2
This resolves another TODO. Since this is an intentional limitation in
the design of the feature, I find it significant enough to give it its
own error message.
Note that the text does not need to be perfect, just good enough for now.
We will review all error messages later via T238188.
Bug: T242141
Change-Id: Id9c863061e855350320131e81f6702c8810736f4
… if possible. In most cases it's possible to use the real object, and
reach into its private parts via TestingAccessWrapper. This is almost
the same as using a mock, but I feel it's much more "light-weight".
The main change is that there is no strict assertion any more for the
number of ReferenceStack::pushInvalidRef() calls. Previously, this was mixed
into the same array as the valid references, as elements set to "false".
I think the test is as valuable as before without this extra check.
Whether the rollback stack works is already covered by other tests.
Change-Id: I90213557b164b3e43233a3dc393ee3f3d3d556a9
"Conflicting" here includes the case where one of two <ref> with the
same name does not have an extends attribute. The first occurrence of
a name specifies if a <ref> is a top-level or a sub-reference. This
cannot be changed later.
This patch changes multiple existing test cases. I checked all of them
in detail and confirmed the behavior is fine. The error reporting is
better or at least equally good in all cases.
Bug: T242141
Change-Id: Iaec306eefe5b168d496990105e297ca044a5e721
Allow a ref with `name=""` for backwards-compatibility.
Partially reverts I07738cce2641026dfaa92ba263ed6f9834be0944
Bug: T242437
Change-Id: Iaed2d1c41be377a4961aff39838b0965f6c00616
The difference between the two is that isOK() only reports "fatals",
while isGood() also reports "warnings" and "errors". I believe we
*want* to report all of these the same way.
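
A toy Python model of the distinction, for illustration only. The class
and method names (Status, is_ok, is_good) are made up, and this is not
the real status class; it only mirrors the behavior described above.

```python
class Status:
    """Toy status object that collects warnings, errors and fatals."""

    def __init__(self):
        self.ok = True
        self.messages = []

    def warning(self, msg):
        self.messages.append(("warning", msg))

    def error(self, msg):
        self.messages.append(("error", msg))

    def fatal(self, msg):
        # Only a fatal flips the "ok" flag.
        self.messages.append(("error", msg))
        self.ok = False

    def is_ok(self):
        return self.ok

    def is_good(self):
        # "Good" means ok and nothing at all was reported.
        return self.ok and not self.messages
```

A status with only a warning is still "ok" but no longer "good", which is
why the two checks report different sets of problems.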
Change-Id: I3be832c5db7aba3c03bd2ad8cfbba42362c093fd
A fun edge case where `name=""` fools both validation branches after
a references rollback, and triggered a LogicException. Stop these
freak refs.
Bug: T242437
Change-Id: I07738cce2641026dfaa92ba263ed6f9834be0944
It's possible to nest <references> by using tricky constructs like the
{{#tag function, and this breaks our rollback logic. Try to show normal
output, otherwise show an error.
Includes regression tests.
Bug: T242437
Change-Id: I33e497cdf8508ce7ccb7f0f315c00af5eee47d0e
Each of these TODOs is something that needs to be fixed or implemented,
so it's helpful to map them to tasks.
Change-Id: I807208392d8a609d7f3b371dc3560a48f3578092
* Always have an empty line between @param and @return to improve
readability as well as consistency within this codebase (previously,
both styles were used).
* Flip parameter order in validateRefInReferences() for consistency with
the rest of the code.
* In Cite::guardedRef() the Parser was now the 1st parameter. I changed
all related functions the same way to make the code less surprising.
* Same in CiteUnitTest. This is really just the @dataProvider. But I feel
it's still helpful to have the arguments in the same order everywhere, if
possible.
* Add a few strict type hints.
* It seems the preferred style for PHP7 return types is `… ) : string {`
with a space before the `:`. There is currently no PHPCS sniff for this.
However, I think this codebase should be consistent, one way or the other.
Change-Id: I91d232be727afd26ff20526ab4ef63aa5ba6bacf
The rollback feature was not able to properly restore a __placeholder__.
That's why a specific use case was behaving differently. This already
worked just fine:
<ref extends="a">…</ref>
<references>
<ref name="a">…</ref>
</references>
But this didn't, even if it is the exact same from the user's
perspective:
<ref extends="a">…</ref>
{{#tag:references|
<ref name="a">…</ref>
}}
Bug: T239810
Change-Id: I163a1bffb9450a9e7f776e32e66fb08d0452cdb9
Note this leaves *another* bug behind. When a <ref> is properly reused
by name="…", and the content is fine (either missing or identical),
possibly conflicting extends="…" attributes are currently entirely
ignored. However, this is already much better than what happened before.
Bug: T242110
Change-Id: Id808ce31c8036cc290f68bb3e8c5a7b12f4f44cf
This is an extremely relevant use case, but we never had a test for
this:
Some text.<ref extends="book">Page 2</ref>
<references>
<ref name="book">Title of the book</ref>
</references>
What this means: There is no reference in the text that points to the
book as a whole, only references that point to individual pages. The
base <ref> is not used in the text.
This is already properly rendered. There is no "jump back to the text"
link. However, this fails when <references> is wrapped in {{#tag:…}}.
Bug: T239810
Change-Id: Id22db0238266a4fd6131d1a10eb6bf6227552c19
I tried to run these tests with a very old version of this code base
(from 2018) to confirm this is the correct behavior.
Bug: T241303
Change-Id: Id97d016b199458aa178ca732282e9c0e91e291a4
One of the most significant changes is when I noticed that the $group
can never be null. We set it to DEFAULT_GROUP before. That's an empty
string.
I'm not very happy with the two @phan-suppress-next-line. Is there a
better way to fix these lines?
Change-Id: I33c1681e2f3857cb6701da71f4ed8893caff4d1e
I hope this is more readable. This patch does two things: It uses
array keys to name all elements in the data provider. (Note these
array keys don't actually do anything, PHPUnit ignores them.) And this
patch merges two parameters into a single $expectedResult.
Change-Id: Ib7adc32bf8bfd523735591d35d0bcabd3b853cfc
Since I3db5175 the ParserCloned hook handler does not rely on cloning
the Cite object any more. There is no cloning any more. This is dead
code and we could remove it. Just to be sure I propose to keep the
method, but let it throw an exception.
Bug: T240248
Change-Id: I2057ea652ca25f4c7031c28a6e713671738f5e22
These should be impossible conditions, we don't want to continue with
processing.
I hate this patch, it's a temporary workaround until someone rewrites
or replaces the rollback logic, for example with a two-pass parse.
Change-Id: I6a1327e397d4272fa412c3f290c2107d867d2854
I hope this patch is not too horrifying and can be reviewed. It's
possible to split this into a sequence of smaller patches. Please
tell me.
Change-Id: I4797fcd5612fcffb0df6c29ff575dd05f278bd4d
The main benefit is this nifty call: `$this->rollbackRef( ...$call )`
To make this possible, the minimal change I needed to do was to move
the two $argv and $text arguments to the end.
I also tried to order all other arguments as well as I could: Required
first, optional later. Group and name together. Name and extends
together.
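
The same spread-call pattern, sketched in Python with argument unpacking.
Everything here is hypothetical (names, argument list, and the assumption
that the recorded call holds only the leading arguments); it only shows
why optional arguments at the end make such a call convenient.

```python
calls = []  # stack of recorded calls, each a tuple of leading arguments


def record_call(kind, group, name, extends=None):
    """Remember a call so that it can be rolled back later."""
    calls.append((kind, group, name, extends))


def rollback_ref(kind, group, name, extends=None, text=None, argv=None):
    """Receive a recorded call; the trailing arguments fall back to defaults."""
    return (kind, group, name, extends, text, argv)


def rollback_last():
    call = calls.pop()
    # Unpack the stored tuple directly into the parameter list.
    return rollback_ref(*call)
```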
All this is private implementation and should not affect anything.
Change-Id: I7af7636c465769aa53122eb40d964eabdd1289ba
I feel this is a little better than before. It looks like we never need
to *replace* a text that existed before.
This depends on I4a156aa which fixes one of the last remaining trimming
issues. Outside of <references>, a <ref> </ref> with no other content
but some whitespace was already forbidden. But not inside of <references>.
This is relevant for appendText(). It should not be called with null, but
was because of the inconsistent behavior.
Change-Id: I38c9929f2fa6e69482e45919e2f8dbf823cb1c8b
Note that this patch changes behavior, an invalid "dir" will result in
a cite reference at the point where the <ref> is declared rather than
in the references section. This is consistent with other errors.
Bug: T15673
Change-Id: Id10db40aa0b391f2f1d9274aa09d22a7278d65e3
The name of the base class in tests is guaranteed to only occur a
single time in a file. There is not much value in making it relative,
and requiring it to appear in the use section. Especially because it
is in the root namespace.
This reflects what I once encoded in the sniff
https://github.com/wmde/WikibaseCodeSniffer/blob/master/Wikibase/Sniffs/Namespaces/FullQualifiedClassNameSniff.php
I wish we could pick this rule and use it in our codebases. But it
seems it is too specific and can't be applied to all codebases, hence
it can't become part of the upstream MediaWiki rule set. At least not
at the moment.
Change-Id: I77c2490c565b7a468c5c944301fc684d20206ec4
This makes one of the last remaining edge-cases about non-empty, but
non-visible content (a <ref> that only contains whitespace) behave
identical to all other places. We already reported it as being empty
everywhere else, except inside of <references>.
Note that the test cases look like they are reporting the same errors
twice. But this is not the case:
The first set of errors is about <ref name="…"> inside of <references>
not having visible content. This should always be reported, even if the
<ref> got content from somewhere else on the page.
The second set of errors is when a <ref name="…"> *never* got any
content.
This patch will slightly increase the numbers of errors reported.
Change-Id: I4a156aa9e466f735d92fe0ba5cc0678ec8bbdd50
* Use the Html class to safely create HTML code.
* $this->referenceStack can not be null any more.
* $this->inReferencesGroup is not needed during output, only when
parsing tags.
* Replace ReferencesStack::getGroupRefs() as well as deleteGroup()
with a combined popGroup() that does both things.
* Extract the code responsible for the "responsive" behavior to a
separate function.
* Some TestingAccessWrapper are not needed.
Change-Id: Ie1cf2533d7417ae2f6647664ff1145e37b814a39
Finishes breaking the circular reference between Cite and Parser.
This patch also demonstrates how evil it is to allow the error reporter
to be called from anywhere, and have side-effects. At least it's explicit
now.
Also fixes a bug where the inner error message would not be in the
interface language.
Bug: T240431
Change-Id: Ic3325cafb503e78295d72231ac6da5c121402def
This begins our journey of breaking the circular reference between
Cite and Parser. In later patches the child objects will also take
Parser as a parameter.
Bug: T240431
Change-Id: Ic672bb4bae19ac5f1e1f5817de171d76b3bd8786
Only create a Cite object if we need one. Never clearState, just
destroy and recreate later.
This makes it less likely that we leak state between parsers, and
saves memory and processing on pages without references.
It's also preparation to decouple Cite logic from state.
Change-Id: I3db517591f4131c23151c76c223af7419cc00ae9
* All classes are in a Cite\ namespace now. No need to repeat the word
"Cite" all over the place.
* The "key formatter" is more an ID or anchor formatter. The strings it
returns are all used in id="…" attributes, as well as in href="#…" links
to jump to these IDs.
* This patch also removes quite a bunch of callbacks from tests that
don't need to be callbacks.
* I'm also replacing all json_encode().
* To make the test code more readable, I shorten a bunch of variable
names to e.g. $msg. The fact they are mocks is still relevant, and still
visible because these variable names are only used in very short scopes.
Change-Id: I2bd7c731efd815bcdc5d33bccb0c8e280d55bd06
We are *so* close to 90%.
This patch should raise the coverage for the CiteDataModule to 100%.
I'm also adding a pure unit test for the clone() behavior. Note the
latter is already covered by the CiteDbTest.
Question: Do we want the CiteDbTest to @cover anything?
Change-Id: I40763d01e18991f509bc30b6655aa57b23412fd9
Fixes a bug introduced in Icf61c9a27fd, which would cause a parser
cache split any time the Cite extension was initialized. The
`setLanguage` interface is regrettable, but I'm hoping it will only
be around temporarily.
Converts an integration test into a unit test and completes coverage.
Bug: T239988
Change-Id: I4b1f8909700845c9fa0cbc1a3de50ee7d42f69a5
Tickle a very particular edge case in which a recursive parse corrupts
the $parser->extCite object.
Bug: T240248
Change-Id: I70d100e88fa72825194ed9c477b030bbf0b6b486
Because that is what it does. Note our method is different from the one
in the Language class. We only accept strings.
Change-Id: I39107e837cc29f2d7c8867c1e602aa643f9e1a57
This class renders a <references> tag and everything inside. The
previous name sounds like it is responsible for rendering the contents
of a <ref>…</ref> tag. I mean, the class contains a method that does
exactly this. But this method is private.
Change-Id: I1cd06c9a11e0a74104f2874a34efa3e0843a0f70
This adds a test for numbers like "1.2.0" that appear when an extended
reference (e.g. "1.2.") is reused multiple times.
The first separator is from the extended reference. We decided to never
localize it. However, the second separator is from reusing a reference.
This was always localized. We believe this is a bug, but haven't fixed
it yet.
The test is documenting the status quo "1.2,0" with a comma. This kind
of makes sense, one could argue, because the "1.2" appears like this up
in the text, but the ",0" is a different indicator for a reuse, which
*never* occurs in the text.
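
How such a label comes together can be sketched like this in Python. The
function name and parameters are assumptions made for illustration; the
sketch only reproduces the status quo described above, with a fixed dot
for the extension and a localized separator for the reuse.

```python
def format_backlink_label(parent, sub, reuse, separator=","):
    """Build a label such as '1.2,0'.

    The dot between parent and sub is never localized; the separator
    before the reuse counter is (a comma in this hypothetical locale).
    """
    return f"{parent}.{sub}{separator}{reuse}"
```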
Change-Id: Ie3d26bcadd8929b906bfbcac4806af2150d61f2a
This partly reverts Ied2e3f5. I hadn't properly tested this before.
Rendering a bad extends (one that extends a <ref> that's already
extended) without indentation messes up the order and rips other
extended <ref>s out of context.
For now it might be better to stick to the previous, "magic" behavior:
Such an extends behaves like it is extending the *parent*, and is
ordered and indented as such. This is still not correct, but I feel
this is much better than rendering such a bad extends on the top level.
This patch also makes the code fail much earlier for a nested extends,
if this decision can be made already. In this case the error message is
rendered in the middle of the text (as other errors also are), not in
the <references> section.
Change-Id: I33c6a763cd6c11df09d10dfab73f955ed15e9d36
This partly reverts Id7a4036e64920acdeccb4dfcf6bef31d0e5657ab.
The message "cite_section_preview_references" says "Preview of references".
This line is not meant to be part of the content, but an interface message.
It should use the user's (interface) language, not the content language.
Change-Id: I1b1b5106266606eb0dfaa31f4abd3cee9ba92e8c
These edge cases are handled correctly already, I just forgot to
remove the TODOs when updating test content.
Note that there's only one TODO left, and it's to forbid a feature which
actually works!
Change-Id: I0d3a1f55f0ce943b0d034dda40e3779fbf241fe4
We never access Language directly, so proxy its method instead of
returning the full object.
I believe I've found a bug, but not fixing here: the footnote body
numeric backlinks like "2.1" behave as if they were decimals rather
than two numbers stuck together with a dot. So they are localized
to "2,1".
Bug: T239725
Change-Id: If386bf96d48cb95c0a287a02bedfe984941efe30
This is a mess of a function, and the tests show it. There are lots
of side-effects and context-sensitivity, which can be addressed in
later work. The interface with ReferenceStack is too wide.
Change-Id: I00cab2a555b2a9efd32d937979cd722d43ac1005
I was able to track this code down to I093d85d from 2012, which was done
right after the ParserAfterParse hook was introduced. I believe the
redundant code path was left to keep the Cite extension compatible with
old MediaWiki versions that did not have this hook yet.
I also noticed this code path is most probably entirely redundant with
the current version of MediaWiki. The *only* thing this code does is
block the ParserBeforeTidy hook from doing the same thing a second
time if the ParserAfterParse hook was called before. But it does *not*
block any other combination, e.g. if the two hooks are called the other
way around, or the same hook twice.
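
A minimal Python sketch of the guard described above, assuming a single boolean flag. The class and method names are hypothetical and only mirror the two hooks; this is not the actual PHP code.

```python
class HookGuardSketch:
    """Models a guard that only blocks one specific hook order."""

    def __init__(self):
        self.after_parse_ran = False
        self.runs = 0  # how often the shared work was executed

    def on_after_parse(self):
        # Always does the work and remembers that it ran.
        self.after_parse_ran = True
        self.runs += 1

    def on_before_tidy(self):
        # Skipped only if ParserAfterParse was called before.
        if self.after_parse_ran:
            return
        self.runs += 1
```

Calling `on_after_parse()` and then `on_before_tidy()` does the work once. The reverse order, or calling `on_after_parse()` twice, still does it twice.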
In core, it looks like it is impossible for the ParserBeforeTidy hook
to be fired without the ParserAfterParse hook being fired before. If this
is true, this is in fact dead code.
Change-Id: Iacf8b600c7abdeaf89c22c2fc31e646f57245e47
Encapsulate the language interfaces. This will be used to replace
global wfMessage calls in future patches.
Change-Id: I7857f3e5154626e0b29977610b81103d91615f65
The new extends="…" feature is using numbers like "1.2". These should be
localized in languages like Hebrew that use other symbols for the digits.
But the "." should not change.
The existing feature when a <ref> is reused multiple times does have the
same "issue". But it seems this is intentional, because it is covered by
a test. Note this is not visible in German, because German uses custom
labels "a", "b", and so on.
This patch also improves the so-called "smoke" tests and makes one cover
numbers up to "1,10" for a <ref> that is reused that often.
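
The intended behavior for extends="…" numbers (localize the digits, keep the ".") could be sketched like this. It is an illustrative Python sketch; `localize_digits` is a hypothetical helper, and the Arabic-Indic digit set is only an assumed example of a non-ASCII digit set.

```python
def localize_digits(label: str, digits: str) -> str:
    """Replace ASCII digits with localized ones, leave all else alone."""
    table = str.maketrans("0123456789", digits)
    return label.translate(table)


# Assumed example digit set: Arabic-Indic digits U+0660 to U+0669.
ARABIC_INDIC = "".join(chr(0x0660 + i) for i in range(10))
```

Because only digits are translated, the "." in a label like "1.2" passes through unchanged, unlike with decimal number formatting.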
Bug: T239725
Change-Id: Iffcb56e1c7be09cefed9dabb1d6391eb6ad995ce
If `extends` is encountered before the parent ref, we reserve the
sequence number and leave a placeholder to record the link between
ref name and number. This is necessary to render a list like,
"[1] [2.1] [2]", or to use subreferencing when the parent ref is
declared in the references tag.
When a placeholder is encountered during references section rendering,
it means that the parent was never declared.
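
The placeholder idea can be sketched as follows. This is a simplified Python illustration under the assumptions stated in the comments, not the actual ReferenceStack code; all names are hypothetical.

```python
class RefNumbering:
    """Reserves sequence numbers for parents that are not declared yet."""

    def __init__(self):
        self.next_number = 1
        self.refs = {}  # ref name -> number, child count, placeholder flag

    def _reserve(self, name, placeholder):
        ref = {"number": self.next_number, "children": 0,
               "placeholder": placeholder}
        self.next_number += 1
        self.refs[name] = ref
        return ref

    def declare(self, name):
        """A regular <ref name="…">; fills in a placeholder if one exists."""
        ref = self.refs.get(name)
        if ref is None:
            ref = self._reserve(name, placeholder=False)
        ref["placeholder"] = False
        return str(ref["number"])

    def extend(self, parent_name):
        """A <ref extends="…">; reserves the parent's number if needed."""
        parent = self.refs.get(parent_name)
        if parent is None:
            parent = self._reserve(parent_name, placeholder=True)
        parent["children"] += 1
        return f'{parent["number"]}.{parent["children"]}'

    def undeclared_parents(self):
        """Placeholders still present when rendering: never declared."""
        return [n for n, r in self.refs.items() if r["placeholder"]]
```

Declaring "a", extending "book", then declaring "book" yields "1", "2.1", "2" in this sketch, matching the "[1] [2.1] [2]" list above.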
Change-Id: I611cd1d73f775908926a803fae90d039ce122ab6
Pass the full ref structure from ReferenceStack to FootnoteMarkFormatter,
to give it control over the final rendering. This is aligned with how
the FootnoteBodyFormatter directly scans over groupRefs.
Change-Id: I3294fd9366f01daa4250a5d481f4adbae84c72b1
This was carrying the entire footnote marker, but subreferences need
to extract just the first (group ref sequence) part. Storing number
and extendsIndex in two separate fields gives us more flexibility
during rendering, for example these might use two different symbol sets.
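
A rough Python sketch of why two separate fields help. The `Ref` class, `render_mark`, and the symbol callbacks are hypothetical and only loosely modeled on the number and extendsIndex fields named above.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Ref:
    number: int                          # group ref sequence part
    extends_index: Optional[int] = None  # set only for subreferences


def render_mark(ref: Ref,
                number_symbols: Callable[[int], str] = str,
                extends_symbols: Callable[[int], str] = str) -> str:
    """Render a marker, allowing a different symbol set per part."""
    label = number_symbols(ref.number)
    if ref.extends_index is not None:
        label += "." + extends_symbols(ref.extends_index)
    return label
```

For example, passing a letter-based `extends_symbols` would render `Ref(2, 1)` as "2.a" without having to split a pre-built "2.1" string.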
Change-Id: I75bd6644c336036f9e84ba91e1c35e05bc1ca7f3
This was a bug which would affect book references, if the same group
and parent ref name combination occur twice in an article.
Change-Id: I608f58aac0cec31c8650835fc80195a87bc851d3
Validation blocks (name==null && text==null), so it should not be a
test case. Give the text a non-null value.
Also adds a check for missing test data.
Change-Id: I0f02206e2221805f5a2f8eaa163ed237cfb8d777
This patch does two things:
* Add strict PHP 7 type hints to most code.
* Narrow the interface of the checkRefsNoReferences() method to not
require a ParserOptions object any more.
Change-Id: I91c6a2d9b76915d7677a3f735ee8e054c898fcc5
There was a call in the API that was *not* using normalizeKey(). Now
that the API is gone, we can inline this.
This patch also contains a bunch of cleanups that might already have been
resolved in the previous patches.
Change-Id: Id3767b5830268c8cfe9c10efabfa4a31e9dafeb8
Forked from Icd933fc983.
Bugs and unimplemented features are documented as TODOs in the parser test
fixtures.
Bug: T237241
Change-Id: I9427e025ea0bcf2fa24fd539a775429cc64767cc
This API was never used in Wikimedia production, and would have caused
performance problems. Removing the dead code will simplify our refactoring.
Bug: T238195
Change-Id: I7088f257ec034c0d089e0abdaa5a739910598300
I noticed a possible issue related to the $this->refSequence counter
in the patch Ida9612d. Some of these counters might get messed up, but
there was never a test that checked what will happen to the *next*
reference then.
I checked the test cases in this patch with a very old version of the
codebase.
Change-Id: If6e56f727dce5d0e5e38e048e602437597248a42
We realized the trim() calls are not needed. This does not leave much behind
in the existing refArg() method, except that it checks for unknown keys.
I tried a few strategies and ended up using the fairly new possibility to
have keys in list(), as well as using [] instead of list(). Both are
supported since PHP 7.1.
Change-Id: I569bfa14e68b64402519bd39022c197553881dde
We noticed the group="…" attribute was the only one that was not
trimmed. Does this mean it was possible to have two groups "a" and
" a"? It turns out: no. This was never possible because the parser
already trims all attributes before calling this code.
I tried to come up with the worst possible test case, but it succeeds,
even with very old versions of this codebase.
I suggest removing the extra trimming from this codebase and relying on
what the parser provides.
Note the content is special and *not* trimmed by default.
Change-Id: Idff015447d7156ba7b5c03a5c423f199a71349f2
These exist twice, once in the unit/ folder as a unit test, and
once in the parent folder as an integration test. This has confused
me several times already.
Change-Id: I147b8af8a7edba2582496468b4878faecc6d8110
Functional changes:
* hasGroup() will return false when a group exists, but is empty. This
is in line with what other methods like getGroups() already do.
Shouldn't have any effect on the existing code, but feels cleaner
and more consistent.
* getGroupRefs() won't fail any more when asked for an unknown group.
Tests:
* Add missing @covers for the constructor.
* Simplify test setup by always returning a spy. All tests need it
anyway.
* Cover 3 more methods.
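
The two functional changes can be illustrated with a small Python sketch. The class is hypothetical and only mimics the described behavior of hasGroup() and getGroupRefs(); it is not the real ReferenceStack.

```python
class ReferenceStackSketch:
    """Holds refs per group: group name -> {ref key: ref data}."""

    def __init__(self):
        self._refs = {}

    def has_group(self, group: str) -> bool:
        # False for unknown groups and for groups that exist but are empty.
        return bool(self._refs.get(group))

    def get_group_refs(self, group: str) -> dict:
        # An unknown group yields an empty result instead of failing.
        return self._refs.get(group, {})
```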
Change-Id: Ie93e9af6258b757d842b30b0b059344733aad434
That was annoying me. Since we're passing a bare list, alphabetical
order helps make the code and tests readable.
Change-Id: I6384094e429e0e2a6fa810fdc28ae0643a0ccf7c
Most of this state is used to manage interactions with other state,
and encapsulation allows us to hide data structures and access behind
self-explanatory function names.
The interface is still much wider than I'd like, but it can be improved in
future work.
There is one small behavior change in here: in the `follows` edge case
demonstrated by I3bdf26fd14, we prepend if the splice point cannot be
used because it has a non-numeric key. I believe this was the original
intention of the logic, and is how the numeric case behaves. I've verified
that when array_splice throws a warning about non-numeric key, it fails to
add anything to the original array, so the broken follows ref disappeared.
Bug: T237241
Change-Id: I091a0b71ee9aa78e841c2e328018e886a7217715
I realized especially the method name html() was wrong. It does not
return HTML. What it returns is still wikitext and must still be parsed.
It only applies some early steps of the parsing process, e.g. expanding
extension <tags>.
Change-Id: I2c403a77eef843940f34f0933e4bfe58e6200ce5
* This fixes the refArg() function. If there is nothing wrong with the
follow="…" attribute, it should not return null.
* However, *everything* is false if an unknown error (e.g. an unknown
attribute) occurs.
* A trivial check for `if ( $follow )` is fine because all keys are
guaranteed to not be the string "0".
Change-Id: Ia4e37781e01db1ee6615ffc30bb68e47023c6634
One of the test cases was duplicated, but a lot of the possible code
paths never had tests, including the happy code path!
I found this issue while trying to rework some of the more confusing
loops in this codebase. These changes are still part of this patch. All
loops still do the same as before, but are (I hope) more readable now.
Bug: T238187
Change-Id: I85baeadd9b149025a14c7522bcc4182339c66972
… and make the error message for bad dir="…" shorter and more to the
point.
Now I understand why the error reporting was not done when $text was
empty: the error was actually appended to $text, which messes with
everything else that also works with the $text variable! This even
includes the API. This error message was exposed via the API. That was
certainly a bug.
With this patch, all error checking for the dir="…" attribute is now
done way down, when rendering the <references> section.
Note this also fixes a bug where the dir="…" was *not* rendered when
previewing a section.
Change-Id: I4ab0cb510973ed879c606bfaa394aacc91129854
This fixes a whole bunch of inconsistencies:
* The dir attribute is now trimmed, as most others already are. This is
an actual user-facing change.
* The internal representation is now false in case the value was invalid,
not an empty string any more.
* Null means the attribute was not present. This is now always used,
even in the return values that are meant to represent an error state. No
existing behavior changes.
* The internal representation does not contain an HTML snippet any more,
but the raw value "ltr" or "rtl", or null. Note this might influence the
API, because the API actually exposes the internal representation.
However, we are pretty sure the API is not used anywhere. Even if it is,
exposing HTML code was most certainly an unwanted and unexpected effect
of the patch that introduced the dir attribute. This does make this a
bugfix, I would argue.
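
The resulting internal representation can be sketched like this in Python. The function name `normalize_dir` is hypothetical; the sketch only covers the trimming and the null / false / raw-value distinction listed above.

```python
def normalize_dir(value):
    """Map a raw dir attribute to None, False, "ltr" or "rtl"."""
    if value is None:
        # The attribute was not present at all.
        return None
    value = value.strip()  # the attribute is now trimmed
    if value in ("ltr", "rtl"):
        # Raw value only; no HTML snippet is stored any more.
        return value
    # Present but invalid.
    return False
```

Keeping "absent" (None) and "invalid" (False) apart lets the error check happen later, when the <references> section is rendered.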
Change-Id: Ic385d9ab36fa0545c374d3d63063028ae4e449d4
This patch intentionally does not touch any file names. Some of the
file names are a little weird now, e.g. \Cite\Cite. These can more
easily be renamed in later patches.
I used https://codesearch.wmflabs.org/search/?q=new%20Cite%5C( and it
looks like this code is not used anywhere else.
Change-Id: I5f93a224e9cacf45b7a0d68c216a78723364dd96
The use case we care about is this:
<ref extends="some_book"> </ref>
It doesn't make sense that this works, but the following doesn't:
<ref extends="some_book"></ref>
We decided that both need to behave the same.
For consistency this patch is applying the same change to all references,
no matter if they use the extends attribute or not. This is an actual
change and might make existing wikitext render differently. However, I
would like to argue that all wikitext that was using this was broken. The
effect of a <ref> </ref> with some whitespace is that the <references>
section at the end of the article will contain – well – an empty footnote.
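
A minimal Python sketch of the equivalence this patch aims for, assuming the check boils down to a whitespace test on the tag content. `is_blank_ref_content` is a hypothetical name, not a function in the codebase.

```python
def is_blank_ref_content(text: str) -> bool:
    """True if the content between <ref> and </ref> is empty or only whitespace."""
    return text.strip() == ""
```

With this check, `<ref extends="some_book"> </ref>` and `<ref extends="some_book"></ref>` fall into the same case.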
Bug: T237241
Change-Id: Iaee35583eabcb416b0a06849b89ebbfb0fb7fef9
Note this codebase appears to be dual-licensed. Some files mention MIT,
but extension.json and some other files mention GPL.
Since WMDE typically uses GPL, I will continue to mark the files we
created as such.
Change-Id: I126da10f7fb13a6d4c99e96e72d024b2e5ecee06
The main motivation here is to cover the fallback code that was moved
in I20c814d. At some point we might touch this code again.
Bug: T238194
Change-Id: I0ab8a34b09790f42b10376eb3730c3b3c4ef53d2