The idea is to make the code less ambiguous and easier to predict.
We passed the same information around two times in a few places.
Change-Id: I39c7a2962bb70bbe40074986e63b1051d0766ea2
Page property is removed immediately since $wgCiteBookReferencing has
never been enabled in production.
Bug: T239989
Change-Id: I6252fcf1485994244dca40470cc5955e8d4f6917
Also simplify the @covers annotations while we are here. The class
is really simple. There isn't much that can be covered acidentally.
Change-Id: I105f4ea6d6beb119d1557a32b691e9eda1b8085c
PHP classes and test are somewhat copies from the Popups codebase.
Some refactoring was applied. More could be done. Not to sure if
this should happen more in follow ups though.
Could also reduce the complexity of checks on the JS side. Most of
these things can only change on page load. The only dynamic part
left is the anon user setting managed by the Popups extension.
Note, that I needed to add a new PHP config for here although the
other still exists and is needed in the Popups extension. This
will change, when the user settings code also moves.
I guess it's okay for now though. Both settings default to true
and are not overridden in the config repos.
Also needed to add the Gadget extension as phan dependency.
Bug: T362771
Depends-On: Ia028c41f8aaa1c522dfc7c372e1ce51e40933a5e
Change-Id: Ie6e8bc706235724494036c7f0d873f5c996c46e6
The ::setPageProperty() method has some tricky corner cases where the
type of the value determines whether or not the page property will be
sorted. Since sort order for the BOOK_REF_PROPERTY is irrelevant,
use ::setUnsortedPageProperty() to communicate this clearly to the
reader.
Depends-On: Ia94c192c429d0482c58467bed787fd2e0aca052f
Followup-To: Ibfd84b52057baa8e249d321ec9df612efd6a29a6
Change-Id: I399f4895ec8720ff2927c5cd5a09c7af4664ee46
The message was part of the original patch that introduced the group
feature in 2009, see https://phabricator.wikimedia.org/rECIT75004e33.
Notice how there was never a test scenario for this message. A test
was added in 2020 via I07738cc.
The message appears only in a rare edge-case when a group is entirely
unused in the text, and only when the group is not empty. The shortest
possible example is:
<references group=g>
<ref group=g name=a>a</ref>
</references>
Just adding something unrelated like `<ref group=g>x</ref>` to the
text changes the error message. Now the group is "used". But this
notion is confusing to begin with. References can be part of a group,
and we can use references, but we can't use groups as if they are a
separate entity.
A better error message already exists.
Notice how this special error message doesn't appear anywhere in the
Parsoid code path. That was already using the other, more generic
error message.
Bug: T269531
Change-Id: I63f663d76e45e6c3d664f145d9a564ee00ff53cd
I always found the name a little ambiguous. The fact that it outputs
an actual HTML list and not just some "references" – whatever that
means – is relevant, in my opinion.
Change-Id: I0d169455c8d2b42d62da4dccb8376c09fb6902bc
These tests pass today because Parsoid is providing an
alternative implementation of Cite, but that means this
test case isn't actually testing the code in this repo.
Bug: T354215
Change-Id: I42521026bab36035ae5eded7c05716234a5a29ea
This commit also moves certain parser tests involving <ref> from
the Parsoid repo to citeParserTests.txt in this repo.
Bug: T354215
Change-Id: Ie5b211d2af01a56684473723c68a9ab2775542e3
Such a message shouldn't exist, and doesn't:
https://global-search.toolforge.org/?q=.®ex=1&namespaces=8&title=Cite+link+label+group-
Additional notes:
* Rename the method to make it more obvious that it's not a cheap
getter, but doing something slightly more expensive.
* Use more appropriate array_key_exists to check if a cache entry
already exists.
* Also add a bit more documentation.
Bug: T297430
Bug: T353227
Change-Id: Ia5827bbf6fd700b87a749aac17320796428f0688
This encapsulation gives us field name, type validation and code
documentation.
This patch only affects ReferenceStack and continues to return
approximately the same array outputs to callers. Some additional
information is included and the placeholder column has a new name.
Bug: T353451
Change-Id: I405fe7ac241f6991fd4c526bfbb58fbc34f2e147
The previous patch deprecated the last conditional depending on magic
meanings of 0 and -1, so now we're free to let "count" take on a more
natural meaning: the number of times a footnote mark appears in
article text.
Includes a small hack to avoid changing parser output, by
artificially decrementing the count by one during rendering. The
hack can be removed and test output updated in a separate patch.
Bug: T353227
Change-Id: I6f76c50357b274ff97321533e52f435798048268
Encapsulate all information about a ref inside of the internal
structure, rather than relying on the container to be organized by
group.
Bug: T353451
Change-Id: I4c91e8089638b7655bf120402a4a5fcbd1b35452
This is another improvement after I7390b68. Status objects are made
to keep track of multiple errors. The only difference is: The merge
method skips duplicates when the message and all parameters are
identical. This causes a minor user-facing change. One of the
shortest possible examples is:
<references>
<ref />
<ref />
</references>
This showed two identical, indistinguishable error messages before,
but will only show one now. We argue this is fine. The duplicates
are confusing and of (almost) no value to the user. In case the
information is relevant the correct solution is to make the error
messages distinguishable, or introduce a message like "multiple
<ref> tags defined in <references> have the same error". This is
something for a later patch, if needed.
Bug: T353266
Change-Id: I444105462ed24d5ba37b057622b4dc847b40f8d8
Same as Icfa8215 where we removed the …_suffix messages.
This patch is not blocked on anything according to CodeSearch:
https://codesearch.wmcloud.org/search/?q=cite_references%3F_link_prefix
According to GlobalSearch there are 2 usages we need to talk about:
https://global-search.toolforge.org/?q=.®ex=1&namespaces=8&title=Cite.references%3F.link.prefix.*
zh.wiktionary replaces "cite_ref-" with "_ref-", and "cite_note-"
with "_note-", i.e. they did nothing but remove the word "cite". This
happened in 2006, with no explanation.
ka.wikibooks and ka.wikiquote replace "cite_note-" with "_შენიშვნა-",
which translates back to "_note-". One user did this in 2007,
16 seconds apart.
It appears like both are attempts to localize what can be localized,
no matter if it's really necessary or not.
https://zh.wiktionary.org/wiki/Special:Contributions/Shibo77?offset=20060510https://ka.wikiquote.org/wiki/Special:Contributions/Trulala?offset=20070219
Note how one user experimented with an "a" in some of the edits to
see what effect the change might have, to imediatelly revert it.
The modifications don't really have an effect on anything, except on
the anchors in the resulting <a href="#_ref-5"> and <sup id="_ref-5">
HTML. It might also be briefly visible in the browser's address bar
when such a link is clicked. We can only assume the two users did this
to make the URL appear shorter (?). A discussion apparently never
happened. Bot users are inactive.
Both pieces of HTML are generated in the Cite code. Removing the
messages will change all places the same time. All links will
continue to work. The only possible effect is that hard-coded
weblinks to an individual reference will link to the top of the
article instead. But:
a) This is extremely unlikely to happen. There is no reason to link
to a reference from outside of the article.
b) Such links are not guaranteed to work anyway as they can break
for a multitude of other reasons, e.g. the <ref> being renamed,
removed, or replaced.
c) Even if such a link breaks, it still links to the correct article.
There is also no on-wiki code on zh.wiktionary that would do anything
with the shortened prefix:
https://zh.wiktionary.org/w/index.php?search=insource%3A%2F_%28ref%7Cnote%29-%2F&title=Special%3A%E6%90%9C%E7%B4%A2&profile=advanced&fulltext=1&ns2=1&ns4=1&ns8=1&ns10=1&ns12=1&ns828=1&ns2300=1
I argue this is safe to remove, even without contacting the mentioned
communities first.
Bug: T321217
Change-Id: I160a119710dc35679dbdc2f39ddf453dbd5a5dfa
This fixes a minor issue introduced in I294b59f. Two identical
dir="…" with different capitalizations should not be reported as an
error.
Turns out the implementation in the Cite extension doesn't care
about this capitalization at all. That's why I suggest to do the
normalization as early as possible. This is slightly different in
the Parsoid implementation.
Bug: T202593
Change-Id: I96b4a281d6020d61d1f36ec027cf833bbb244f03
Check out how this gets rid of so many "to do" as well as
"deprecated" comments.
Next qustion: The elements in the stack become more and more
complicated. It's probably worth converting them from arrays into
first-class objects. But this is for another patch.
Bug: T353266
Change-Id: If14acd1070617ca8c4d15be6b1759bd47ead4926
For example, use convenient upstream methods, and generally make the
test setup a bit more readable.
Bug: T353227
Change-Id: Ifab71041fcc3f804315793ca7b783f84829c7a0f
Same arguments as in Iafa2412. The one reason to use more detailled
per-method @covers annotations is to avoid "accidental coverage"
where code is marked as being covered by tests that don't assert
anything that would be meaningful for this code. This is especially a
problem with older, bigger classes with lots of side effects.
But all the new classes we introduced over the years are small, with
predictable, local effects.
That's also why we keep the more detailled @covers annotations for
the original Cite class.
Bug: T353227
Bug: T353269
Change-Id: I69850f4d740d8ad5a7c2368b9068dc91e47cc797
I wanted to make this a unit test but it turns out the
Sanitizer::safeEncodeAttribute() calls currently make this
impossible.
Bug: T353269
Change-Id: I5266e7b8b67db1c812dc9e4675d0c079ab1f9a40
This patch only moves existing code around without changing any
behavior. What I basically did was merging the old "guardedReferences"
method into "references", and then splitting the resulting code in
other ways. Now we see a few other concepts emerging. But the idea
something would be "guarded" (how?) is gone.
The most critical detail in this patch are the new method names, and
how the code is split. The names should tell a story, and the methods
should do exactly what the name says. Suggestions?
Bug: T353266
Change-Id: I8b7921ce24487e9657e4193ea6a2e3e7d7b0b1c3
This removes almost 200 lines from the main class.
This patch intentionally doesn't make any changes to the code but
only moves it around. Further improvements are for later patches.
Bug: T353269
Change-Id: Ic73f1b7458b3f7b7b89806a88a1111161e3cf094
The main benefit is that the two lines that set and reset
$this->inReferencesGroup are now next to each other. More can be
done in later patches.
Bug: T353266
Change-Id: Ib3f40c40e0b1854f8e5a32af600f28931fffdb8c
This moves the actual parsing down to be done much later in the
process. This won't make any difference in production but makes it
easier to refactor the code further.
Note I tried to use a StatusValue object but couldn't because it
merges seemingly identical messages, while the plain array is fine
with containing duplicates. There is one parser test that covers
this. While we could change this it needs discussion and most
probably a PM decision.
Change-Id: I7390b688a33dace95753470a927bbe4de43ea03a
Two problems:
1. Manipulating globals directly affects all following tests. They
are not independent from each other. This problem can be seen in
CiteTest.
2. Some test cases in testValidateRef don't test what you think.
For example, the test for a conflicting "extends" + "follow" was not
failing because of the conflict but because "extends" was disabled
and disallowed.
Change-Id: Iaa4e1f3f3222155d59984e577cba3f0b8dec40c3
This patch makes only sense together with I5a64ac4 where it is split
from. See I5a64ac4 for details.
The idea is that this patch just re-arranges the code without making
any changes to how the code behaves. This leaves a minimal change
behind that's much easier to revert, if needed.
Bug: T298278
Change-Id: Ie78313b7f3ac1ec7bce5ac7512e60a3bb011480a
This patch does two things:
1. The "normalization" function was never only doing normalization,
but also all the necessary HTML encoding. This is now more visible
and split into two separate functions.
2. To make this easier we change the order slightly. Because of this
the normalization step must now consider spaces. Before spaces have
been converted to underscores by escapeIdForLink.
The results are all the exact same as before.
This is split from I5a64ac4 to make that easier to review.
Bug: T298278
Change-Id: I9435a2ddaa21559e29587c58b7523103141467f7
This parser test is a bit obscure, in my opinion. We added it in
I8c4de96 to make sure we don't get thousand separators in most
places.
We continued reworking the code since then. By now it's effectively
impossible to "accidentally" get thousand separators. The
problematic methods from the Language class are not even accessible
any more from this code.
To make the tests more robust we now use createNoOpMock (done via
the previous patch) where it matters, specifically for all Language
and Parser mocks. This proves the problematic Language methods are
never called.
Bug: T253743
Bug: T238187
Change-Id: I9bfe1f4decfaf699996da63e19473c2c0d581d9d
Both Language and Parser are extremely complex classes with hundreds
of public methods. We really want to make sure we are not depending
on anything unexpected from these classes. If calls are made into
these classes we want to know exactly what is called.
Doing this also showed that some mocked methods are not even needed.
Change-Id: Icdfff6c07be78a47bf7cadb1813a72581a51272a
This is a mistake that exists in this codebase for who knows how
long.
Cite mis-uses the messaging system a lot for internal things we still
want to customize somehow, but are not labels that will ever be shown
on the screen. The prefix/suffix messages in this patch are meant to be
part of the HTML in id="…" attributes. Prefix/suffix must be a static
plain text strings. Using e.g. {{GENDER}} or {{PLURAL}} in these
messages is not even possible because there is no $1 parameter to use.
Note how all other similar messages already use ->plain().
A few wikis override these messages, but stick to the plain-text
convention, as they should:
https://global-search.toolforge.org/?q=.®ex=1&namespaces=8&title=Cite.*reference.*fix
This will continue to work.
This has minor performance implications. Fetching these messages is
faster if we can skip transformations.
Bug: T321217
Change-Id: I7969c255fe4ce897e904897081da5f52678721aa