Commit graph

382 commits

Author SHA1 Message Date
jenkins-bot 6fc8ee7fec Merge "Get rid of "guarded <references>" terminology" 2023-12-14 14:25:57 +00:00
jenkins-bot 78b40a8c6b Merge "Extract validation to a separate class" 2023-12-14 14:18:40 +00:00
thiemowmde c794962df7 Use short fn() syntax in tests where it makes sense
We can use this syntax now. It was introduced in PHP 7.4.

Bug: T353269
Change-Id: I5404b33b654efb01171fa2b4ad3925170ffd0e56
2023-12-14 08:05:01 +00:00
jenkins-bot 5b4e869014 Merge "tests: Widen @covers annotations" 2023-12-14 07:57:46 +00:00
thiemowmde 12c7ad7504 Get rid of "guarded <references>" terminology
This patch only moves existing code around without changing any
behavior. What I basically did was merging the old "guardedReferences"
method into "references", and then splitting the resulting code in
other ways. Now we see a few other concepts emerging. But the idea
something would be "guarded" (how?) is gone.

The most critical detail in this patch are the new method names, and
how the code is split. The names should tell a story, and the methods
should do exactly what the name says. Suggestions?

Bug: T353266
Change-Id: I8b7921ce24487e9657e4193ea6a2e3e7d7b0b1c3
2023-12-14 08:44:40 +01:00
thiemowmde a6a0f66130 Extract validation to a separate class
This removes almost 200 lines from the main class.

This patch intentionally doesn't make any changes to the code but
only moves it around. Further improvements are for later patches.

Bug: T353269
Change-Id: Ic73f1b7458b3f7b7b89806a88a1111161e3cf094
2023-12-14 07:43:29 +00:00
jenkins-bot bf53249893 Merge "Move a bit of code out of Cite::guardedReferences" 2023-12-14 02:10:06 +00:00
Timo Tijhof 2ff327df53 tests: Widen @covers annotations
> We lose useful coverage and spend valuable time keeping these tags
> accurate through refactors (or worse, forget to do so).
>
> I am not disabling the "only track coverage of specified subject"
> benefits, nor am I claiming coverage in in classes outside the
> subject under test.
>
> Tracking tiny per-method details wastes time in keeping tags
> in sync during refactors, and time to realize (and fix) when people
> inevitably don't keep them in sync, and time lost in finding
> uncovered code to write tests for only to realize it was already
> covered but "not yet claimed".

https://gerrit.wikimedia.org/r/q/owner:Krinkle+is:merged+message:%2522Widen%2522

Change-Id: Iafa241210b81ba1cbfee74e3920fb044c86d09fc
2023-12-14 01:54:48 +00:00
thiemowmde 04208b5fd1 Remove PHPDocs that just repeat what the code already says
We removed a bunch of now redundant docs already, see e.g. Ie0692fa.

Change-Id: I55c62d935bdec8bedaebc2666fca3eb17309b0c7
2023-12-13 12:44:41 +01:00
jenkins-bot c3aa27f2a1 Merge "Avoid a few isset() in favor of more recent syntax" 2023-12-12 23:58:45 +00:00
thiemowmde 689bafdd7f Use upstream assertStatusError and such in tests
The main benefit is that these methods give good debug output in
case they fail.

Bug: T353266
Change-Id: I0423737240c35c18078863a7eb1d8e4779363973
2023-12-12 19:16:50 +01:00
thiemowmde 9425bb3248 Move a bit of code out of Cite::guardedReferences
The main benefit is that the two lines that set and reset
$this->inReferencesGroup are now next to each other. More can be
done in later patches.

Bug: T353266
Change-Id: Ib3f40c40e0b1854f8e5a32af600f28931fffdb8c
2023-12-12 18:06:58 +00:00
jenkins-bot 739e05e151 Merge "build: Update linter libs" 2023-12-12 14:45:22 +00:00
jenkins-bot cdc2bd2b96 Merge "Skip URL encoding in id="…" attributes that aren't URLs" 2023-12-12 14:34:07 +00:00
thiemowmde 89bd26fcf5 Skip URL encoding in id="…" attributes that aren't URLs
I played around with a few options (see patchset 1) but ended
introducing new terminology:

* "Backlink" describes the ↑ button down in the list of <references>
  that jumps back up into the article. The code was already using
  "backlink" in some places.
* "Backlink target" is the id="…" attribute up there, visible as the
  typical [1] in the article.
* I use "jump" to describe the idea that clicking the [1] jumps down
  to the full reference.
* "Jump target" is the id="…" down there in the list of <references>.
* "Jump link" is the same id, but encoded to be used as the href="…"
  attribute when clicking the [1].

I hope this makes sense. Suggestions welcome.

Another benefit is that "normalization" is really only normalization
now, not any URL and/or HTML encoding.

Bug: T298278
Change-Id: I5a64ac43aef895110b61df65b27f683b131886fb
2023-12-12 13:56:37 +00:00
WMDE-Fisch 2a02f5311d build: Update linter libs
* "eslint-config-wikimedia": "0.26.0"
* "grunt-eslint": "24.3.0"
* "grunt-stylelint": "0.19.0"
* "stylelint-config-wikimedia": "0.16.1"

Including auto fixes.

Change-Id: Iadacfc781a48675022144bb8c9489073d0bc19e3
2023-12-12 14:21:07 +01:00
thiemowmde fee8606db6 Avoid a few isset() in favor of more recent syntax
As well as replacing a few `=== null` comparisons with the new ??=
operator.

Bug: T353227
Change-Id: I5b273f452d1e46d37fc28861b54c4e1f19a7a65a
2023-12-12 12:13:42 +00:00
jenkins-bot 34798cce42 Merge "Change all tests to use overrideConfigValue" 2023-12-12 08:59:46 +00:00
jenkins-bot f7294f1b54 Merge "Parse error messages as late as possible" 2023-12-12 08:38:47 +00:00
thiemowmde 44ba7a89e2 Parse error messages as late as possible
This moves the actual parsing down to be done much later in the
process. This won't make any difference in production but makes it
easier to refactor the code further.

Note I tried to use a StatusValue object but couldn't because it
merges seemingly identical messages, while the plain array is fine
with containing duplicates. There is one parser test that covers
this. While we could change this it needs discussion and most
probably a PM decision.

Change-Id: I7390b688a33dace95753470a927bbe4de43ea03a
2023-12-11 18:28:35 +00:00
thiemowmde 6a18eac513 Fix regular expressions not being case-insensitive
The "parser marker" placeholders are case-sensitive, e.g. for a tag
that's written like <rEf> the placeholder will also say …-rEf-…. This
was really just a mistake.

The error is as old as this code is. Added in commit 75004e33 in
2009.

Note we shouldn't use /i at the end because the marker itself should
not be case-insensitive. Only the tag name.

Instead of adding more (slow) test cases I update two that are
exactly about this part of Cite (nested tags) anyway.

Bug: T64335
Change-Id: I44c7a42a0da682a1082952fd1af817bf7d45378c
2023-12-11 19:21:12 +01:00
thiemowmde 696c35f496 Change all tests to use overrideConfigValue
Two problems:

1. Manipulating globals directly affects all following tests. They
are not independent from each other. This problem can be seen in
CiteTest.

2. Some test cases in testValidateRef don't test what you think.
For example, the test for a conflicting "extends" + "follow" was not
failing because of the conflict but because "extends" was disabled
and disallowed.

Change-Id: Iaa4e1f3f3222155d59984e577cba3f0b8dec40c3
2023-12-11 12:17:15 +01:00
Umherirrender c9773965ca Use namespaced classes
Done automatically via script

Change-Id: I40d64a194ad420c75dfa85711c53e35586895929
2023-12-10 23:18:51 +01:00
thiemowmde 202c0d3636 Drop unused …_suffix and …_key_with_num messages
The three messages cite_reference_link_key_with_num,
cite_reference_link_suffix, and cite_references_link_suffix are not
used for anything.

According to CodeSearch:
https://codesearch.wmcloud.org/search/?i=1&q=cite_references?_link_(key|suffix)

According to GlobalSearch:
https://global-search.toolforge.org/?q=.&regex=1&namespaces=8&title=Cite.references?.link.(key|suffix).*
For comparison:
https://global-search.toolforge.org/?q=.&regex=1&namespaces=8&title=Cite.references?.link.prefix.*

They are not meant to be localized, as noted in qqq.json. As many
messages in Cite the idea is that individual wikis can customize the
generated HTML (!) via such messages. These particular ones apparently
have been introduced just because it's technically possible, but never
been used for anything. They exist since the very first commit from
2005: https://phabricator.wikimedia.org/rECITb714bf09

Note how these messages aren't even visible anywhere, except in the
browser's address bar as part of a #… fragment.

This obviously doesn't solve T321217 but helps minimizing the
surface.

Bug: T321217
Change-Id: Icfa82155e3b02df39bb6e924bc472f6edc565d5f
2023-12-08 09:26:05 +01:00
Subramanya Sastry 69529bdcf6 Sync up Cite repo with Parsoid
This now aligns with Parsoid commit 2f962cd9a66c9fd69664e3e8a2d79820cd6f1453

Change-Id: Ia93f8ced5c79e2ba49d40aafe6ea14d1691609b0
2023-12-07 18:46:23 +00:00
thiemowmde 0bae6eb224 Fix confusing wording of "invalid parameter in <ref>" message
This error message really always meant nothing but "there is an
unknown parameter in your <ref> tag". It's unnecessarily confusing
only for historical reasons. See T299280#9384546 for a long
explanation.

Bug: T299280
Change-Id: Ic224d5828f7b7ac0928c44f526c61654ccf3425e
2023-12-07 10:54:46 +01:00
jenkins-bot e73c7d61ca Merge "Correctly encode non-breaking spaces in reference names" 2023-12-06 18:00:43 +00:00
jenkins-bot e66c22beeb Merge "Update tests to match update to <gallery> output in core" 2023-12-06 16:39:06 +00:00
jenkins-bot cb3791b00f Merge "Temporarily disable test to allow us to make changes in core" 2023-12-06 16:17:50 +00:00
jenkins-bot f455dd3f79 Merge "Re-arrange code in preparation for T298278" 2023-12-05 14:53:42 +00:00
thiemowmde f9bb125e4c Correctly encode non-breaking spaces in reference names
Note how this currently behaves. The user input is
<ref name="…&nbsp;…">
But what we get in the end is
<li id="…&#160;…">
This implies that the &nbsp; is decoded and re-encoded with a
slightly different entity encoding. (Note that &nbsp; and &#160;
and &#xa0; are all the same character.)

Also note how there is only an underscore in the href="…", but the
non-breaking space is gone. This is identical to what happens in
links and headlines. Try for example [[a&nbsp;_a]]. Multiple
underscores, non-breaking spaces, and normal spaces will be
normalized. We just do the same in the id="…" attributes.

Note this fixes only one of the issues listed in T298278.

Bug: T298278
Change-Id: Ia01f2fdd3b3e9ee6aaa9da60ca3386dcd5d6b1a0
2023-12-05 07:58:38 +01:00
jenkins-bot 1e5cd295e0 Merge "Split off separate key normalization function" 2023-12-04 12:15:21 +00:00
thiemowmde 5f5e9ec9f0 Re-arrange code in preparation for T298278
This patch makes only sense together with I5a64ac4 where it is split
from. See I5a64ac4 for details.

The idea is that this patch just re-arranges the code without making
any changes to how the code behaves. This leaves a minimal change
behind that's much easier to revert, if needed.

Bug: T298278
Change-Id: Ie78313b7f3ac1ec7bce5ac7512e60a3bb011480a
2023-12-04 08:29:53 +01:00
thiemowmde 858fdcefd9 Split off separate key normalization function
This patch does two things:

1. The "normalization" function was never only doing normalization,
but also all the necessary HTML encoding. This is now more visible
and split into two separate functions.

2. To make this easier we change the order slightly. Because of this
the normalization step must now consider spaces. Before spaces have
been converted to underscores by escapeIdForLink.

The results are all the exact same as before.

This is split from I5a64ac4 to make that easier to review.

Bug: T298278
Change-Id: I9435a2ddaa21559e29587c58b7523103141467f7
2023-11-30 09:43:35 +01:00
gerritbot 3d34307f87 Update StaticUserOptionsLookup's FQN
User-options related classes are being moved to
the MediaWiki\User\Options namespace in MediaWiki Core;
reflect that change here.

Bug: T352284
Depends-On: I42653491c19dde5de99e0661770e2c81df5d7e84
Change-Id: I22ff2effcf9b7f2162f5d57608d8ec3651b48dd7
2023-11-29 17:55:07 +00:00
Subramanya Sastry f267635b48 Update tests to match update to <gallery> output in core
Depends-On: I5039c7ef9e07199c256fd568b4f94714e5831d17
Change-Id: I69776da432eeca134785329d424d310fb506bce6
2023-11-27 18:09:03 -06:00
Subramanya Sastry 4929e015d1 Temporarily disable test to allow us to make changes in core
* Needed by change I5039c7ef9e07199c256fd568b4f94714e5831d17

Change-Id: Ieeb6b98afc74595a928bd141889486acfc9eb346
2023-11-27 18:07:35 -06:00
Daimona Eaytoy c71109b97b Update tests for PHPUnit 9.6
- Avoid withConsecutive()

Bug: T342110
Change-Id: Icb951b461c5e6ea1ced2ab8a183c53265d4a2b4f
2023-11-22 00:15:20 +01:00
Fomafix 96fad3ea99 tests: Use $this->getServiceContainer()
Use
	$this->getServiceContainer()
instead of
	MediaWikiServices::getInstance()
in tests.

Change-Id: I5e56524b85ba6e34cecffba2f24fb44b3f66ec8f
2023-10-26 19:59:14 +00:00
gerritbot 198ae5b21c Replace some moved Title class uses, now MediaWiki\Title\Title
Bug: T321681
Change-Id: Idde5511e075fcec40cad12c7369abec3ee4d096c
2023-08-19 04:13:55 +00:00
Daimona Eaytoy e8da022968 Mark CiteDbTest as using the page table
The test creates the Cite-tracking-category-cite-error page.

Change-Id: I20b2007b943afd69bef8fcce18f382e64d752c57
2023-08-11 17:29:33 +02:00
jenkins-bot 0fc08a2ac7 Merge "Replace extremely slow parser test with fast unit tests" 2023-07-28 01:05:39 +00:00
thiemowmde 5aa6cb0c7b Replace extremely slow parser test with fast unit tests
This parser test is a bit obscure, in my opinion. We added it in
I8c4de96 to make sure we don't get thousand separators in most
places.

We continued reworking the code since then. By now it's effectively
impossible to "accidentally" get thousand separators. The
problematic methods from the Language class are not even accessible
any more from this code.

To make the tests more robust we now use createNoOpMock (done via
the previous patch) where it matters, specifically for all Language
and Parser mocks. This proves the problematic Language methods are
never called.

Bug: T253743
Bug: T238187
Change-Id: I9bfe1f4decfaf699996da63e19473c2c0d581d9d
2023-07-28 00:32:38 +00:00
jenkins-bot 51308066fa Merge "Remove unused private method from CiteDbTest" 2023-07-27 19:02:16 +00:00
jenkins-bot ebd355c067 Merge "Replace all Language and Parser mocks with no-op mocks" 2023-07-27 18:59:53 +00:00
thiemowmde 2aa421a021 Replace all Language and Parser mocks with no-op mocks
Both Language and Parser are extremely complex classes with hundreds
of public methods. We really want to make sure we are not depending
on anything unexpected from these classes. If calls are made into
these classes we want to know exactly what is called.

Doing this also showed that some mocked methods are not even needed.

Change-Id: Icdfff6c07be78a47bf7cadb1813a72581a51272a
2023-07-27 10:00:28 +00:00
thiemowmde e750cc2ed2 Remove unused "HTML message" cite_references_no_link
I believe there is no reason to keep this. The only reason might be
a known wiki that uses this as a feature. But such a wiki doesn't
exist:
https://global-search.toolforge.org/?q=.&regex=1&namespaces=8&title=Cite.*no.*link

Bug: T321217
Change-Id: I128d4383da48f9bdda92f03b212ef3997b3a4802
2023-07-27 11:55:13 +02:00
thiemowmde d6d447e861 Remove unused private method from CiteDbTest
Change-Id: I1c827fd01b65910af24ef1c7b9379d9be8cf6880
2023-07-27 09:52:52 +00:00
jenkins-bot 1732720608 Merge "Remove (revert) expensive parsing of 1-character message" 2023-07-25 20:29:47 +00:00
thiemowmde 08814c8e38 Remove (revert) expensive parsing of 1-character message
This reverts a very tiny part of Ib3fdc89 from 2 weeks ago. The
reasons are explained in Ib3fdc89. Short version:
* The ->parse() calls have drastic performance implications.
* Allowing wikitext and HTML in this message also makes T321217
  worse.

The new message "cite_reference_backlink_symbol" is kept and still
used in the UI. Just not in these two messages any more. This is a
minor redundancy we want to get rid of at some point. But it's not
critical for the moment. This will be done as part of T321217.

Nothing will break on the wikis. Some wikis have customizations for
"cite_references_link_one" and "cite_references_link_many" in place.
This will continue to work as before Ib3fdc89.

Bug: T339973
Change-Id: I933771e3ad67cd530bcf5ee8469cef35ea1070d2
2023-07-21 14:05:31 +02:00