mediawiki-extensions-Cite

mirror of https://gerrit.wikimedia.org/r/mediawiki/extensions/Cite synced 2024-12-04 19:38:16 +00:00

Author	SHA1	Message	Date
C. Scott Ananian	129b222e97	Ensure CiteParsoidTest registers our Cite implementation These tests pass today because Parsoid is providing an alternative implementation of Cite, but that means this test case isn't actually testing the code in this repo. Bug: T354215 Change-Id: I42521026bab36035ae5eded7c05716234a5a29ea	2024-01-24 20:09:36 +00:00
C. Scott Ananian	234da84418	Hook up Parsoid implementation of Cite This commit also moves certain parser tests involving <ref> from the Parsoid repo to citeParserTests.txt in this repo. Bug: T354215 Change-Id: Ie5b211d2af01a56684473723c68a9ab2775542e3	2024-01-19 11:57:11 -05:00
thiemowmde	9f6dd63ef4	Don't search for [[MediaWiki:cite_link_label_group-]] Such a message shouldn't exist, and doesn't: https://global-search.toolforge.org/?q=.&regex=1&namespaces=8&title=Cite+link+label+group- Additional notes: * Rename the method to make it more obvious that it's not a cheap getter, but doing something slightly more expensive. * Use more appropriate array_key_exists to check if a cache entry already exists. * Also add a bit more documentation. Bug: T297430 Bug: T353227 Change-Id: Ia5827bbf6fd700b87a749aac17320796428f0688	2024-01-09 17:00:07 +01:00
Adam Wight	f148c65078	Encapsulate ref: pushRef returns an object This patch affects a few methods which use the output of pushRef. Bug: T353451 Change-Id: I10b3fe89406c11cdaede92f18a4b96586ecaf5a0	2024-01-09 10:18:57 +01:00
Adam Wight	262fbe24eb	Encapsulate ref object: limited to ReferenceStack This encapsulation gives us field name, type validation and code documentation. This patch only affects ReferenceStack and continues to return approximately the same array outputs to callers. Some additional information is included and the placeholder column has a new name. Bug: T353451 Change-Id: I405fe7ac241f6991fd4c526bfbb58fbc34f2e147	2024-01-09 09:59:16 +01:00
Adam Wight	1434dc5ca6	Switch to a 1-based "count" The previous patch deprecated the last conditional depending on magic meanings of 0 and -1, so now we're free to let "count" take on a more natural meaning: the number of times a footnote mark appears in article text. Includes a small hack to avoid changing parser output, by artificially decrementing the count by one during rendering. The hack can be removed and test output updated in a separate patch. Bug: T353227 Change-Id: I6f76c50357b274ff97321533e52f435798048268	2024-01-08 11:45:36 +01:00
jenkins-bot	0f4c90cc54	Merge "Store group in ref items"	2024-01-05 11:53:17 +00:00
Adam Wight	fd648aec98	Store group in ref items Encapsulate all information about a ref inside of the internal structure, rather than relying on the container to be organized by group. Bug: T353451 Change-Id: I4c91e8089638b7655bf120402a4a5fcbd1b35452	2024-01-05 11:22:12 +00:00
thiemowmde	b01b420199	Track errors in a status object instead of an array This is another improvement after I7390b68. Status objects are made to keep track of multiple errors. The only difference is: The merge method skips duplicates when the message and all parameters are identical. This causes a minor user-facing change. One of the shortest possible examples is: <references> <ref /> <ref /> </references> This showed two identical, indistinguishable error messages before, but will only show one now. We argue this is fine. The duplicates are confusing and of (almost) no value to the user. In case the information is relevant the correct solution is to make the error messages distinguishable, or introduce a message like "multiple <ref> tags defined in <references> have the same error". This is something for a later patch, if needed. Bug: T353266 Change-Id: I444105462ed24d5ba37b057622b4dc847b40f8d8	2024-01-05 10:49:08 +01:00
thiemowmde	ddda536792	Drop unused cite_reference(s)_link_prefix messages Same as Icfa8215 where we removed the …_suffix messages. This patch is not blocked on anything according to CodeSearch: https://codesearch.wmcloud.org/search/?q=cite_references%3F_link_prefix According to GlobalSearch there are 2 usages we need to talk about: https://global-search.toolforge.org/?q=.&regex=1&namespaces=8&title=Cite.references%3F.link.prefix.* zh.wiktionary replaces "cite_ref-" with "_ref-", and "cite_note-" with "_note-", i.e. they did nothing but remove the word "cite". This happened in 2006, with no explanation. ka.wikibooks and ka.wikiquote replace "cite_note-" with "_შენიშვნა-", which translates back to "_note-". One user did this in 2007, 16 seconds apart. It appears like both are attempts to localize what can be localized, no matter if it's really necessary or not. https://zh.wiktionary.org/wiki/Special:Contributions/Shibo77?offset=20060510 https://ka.wikiquote.org/wiki/Special:Contributions/Trulala?offset=20070219 Note how one user experimented with an "a" in some of the edits to see what effect the change might have, only to immediately revert it. The modifications don't really have an effect on anything, except on the anchors in the resulting <a href="#_ref-5"> and <sup id="_ref-5"> HTML. It might also be briefly visible in the browser's address bar when such a link is clicked. We can only assume the two users did this to make the URL appear shorter (?). A discussion apparently never happened. Both users are inactive. Both pieces of HTML are generated in the Cite code. Removing the messages will change all places at the same time. All links will continue to work. The only possible effect is that hard-coded weblinks to an individual reference will link to the top of the article instead. But: a) This is extremely unlikely to happen. There is no reason to link to a reference from outside of the article. b) Such links are not guaranteed to work anyway as they can break for a multitude of other reasons, e.g. the <ref> being renamed, removed, or replaced. c) Even if such a link breaks, it still links to the correct article. There is also no on-wiki code on zh.wiktionary that would do anything with the shortened prefix: https://zh.wiktionary.org/w/index.php?search=insource%3A%2F_%28ref%7Cnote%29-%2F&title=Special%3A%E6%90%9C%E7%B4%A2&profile=advanced&fulltext=1&ns2=1&ns4=1&ns8=1&ns10=1&ns12=1&ns828=1&ns2300=1 I argue this is safe to remove, even without contacting the mentioned communities first. Bug: T321217 Change-Id: I160a119710dc35679dbdc2f39ddf453dbd5a5dfa	2024-01-04 13:17:42 +01:00
thiemowmde	ca3203699c	Capitalized dir="RTL" should not trigger any error This fixes a minor issue introduced in I294b59f. Two identical dir="…" with different capitalizations should not be reported as an error. Turns out the implementation in the Cite extension doesn't care about this capitalization at all. That's why I suggest to do the normalization as early as possible. This is slightly different in the Parsoid implementation. Bug: T202593 Change-Id: I96b4a281d6020d61d1f36ec027cf833bbb244f03	2024-01-03 16:30:16 +00:00
Adam Wight	d2b92c5253	Explicit test fixture field names Bug: T353451 Change-Id: I8a308dd2785939da52a698cf5e63bce4bc228b77	2023-12-22 23:52:22 +01:00
Adam Wight	5d1335e279	Explicit parameter names for all test fixtures This is much more readable. Patch changes nothing. Bug: T353451 Change-Id: I72b58881a7329dbe98659553b84e53896ccafc2b	2023-12-21 20:59:25 +01:00
jenkins-bot	9b87bc717d	Merge "Various cleanups to PHPUnit test mock setup"	2023-12-18 12:36:11 +00:00
thiemowmde	742a9ffbf5	Track warnings separately in ReferenceStack Check out how this gets rid of so many "to do" as well as "deprecated" comments. Next question: The elements in the stack become more and more complicated. It's probably worth converting them from arrays into first-class objects. But this is for another patch. Bug: T353266 Change-Id: If14acd1070617ca8c4d15be6b1759bd47ead4926	2023-12-15 16:41:04 +01:00
xiplus	f7a181ed42	Give a different error from too_many_keys when 'follow' attribute conflicts Add message "cite_error_ref_follow_conflicts" for tags with conflicting parameters. Bug: T299280 Change-Id: Ie64f4ab4831966f66f812ea67cc244718f818afb	2023-12-15 15:23:53 +01:00
thiemowmde	9304e24551	Various cleanups to PHPUnit test mock setup For example, use convenient upstream methods, and generally make the test setup a bit more readable. Bug: T353227 Change-Id: Ifab71041fcc3f804315793ca7b783f84829c7a0f	2023-12-15 11:45:35 +00:00
thiemowmde	4377f0923d	More simple and consistent @covers and @license tags Same arguments as in Iafa2412. The one reason to use more detailed per-method @covers annotations is to avoid "accidental coverage" where code is marked as being covered by tests that don't assert anything that would be meaningful for this code. This is especially a problem with older, bigger classes with lots of side effects. But all the new classes we introduced over the years are small, with predictable, local effects. That's also why we keep the more detailed @covers annotations for the original Cite class. Bug: T353227 Bug: T353269 Change-Id: I69850f4d740d8ad5a7c2368b9068dc91e47cc797	2023-12-15 12:12:16 +01:00
thiemowmde	d0d5fbbee6	Add temporary ErrorReporter::firstError helper function I hope this makes other refactorings a little easier. Bug: T353266 Change-Id: Ib574d4d54ba2c8bc1310822539336ad71c4309ef	2023-12-14 17:16:49 +01:00
thiemowmde	01dcfbac47	Move Validator tests to a separate class I wanted to make this a unit test but it turns out the Sanitizer::safeEncodeAttribute() calls currently make this impossible. Bug: T353269 Change-Id: I5266e7b8b67db1c812dc9e4675d0c079ab1f9a40	2023-12-14 15:51:26 +00:00
jenkins-bot	6fc8ee7fec	Merge "Get rid of "guarded <references>" terminology"	2023-12-14 14:25:57 +00:00
jenkins-bot	78b40a8c6b	Merge "Extract validation to a separate class"	2023-12-14 14:18:40 +00:00
thiemowmde	c794962df7	Use short fn() syntax in tests where it makes sense We can use this syntax now. It was introduced in PHP 7.4. Bug: T353269 Change-Id: I5404b33b654efb01171fa2b4ad3925170ffd0e56	2023-12-14 08:05:01 +00:00
thiemowmde	12c7ad7504	Get rid of "guarded <references>" terminology This patch only moves existing code around without changing any behavior. What I basically did was merging the old "guardedReferences" method into "references", and then splitting the resulting code in other ways. Now we see a few other concepts emerging. But the idea something would be "guarded" (how?) is gone. The most critical detail in this patch are the new method names, and how the code is split. The names should tell a story, and the methods should do exactly what the name says. Suggestions? Bug: T353266 Change-Id: I8b7921ce24487e9657e4193ea6a2e3e7d7b0b1c3	2023-12-14 08:44:40 +01:00
thiemowmde	a6a0f66130	Extract validation to a separate class This removes almost 200 lines from the main class. This patch intentionally doesn't make any changes to the code but only moves it around. Further improvements are for later patches. Bug: T353269 Change-Id: Ic73f1b7458b3f7b7b89806a88a1111161e3cf094	2023-12-14 07:43:29 +00:00
jenkins-bot	bf53249893	Merge "Move a bit of code out of Cite::guardedReferences"	2023-12-14 02:10:06 +00:00
thiemowmde	689bafdd7f	Use upstream assertStatusError and such in tests The main benefit is that these methods give good debug output in case they fail. Bug: T353266 Change-Id: I0423737240c35c18078863a7eb1d8e4779363973	2023-12-12 19:16:50 +01:00
thiemowmde	9425bb3248	Move a bit of code out of Cite::guardedReferences The main benefit is that the two lines that set and reset $this->inReferencesGroup are now next to each other. More can be done in later patches. Bug: T353266 Change-Id: Ib3f40c40e0b1854f8e5a32af600f28931fffdb8c	2023-12-12 18:06:58 +00:00
jenkins-bot	34798cce42	Merge "Change all tests to use overrideConfigValue"	2023-12-12 08:59:46 +00:00
thiemowmde	44ba7a89e2	Parse error messages as late as possible This moves the actual parsing down to be done much later in the process. This won't make any difference in production but makes it easier to refactor the code further. Note I tried to use a StatusValue object but couldn't because it merges seemingly identical messages, while the plain array is fine with containing duplicates. There is one parser test that covers this. While we could change this it needs discussion and most probably a PM decision. Change-Id: I7390b688a33dace95753470a927bbe4de43ea03a	2023-12-11 18:28:35 +00:00
thiemowmde	696c35f496	Change all tests to use overrideConfigValue Two problems: 1. Manipulating globals directly affects all following tests. They are not independent from each other. This problem can be seen in CiteTest. 2. Some test cases in testValidateRef don't test what you think. For example, the test for a conflicting "extends" + "follow" was not failing because of the conflict but because "extends" was disabled and disallowed. Change-Id: Iaa4e1f3f3222155d59984e577cba3f0b8dec40c3	2023-12-11 12:17:15 +01:00
Umherirrender	c9773965ca	Use namespaced classes Done automatically via script Change-Id: I40d64a194ad420c75dfa85711c53e35586895929	2023-12-10 23:18:51 +01:00
thiemowmde	202c0d3636	Drop unused …_suffix and …_key_with_num messages The three messages cite_reference_link_key_with_num, cite_reference_link_suffix, and cite_references_link_suffix are not used for anything. According to CodeSearch: https://codesearch.wmcloud.org/search/?i=1&q=cite_references?_link_(key\|suffix) According to GlobalSearch: https://global-search.toolforge.org/?q=.&regex=1&namespaces=8&title=Cite.references?.link.(key\|suffix).* For comparison: https://global-search.toolforge.org/?q=.&regex=1&namespaces=8&title=Cite.references?.link.prefix.* They are not meant to be localized, as noted in qqq.json. As with many messages in Cite, the idea is that individual wikis can customize the generated HTML (!) via such messages. These particular ones apparently have been introduced just because it's technically possible, but have never been used for anything. They have existed since the very first commit from 2005: https://phabricator.wikimedia.org/rECITb714bf09 Note how these messages aren't even visible anywhere, except in the browser's address bar as part of a #… fragment. This obviously doesn't solve T321217 but helps minimize the surface. Bug: T321217 Change-Id: Icfa82155e3b02df39bb6e924bc472f6edc565d5f	2023-12-08 09:26:05 +01:00
thiemowmde	5f5e9ec9f0	Re-arrange code in preparation for T298278 This patch makes only sense together with I5a64ac4 where it is split from. See I5a64ac4 for details. The idea is that this patch just re-arranges the code without making any changes to how the code behaves. This leaves a minimal change behind that's much easier to revert, if needed. Bug: T298278 Change-Id: Ie78313b7f3ac1ec7bce5ac7512e60a3bb011480a	2023-12-04 08:29:53 +01:00
thiemowmde	858fdcefd9	Split off separate key normalization function This patch does two things: 1. The "normalization" function was never only doing normalization, but also all the necessary HTML encoding. This is now more visible and split into two separate functions. 2. To make this easier we change the order slightly. Because of this the normalization step must now consider spaces. Previously, spaces were converted to underscores by escapeIdForLink. The results are all exactly the same as before. This is split from I5a64ac4 to make that easier to review. Bug: T298278 Change-Id: I9435a2ddaa21559e29587c58b7523103141467f7	2023-11-30 09:43:35 +01:00
thiemowmde	5aa6cb0c7b	Replace extremely slow parser test with fast unit tests This parser test is a bit obscure, in my opinion. We added it in I8c4de96 to make sure we don't get thousand separators in most places. We continued reworking the code since then. By now it's effectively impossible to "accidentally" get thousand separators. The problematic methods from the Language class are not even accessible any more from this code. To make the tests more robust we now use createNoOpMock (done via the previous patch) where it matters, specifically for all Language and Parser mocks. This proves the problematic Language methods are never called. Bug: T253743 Bug: T238187 Change-Id: I9bfe1f4decfaf699996da63e19473c2c0d581d9d	2023-07-28 00:32:38 +00:00
thiemowmde	2aa421a021	Replace all Language and Parser mocks with no-op mocks Both Language and Parser are extremely complex classes with hundreds of public methods. We really want to make sure we are not depending on anything unexpected from these classes. If calls are made into these classes we want to know exactly what is called. Doing this also showed that some mocked methods are not even needed. Change-Id: Icdfff6c07be78a47bf7cadb1813a72581a51272a	2023-07-27 10:00:28 +00:00
thiemowmde	25e7aa44dd	No expensive transformations on prefix/suffix messages This is a mistake that has existed in this codebase for who knows how long. Cite misuses the messaging system a lot for internal things we still want to customize somehow, but are not labels that will ever be shown on the screen. The prefix/suffix messages in this patch are meant to be part of the HTML in id="…" attributes. Prefix/suffix must be static plain-text strings. Using e.g. {{GENDER}} or {{PLURAL}} in these messages is not even possible because there is no $1 parameter to use. Note how all other similar messages already use ->plain(). A few wikis override these messages, but stick to the plain-text convention, as they should: https://global-search.toolforge.org/?q=.&regex=1&namespaces=8&title=Cite.reference.fix This will continue to work. This has minor performance implications. Fetching these messages is faster if we can skip transformations. Bug: T321217 Change-Id: I7969c255fe4ce897e904897081da5f52678721aa	2023-07-20 16:22:46 +02:00
jenkins-bot	5affadae9d	Merge "Add strict types to all class properties"	2023-06-08 10:41:54 +00:00
thiemowmde	5c93bbfd00	Add strict types to all class properties A good bunch of PHPDoc comments is obsolete when we use strict types. Change-Id: Ie0692fae4d96c749e9048f7e7c6931ec97998093	2023-06-05 20:01:13 +02:00
thiemowmde	269f726cff	Remove inline @var type hints that are not needed This is mostly because recent IDEs can understand createMock() quite well. We usually don't add such hints every time we use createMock(). We would have a million of them. ;-) Change-Id: If9e37807a6945c4408d374fc97664cd636020ffd	2023-06-05 16:37:03 +02:00
Umherirrender	66159e5b78	tests: Make PHPUnit data providers static Initially used a new sniff with autofix (T333745) Bug: T332865 Change-Id: Ib86d0fb2d3ea734db46b266faede7b4588fae075	2023-05-20 12:03:41 +02:00
Kosta Harlan	e82b3c8a76	phpunit: Unit tests may not access MW services Bug: T266441 Change-Id: Iab4f2e76daddeb88d018f4ead86f26fc62448e24	2022-07-13 10:10:10 +02:00

43 commits