Commit graph

4736 commits

Author SHA1 Message Date
C. Scott Ananian 8ef2a4a635 Migrate Parsoid implementation of Cite extension to src/Parsoid
Further commits will be necessary to complete the migration, but
this merge commit imports all of the existing history of the Cite
extension.  It was generated using the following command on a checkout
of Parsoid:

  git filter-repo --path src/Ext/Cite --path src/lib/ext/Cite \
    --path lib/ext/Cite --path lib/ext/Cite.js --path lib/ext.Cite.js \
    --path js/lib/ext.Cite.js --path modules/parser/ext.Cite.js \
    --tag-rename '':'parsoid-' \
    --path-rename src/Ext/Cite:src/Parsoid \
    --path-rename src/lib/ext/Cite:src/Parsoid

And then, in the Cite repository:

  git remote add parsoid ../path/to/parsoid/checkout
  git merge parsoid/master --allow-unrelated-histories

Bug: T354215
Change-Id: I54edd9cf7951ca024c66fe357e8777eed85ab13b
2024-01-17 15:47:33 -05:00
jenkins-bot bfd2cac6a1 Merge "Port Cite web test suite to Cypress" 2024-01-17 13:17:06 +00:00
WMDE-Fisch b59eef1cfe Fix event logging for the reference previews baseline
The current tracking is wrong for several reasons, mainly because
of a race condition if the Popups extension finishes loading
before the Cite tracking script is executed. Furthermore,
wgPopupsReferencePreviews was not a good choice to see if the user
sees previews or not.

The logging now uses the monoschema and only checks for enabled
previews when the click events are fired. The chances that Popups
has finished initializing are much higher by then. We can still see
if the init is not finished and the variable is not set, though.

Also we won't track the overall pageviews in here but use the
generic pageview_hourly from the data lake instead.

Bug: T353798
Depends-On: I1c434f0098ae23bd62256686a658e3d5ef7f70b9
Change-Id: I7a9524274efb58286f520c6148d5463bb0a78dbf
2024-01-17 13:11:17 +01:00
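
The idea in b59eef1cfe of checking for enabled previews only when a click fires can be illustrated with a small sketch. It is written in Python for illustration only and is not the extension's JavaScript; `config`, `previews_enabled`, and `make_click_handler` are hypothetical names.

```python
# Shared configuration, filled in later by a separately loaded component
# (stands in for the Popups extension in this sketch).
config = {}


def make_click_handler(log):
    """Return a click handler that records events into the list `log`."""

    def on_click():
        # Read the flag when the event fires, not when the handler is
        # created. None means the other component has not initialized yet.
        log.append({"action": "click", "previews": config.get("previews_enabled")})

    return on_click


events = []
handler = make_click_handler(events)   # set up before initialization
handler()                              # click arrives early: flag unknown
config["previews_enabled"] = True      # other component finishes loading
handler()                              # later click sees the real value
```

Because the flag is read lazily, the order in which the two components load no longer decides what gets logged, and an unset flag remains visible as a distinct value.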
mareikeheuer 0f801ea550 Port Cite web test suite to Cypress
Steps to implement:

 Copy over and adapt setup files, to install Cypress in the Cite code base.
 Port tests/selenium/specs/backlinks.js and supporting file cite.page.js to run under the Cypress environment, in a second patchset.
 Run the new suite in CI, replacing the previous selenium integration.
 Delete the selenium test suite.

Bug: T353436
Change-Id: Ie76371e18d8612daa7c7be741432c6f3e0b783b5
2024-01-17 11:45:04 +01:00
thiemowmde 6c1de9de24 Show warning when dir="…" don't match
Same as I294b59f in the Cite codebase.

An additional, necessary change is that we need to track all dir="…"
values in the ReferencesData object, even if we aren't going to use
the value from a <ref name="…" dir="…" /> reuse without content.
This is the same as what's done in the ReferenceStack in the Cite
codebase.

Bug: T202593
Depends-On: I294b59f989f553932b40d08308906dd72d92d2cd
Change-Id: Ida38ae6a41e8550089cf7a37a549080d17943521
2024-01-16 20:59:31 +00:00
jenkins-bot a37b808f56 Merge "Only increment sequence when we really mean it" 2024-01-16 11:23:04 +00:00
Translation updater bot 33f5611e1c Localisation updates from https://translatewiki.net.
Change-Id: If1cd1df28a06c68d35ee4b03e6ad3c9e7b01820a
2024-01-16 08:25:24 +01:00
Translation updater bot 7f71ada8f8 Localisation updates from https://translatewiki.net.
Change-Id: I14dfb64deb13f0113940b6e295fba232f941e2f6
2024-01-15 08:23:57 +01:00
jenkins-bot cf776d8ce0 Merge "Documenting state of subref reuse rollback" 2024-01-10 12:55:20 +00:00
Adam Wight fa4a3c9405 Only increment sequence when we really mean it
I prefer this to having a mix of roll forward/roll back logic.

Change-Id: I9ac7c2ecf302c4d245a8fda7ed57f14cfb26757c
2024-01-10 13:40:41 +01:00
Adam Wight 9d64d837f1 Simplify "follow" block
There was no need to create a new variable here.

Change-Id: I034bc47c07be3ee716c9a1019addb93f0fb2910a
2024-01-10 13:14:06 +01:00
jenkins-bot 04e1dc66f7 Merge "Move extendsCount into parent ref item" 2024-01-10 11:53:11 +00:00
Adam Wight c36d1b90a5 Move extendsCount into parent ref item
Eliminates a complex shared structure in favor of more encapsulation.

Change-Id: I70efabf0ee263ac578472e16dc35047b0601b7ff
2024-01-10 11:55:16 +01:00
jenkins-bot 58b7f2e16f Merge "Test explicitly for parent ref existence" 2024-01-10 10:35:54 +00:00
Translation updater bot 9156975dae Localisation updates from https://translatewiki.net.
Change-Id: I0d9d5f9812c30d615bf8b150204ab5cab2d30858
2024-01-10 08:32:37 +01:00
Adam Wight e4b964eec7 Documenting state of subref reuse rollback
Some interesting stuff is happening, seems to have revealed bugs:
* Rolled-back warnings are still present on the ref
* Subref reuse numbering starts at 0 instead of 1, and formatting is cringe.

But subref rollback does seem to work!

Change-Id: If6321b34d27370553ba85e63dd1e2ae6a3b7c099
2024-01-09 18:05:05 +01:00
Adam Wight 12cd4b979e Test explicitly for parent ref existence
This test was obscured by testing for a field on the parent, but that
would exist if and only if the parent also existed.  Clarify the
guard condition and introduce a named local variable for the parent.

Change-Id: I03079f45cf5ba00d54642c89ac4232a944b2f353
2024-01-09 17:30:10 +01:00
thiemowmde 9f6dd63ef4 Don't search for [[MediaWiki:cite_link_label_group-]]
Such a message shouldn't exist, and doesn't:
https://global-search.toolforge.org/?q=.&regex=1&namespaces=8&title=Cite+link+label+group-

Additional notes:
* Rename the method to make it more obvious that it's not a cheap
  getter, but doing something slightly more expensive.
* Use more appropriate array_key_exists to check if a cache entry
  already exists.
* Also add a bit more documentation.

Bug: T297430
Bug: T353227
Change-Id: Ia5827bbf6fd700b87a749aac17320796428f0688
2024-01-09 17:00:07 +01:00
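
One common reason to test for the existence of a cache key, as 9f6dd63ef4 does with array_key_exists, is that an "empty" result can itself be worth caching. The Python sketch below shows that pattern under the assumption that a lookup may legitimately yield no value; `LabelCache`, `fetch`, and the loader are hypothetical names, not the Cite API.

```python
class LabelCache:
    """Hypothetical cache where "nothing found" (None) is a valid result."""

    def __init__(self, loader):
        self._loader = loader
        self._cache = {}
        self.loads = 0  # counts how often the expensive loader ran

    def fetch(self, group):
        # Test for the key itself, not for a non-None value, so that a
        # cached None is not looked up a second time.
        if group not in self._cache:
            self.loads += 1
            self._cache[group] = self._loader(group)
        return self._cache[group]


cache = LabelCache(lambda group: None)  # loader that never finds anything
first = cache.fetch("notes")
second = cache.fetch("notes")
```

Here the loader runs once even though it returned None; a value-based test would have called it again on every fetch.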
Adam Wight 89099e93d6 Rename internal variables
We can be more specific than "value".

Bug: T353451
Change-Id: I8f958a04d7fb6f5a0f10f3c3974b38257ab86f16
2024-01-09 09:53:21 +00:00
Adam Wight 0e01e39061 Encapsulate ref: groupRefs returned as objects
This patch only affects the consumers of groupRefs.

Bug: T353451
Change-Id: I1eff735dbc26dda07aa8ac7af9ea4ddc0906f5a4
2024-01-09 10:22:04 +01:00
Adam Wight f148c65078 Encapsulate ref: pushRef returns an object
This patch affects a few methods which use the output of pushRef.

Bug: T353451
Change-Id: I10b3fe89406c11cdaede92f18a4b96586ecaf5a0
2024-01-09 10:18:57 +01:00
Adam Wight 262fbe24eb Encapsulate ref object: limited to ReferenceStack
This encapsulation gives us field name, type validation and code
documentation.

This patch only affects ReferenceStack and continues to return
approximately the same array outputs to callers.  Some additional
information is included and the placeholder column has a new name.

Bug: T353451
Change-Id: I405fe7ac241f6991fd4c526bfbb58fbc34f2e147
2024-01-09 09:59:16 +01:00
Adam Wight c23b824c34 Reorder conditions
The placeholder field will only be set if the ref exists, so we can
put these in a more logical order.

Change-Id: I2ddfb501fcc3aca936bb45c0d40e4f68c5d2b192
2024-01-09 09:57:03 +01:00
Translation updater bot 137cbb43c0 Localisation updates from https://translatewiki.net.
Change-Id: I55f1d43581b14cef496265c52e4ad1a9e8168754
2024-01-09 08:59:05 +01:00
jenkins-bot e06b2347b2 Merge "Switch to a 1-based "count"" 2024-01-08 11:33:53 +00:00
Adam Wight 1434dc5ca6 Switch to a 1-based "count"
The previous patch deprecated the last conditional depending on magic
meanings of 0 and -1, so now we're free to let "count" take on a more
natural meaning: the number of times a footnote mark appears in
article text.

Includes a small hack to avoid changing parser output, by
artificially decrementing the count by one during rendering.  The
hack can be removed and test output updated in a separate patch.

Bug: T353227
Change-Id: I6f76c50357b274ff97321533e52f435798048268
2024-01-08 11:45:36 +01:00
jenkins-bot e2a771f344 Merge "Localisation updates from https://translatewiki.net." 2024-01-08 10:08:48 +00:00
Translation updater bot 9f52baf53a Localisation updates from https://translatewiki.net.
Change-Id: Ifba3a49ef2b9dbaf3f5ba7ccd0c57c46ea51185c
2024-01-08 09:14:48 +01:00
Adam Wight 86edddc8c2 Use semantic field to test ref type
Stop relying on the magic number distinction between "count" = 0 and -1,
by explicitly testing the "name" field instead.

Bug: T353227
Change-Id: I9dce16b01814e19f508d45b927de570049f0e0f5
2024-01-06 16:46:56 +01:00
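
Commits 1434dc5ca6 and 86edddc8c2 together replace magic values of "count" with a semantic test and a natural 1-based meaning. A minimal Python sketch of that idea follows. The class and function names are hypothetical, and it assumes the old 0 / -1 distinction separated named from unnamed refs; the real code is PHP.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RefItem:
    """Hypothetical ref record; field names follow the commit messages."""

    name: Optional[str] = None  # None for an unnamed ref (assumption)
    count: int = 0              # times the footnote mark appears in the text


def is_named(ref: RefItem) -> bool:
    # Semantic test on "name" instead of comparing "count" with 0 or -1.
    return ref.name is not None


def add_use(ref: RefItem) -> None:
    # 1-based: the first footnote mark brings the count to 1.
    ref.count += 1


def count_for_rendering(ref: RefItem) -> int:
    # Temporary hack: subtract one so existing output stays unchanged.
    return ref.count - 1


ref = RefItem(name="a")
add_use(ref)
add_use(ref)
```

Keeping the decrement in one rendering helper means it can later be removed in a single place, along with the corresponding test output updates.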
Adam Wight 6b0ebb3066 Reduce deeply nested variables
These can be hard to read so this patch introduces named, temporary
variables.

PHP reference assignment is helpful here, and has the nice property
of responding correctly to `isset` as if it were called on the
referenced variable.  However, we're prevented from using this trick
in more places in the code because of an unfortunate side-effect that
PHP will store `null` under the referenced array key.  In some cases
(the ones here), this is harmless because we always test using
`isset` and null behaves the same as an unset value.  In other cases
such as arrays that are iterated over, the spurious key and null
value would be more of a nuisance.

Bug: T353227
Change-Id: Ie43592a2f10677ba19842e92fa29eb4bf3be240c
2024-01-06 16:39:46 +01:00
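
The side effect described in 6b0ebb3066 is specific to PHP references, but a loose Python analogy may help: `dict.setdefault` called without a default also stores a None entry as a by-product of the lookup. This is only an analogy, not the PHP behaviour itself, and the group and ref names are made up.

```python
refs = {}

# Roughly analogous to PHP's `$ref =& $refs['g']['a'];`: the lookup
# itself creates the nested keys and stores None (PHP: null) there.
ref = refs.setdefault("g", {}).setdefault("a")

# An isset-style test treats the None entry like a missing one ...
is_set = refs["g"].get("a") is not None

# ... but iterating over the container sees the spurious key.
names = list(refs["g"])
```

Code that only ever asks "is this set?" is unaffected, while code that loops over the container has to cope with the extra empty entry.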
Arlo Breault fa5100fa92 Sync up Cite repo with Parsoid
This now aligns with Parsoid commit 285e5e390af1c9370203bb3f6111f01fd41d3009

Change-Id: I9311cd580b938d4dabc43f4a659fb49243f22783
2024-01-05 14:30:56 -05:00
jenkins-bot 0071289f79 Merge "Inline constant for "placeholder" key" 2024-01-05 11:53:19 +00:00
jenkins-bot 0f4c90cc54 Merge "Store group in ref items" 2024-01-05 11:53:17 +00:00
jenkins-bot 005f6d9dc6 Merge "More explicit test fixtures: key and count" 2024-01-05 11:53:14 +00:00
jenkins-bot 9b770fba99 Merge "Include more information in missing parent placeholder" 2024-01-05 11:50:06 +00:00
jenkins-bot 1f7d6527a4 Merge "Render list-defined parent without a backlink" 2024-01-05 11:40:37 +00:00
Adam Wight a6cb979d88 Inline constant for "placeholder" key
Minor refactoring of an internal field, which can be treated like the
other columns.

Change-Id: I255578694c5ab9f2ad3cbe232217af3cea60669c
2024-01-05 11:22:30 +00:00
Adam Wight fd648aec98 Store group in ref items
Encapsulate all information about a ref inside of the internal
structure, rather than relying on the container to be organized by
group.

Bug: T353451
Change-Id: I4c91e8089638b7655bf120402a4a5fcbd1b35452
2024-01-05 11:22:12 +00:00
Adam Wight ca6414320f More explicit test fixtures: key and count
These fields get automatic values during normal operation, but we
should make this explicit in tests which meddle with internals.  This
seems to add some clarity, and helps prepare for encapsulation.

Bug: T353451
Change-Id: I8b012a270f16139671f77ea04645d627b2fba87d
2024-01-05 11:21:58 +00:00
Adam Wight 76e6e870d4 Include more information in missing parent placeholder
This allows the subreferences to be collected together under a heading.

Bug: T353451
Change-Id: Ibf28f0baca14de8140c87b03ad4aa86d2f81a20d
2024-01-05 11:21:12 +00:00
Adam Wight 5a69c54900 Render list-defined parent without a backlink
In this case, there was never a ref with this name in the article so
no backlinks should be rendered.

TODO:
* test case with empty parent backlink and LDR parent

Bug: T353451
Change-Id: I8a7abd05a48ce83da3beb92b15e894d53252bd33
2024-01-05 12:07:22 +01:00
thiemowmde b01b420199 Track errors in a status object instead of an array
This is another improvement after I7390b68. Status objects are made
to keep track of multiple errors. The only difference is: The merge
method skips duplicates when the message and all parameters are
identical. This causes a minor user-facing change. One of the
shortest possible examples is:

 <references>
 <ref />
 <ref />
 </references>

This showed two identical, indistinguishable error messages before,
but will only show one now. We argue this is fine. The duplicates
are confusing and of (almost) no value to the user. In case the
information is relevant the correct solution is to make the error
messages distinguishable, or introduce a message like "multiple
<ref> tags defined in <references> have the same error". This is
something for a later patch, if needed.

Bug: T353266
Change-Id: I444105462ed24d5ba37b057622b4dc847b40f8d8
2024-01-05 10:49:08 +01:00
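
The duplicate-skipping merge described in b01b420199 can be sketched as follows. This is a minimal Python illustration, not the MediaWiki status class; `ErrorStatus`, its methods, and the message key are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ErrorStatus:
    """Hypothetical stand-in for a status object that collects errors."""

    errors: list = field(default_factory=list)

    def add(self, message, *params):
        # An error is identified by its message key plus all parameters.
        self.errors.append((message, params))

    def merge(self, other):
        # Skip entries whose message and parameters are already present.
        for entry in other.errors:
            if entry not in self.errors:
                self.errors.append(entry)


status = ErrorStatus()
for _ in range(2):
    # Each of the two <ref /> tags reports the same error.
    one_ref = ErrorStatus()
    one_ref.add("example_error_key")  # hypothetical message key
    status.merge(one_ref)
```

After both merges the combined status holds a single entry, which corresponds to the single error message shown for the two identical tags.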
jenkins-bot 23a1a8999d Merge "Remove test for a private method" 2024-01-04 16:30:29 +00:00
Adam Wight ddf5cb2458 Remove test for a private method
Testing internal methods is brittle.  This code path is already
covered by parser test "Valid follow="…" after it's parent"

Bug: T353451
Change-Id: I3b7a4b9962de1f25a7b57f82d80813219d633594
2024-01-04 17:07:36 +01:00
jenkins-bot 4d14f9c701 Merge "Merge two code paths about <references> sections" 2024-01-04 16:06:06 +00:00
jenkins-bot 733824005a Merge "Drop unused cite_reference(s)_link_prefix messages" 2024-01-04 16:04:34 +00:00
jenkins-bot ab20cb3cdf Merge "Rename appendText() to resolveFollow()" 2024-01-04 15:29:44 +00:00
thiemowmde ddda536792 Drop unused cite_reference(s)_link_prefix messages
Same as Icfa8215 where we removed the …_suffix messages.

This patch is not blocked on anything according to CodeSearch:
https://codesearch.wmcloud.org/search/?q=cite_references%3F_link_prefix

According to GlobalSearch there are 2 usages we need to talk about:
https://global-search.toolforge.org/?q=.&regex=1&namespaces=8&title=Cite.references%3F.link.prefix.*

zh.wiktionary replaces "cite_ref-" with "_ref-", and "cite_note-"
with "_note-", i.e. they did nothing but remove the word "cite". This
happened in 2006, with no explanation.

ka.wikibooks and ka.wikiquote replace "cite_note-" with "_შენიშვნა-",
which translates back to "_note-". One user did this in 2007,
16 seconds apart.

It appears that both are attempts to localize what can be localized,
no matter if it's really necessary or not.
https://zh.wiktionary.org/wiki/Special:Contributions/Shibo77?offset=20060510
https://ka.wikiquote.org/wiki/Special:Contributions/Trulala?offset=20070219
Note how one user experimented with an "a" in some of the edits to
see what effect the change might have, only to immediately revert it.

The modifications don't really have an effect on anything, except on
the anchors in the resulting <a href="#_ref-5"> and <sup id="_ref-5">
HTML. It might also be briefly visible in the browser's address bar
when such a link is clicked. We can only assume the two users did this
to make the URL appear shorter (?). A discussion apparently never
happened. Both users are inactive.

Both pieces of HTML are generated in the Cite code. Removing the
messages will change all places at the same time. All links will
continue to work. The only possible effect is that hard-coded
weblinks to an individual reference will link to the top of the
article instead. But:
a) This is extremely unlikely to happen. There is no reason to link
   to a reference from outside of the article.
b) Such links are not guaranteed to work anyway as they can break
   for a multitude of other reasons, e.g. the <ref> being renamed,
   removed, or replaced.
c) Even if such a link breaks, it still links to the correct article.

There is also no on-wiki code on zh.wiktionary that would do anything
with the shortened prefix:
https://zh.wiktionary.org/w/index.php?search=insource%3A%2F_%28ref%7Cnote%29-%2F&title=Special%3A%E6%90%9C%E7%B4%A2&profile=advanced&fulltext=1&ns2=1&ns4=1&ns8=1&ns10=1&ns12=1&ns828=1&ns2300=1

I argue this is safe to remove, even without contacting the mentioned
communities first.

Bug: T321217
Change-Id: I160a119710dc35679dbdc2f39ddf453dbd5a5dfa
2024-01-04 13:17:42 +01:00
jenkins-bot be755491cc Merge "Capitalized dir="RTL" should not trigger any error" 2024-01-04 11:14:47 +00:00
thiemowmde ca3203699c Capitalized dir="RTL" should not trigger any error
This fixes a minor issue introduced in I294b59f. Two identical
dir="…" with different capitalizations should not be reported as an
error.

Turns out the implementation in the Cite extension doesn't care
about this capitalization at all. That's why I suggest doing the
normalization as early as possible. This is slightly different in
the Parsoid implementation.

Bug: T202593
Change-Id: I96b4a281d6020d61d1f36ec027cf833bbb244f03
2024-01-03 16:30:16 +00:00
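
Normalizing early, as ca3203699c suggests, means every later comparison sees the same spelling. A minimal Python sketch, assuming that only letter case is normalized and that a missing dir never conflicts (both assumptions, as are the function names):

```python
def normalize_dir(value):
    """Lowercase a dir attribute value; None means the attribute is absent."""
    if value is None:
        return None
    return value.lower()


def dir_conflicts(first, second):
    """True when both values are given and differ after normalization."""
    a, b = normalize_dir(first), normalize_dir(second)
    return a is not None and b is not None and a != b
```

With this, `dir_conflicts("RTL", "rtl")` reports no conflict, while `dir_conflicts("rtl", "ltr")` does.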