Commit graph

610 commits

Author SHA1 Message Date
jenkins-bot 5082101671 Merge "Do not parse the outer <ol> as wikitext" 2024-06-06 07:57:04 +00:00
thiemowmde 16b0c3c00e Do not parse the outer <ol> as wikitext
With this change the <ol> is added after the inner list of <li> was
parsed as wikitext. In other words, the outer <ol> is now raw HTML
and not sanitized any more. This is fine because it's generated via
code. It doesn't contain any user input.

You might ask how it is possible to parse "invalid" HTML that
contains a sequence of <li> without the outer <ol>. This is fine
because the wikitext parser doesn't care about the nesting structure
of HTML elements. This is done later by Remex (Tidy). But Remex is
never called here. What we care about here is that the wikitext
parser sanitizes the individual HTML elements and their attributes.

The <ol> doesn't need sanitization.

This will make it possible to use reserved data attributes for
T196828. A bazillion unit and parser tests prove that this change
doesn't have any unwanted side-effects.

Bug: T196828
Bug: T239572
Change-Id: I0a9d419f48cad5ddb7251c8fdd2cf9506649436b
2024-06-05 10:15:29 +00:00
jenkins-bot c11922c655 Merge "Add 'mw-cite-backlink' class to Parsoid's HTML" 2024-06-04 16:04:32 +00:00
WMDE-Fisch 2d2efd8259 Move most ReferencePreviews related i18n messages over to Cite
I'm importing the current keys from the Popups extension without
renaming. After that's done keys can be renamed and we can take
care off the fallback solutions.

Bug: T363156
Change-Id: I788c16c5bddc0df7f00dbbc39625b9adaa5bf184
2024-06-04 10:50:05 +02:00
jenkins-bot 477bda1085 Merge "Add 'reference-text' class to Parsoid's HTML" 2024-05-31 21:30:39 +00:00
Subramanya Sastry f38bcde14e Add 'mw-cite-backlink' class to Parsoid's HTML
This patch adds 'mw-cite-backlink' to the linkback span for both
named and unnamed refs. This requires us to add a span wrapper
for the unnamed refs case.

Verified in local testing that this causes aria attributes to be
added to the linkback tags in Parsoid HTML.

This should likely fix other gadgets and code that rely on this
class name to do their work.

Strictly speaking, this is a breaking change since we add an
extra span wrapper for the unnamed ref backlinks which *could*
break anyone using a li > a[rel="mw:referencedBy"] selector.
But, given the specificity of the a[rel] selector, the "li >"
part is unnecessary and might not be used. So, if we wanted to
push our luck (and break process), we could get this in.

Alternatively, we could:
- do this in the the read views OutputTransformPipeline.
- do a real major version bump -- we would be exercising that
  functionality and have to fix and implement any missing pieces
  that may have broke as part of the RESTBase sunsetting.
- not add the span wrapper and fix gadgets to explicitly look for
  both named and unnamed refs with their selectors.

Bug: T328695
Change-Id: Icbd325ebd12cb42186c5b5220dc016835eb18b64
2024-05-31 14:42:04 -05:00
C. Scott Ananian 6256b2fc58 Replace book-referencing page property with tracking category
Page property is removed immediately since $wgCiteBookReferencing has
never been enabled in production.

Bug: T239989
Change-Id: I6252fcf1485994244dca40470cc5955e8d4f6917
2024-05-30 07:50:59 +00:00
Subramanya Sastry 50f01c4f28 Add 'reference-text' class to Parsoid's HTML
This adds the 'reference-text' class where Parsoid added
'mw-reference-text'.

If we don't care about the "mw-" prefix, since there are very
few wiki and code references to 'mw-reference-text', it might
seem like we could update all those references and rip out
'mw-reference-text' from Parsoid output.

But, Parsoid HTML is also exposed via the REST API which means
there are likely many users out there analyzing Parsoid HTML.
https://github.com/search?q=%22mw-reference-text%22+NOT+language%3AHTML&type=code
says there are 512 references to this string - so looks like
we are probably going to rely on a major HTML version bump in
Parsoid in the future and then rip out all the duplicate
classes (mw-ref, mw-references, mw-reference-text OR
reference, references, reference-text).

Bug: T328695
Change-Id: I04b18ac75863a0e3e61bdd47b34508e5547dc872
2024-05-28 12:36:49 -05:00
jenkins-bot 4f8e06b97b Merge "Make it possible to disable the "cite_error" wrapper message" 2024-05-27 08:16:04 +00:00
thiemowmde e4728d7d91 A few tiny code cleanups in the ReferenceStack class
Change-Id: I5feeba61da7e45dad1273cc9eb0b7674fa14e3f0
2024-05-16 14:39:03 +02:00
thiemowmde e3fdee52aa Separate names for server-side vs. client-side feature flags
The two are different:

* CiteReferencePreviews as specified in extension.json is a feature
flag that allows us to disable the feature entirely. It could be
named "CiteReferencePreviewsFeature" or "CiteEnableReferencePreviews",
but renaming a feature flag that's already in use is hard.

* The client-side flag tells the JavaScript code "you know what, it
was kind of a mistake you got loaded, please stop". This is because
we can not make all decisions before we register the ResourceLoader
module, e.g. if the user has certain gadgets enabled.

Adding the word "Active" is not a huge improvement, but at least
makes the two different now. Suggestions welcome.

Bug: T362771
Change-Id: I0f6a911df8772616ac50c1301f402f77dbe32089
2024-04-30 11:57:47 +00:00
WMDE-Fisch f0d7406811 Don't load ReferencePreviews when not enabled in the config
Includes a test and some improvements to the CiteHooks tests.

Change-Id: I0e9108d0d429ed9b7467f993441eefc2557c8e6f
2024-04-30 11:10:22 +02:00
WMDE-Fisch 179d402344 Add ReferencePreviews config checks to Cite extension
PHP classes and test are somewhat copies from the Popups codebase.
Some refactoring was applied. More could be done. Not to sure if
this should happen more in follow ups though.

Could also reduce the complexity of checks on the JS side. Most of
these things can only change on page load. The only dynamic part
left is the anon user setting managed by the Popups extension.

Note, that I needed to add a new PHP config for here although the
other still exists and is needed in the Popups extension. This
will change, when the user settings code also moves.

I guess it's okay for now though. Both settings default to true
and are not overridden in the config repos.

Also needed to add the Gadget extension as phan dependency.

Bug: T362771
Depends-On: Ia028c41f8aaa1c522dfc7c372e1ce51e40933a5e
Change-Id: Ie6e8bc706235724494036c7f0d873f5c996c46e6
2024-04-25 12:50:27 +02:00
jenkins-bot 6ced9418ca Merge "Update ExtensionTagHandler::lintHandler implementations" 2024-04-22 04:55:02 +00:00
Arlo Breault 84a8090248 Update ExtensionTagHandler::lintHandler implementations
I50557e0 changes the ExtensionTagHandler::lintHandler interface so it
can't return a node any more but must return true instead. True
indicates linting has been handled.

Depends-On: I441699e7fe9827a5e06e4638ce88c685deb9b856
Change-Id: I8fe4edc41a840c72cb539bf6f931d45ac777f8a0
2024-04-19 19:59:03 -04:00
jenkins-bot d70af9b1ba Merge "Replcate global variable by MainConfig" 2024-04-18 13:27:06 +00:00
jenkins-bot 583eeafd37 Merge "Use ParserOutput::setUnsortedPageProperty()" 2024-04-18 12:47:56 +00:00
Fomafix f8c9e47118 Replcate global variable by MainConfig
Change-Id: I870f9cdbe150e11adba36ce5b65d247188872e46
2024-04-18 10:30:54 +00:00
C. Scott Ananian 9ff28a0837 Use ParserOutput::setUnsortedPageProperty()
The ::setPageProperty() method has some tricky corner cases where the
type of the value determines whether or not the page property will be
sorted.  Since sort order for the BOOK_REF_PROPERTY is irrelevant,
use ::setUnsortedPageProperty() to communicate this clearly to the
reader.

Depends-On: Ia94c192c429d0482c58467bed787fd2e0aca052f
Followup-To: Ibfd84b52057baa8e249d321ec9df612efd6a29a6
Change-Id: I399f4895ec8720ff2927c5cd5a09c7af4664ee46
2024-04-16 09:51:43 -04:00
C. Scott Ananian fc5f22b32e CiteParserHooksTest: make test compatible with removal of dynamic property
Whether the dynamic property is present or not, it should have a null
value when 'unset' -- and don't use `unset` to delete an *actual*
property when one is present!

Change-Id: Ifcb9492cc5c814d702c6e61e8231abfd8ea0647c
2024-04-12 15:23:14 -04:00
thiemowmde 8b57438abb Various minor code cleanups in the Parsoid PHP code
* Improve PHPDoc documentation.
* Add some missing language-level type declarations.
* Remove unused code.
* Avoid count() when the code doesn't care about the actual count.
* Use the short ??= operator where possible.

Change-Id: I79de49b65d32661b7efa67ecc350276968943e11
2024-04-02 13:08:37 +00:00
jenkins-bot 1fdc65548b Merge "Add strictly typed RefGroupItem class to Parsoid code" 2024-04-01 03:05:52 +00:00
jenkins-bot 51e569d888 Merge "Allow visualeditor-cite-tool-name-… messages to be missing" 2024-03-28 15:27:14 +00:00
thiemowmde 9e1c4645be Add strictly typed RefGroupItem class to Parsoid code
This does the exact same as the previously used generic stdClass
object, just strictly typed. Turned out to be surprisingly
straightforward, as proven by the small size of this patch.

I'm intentionally not adding anything new in this patch. For
example, the new class is perfect to write longer documentation
for every field. But this is for a later patch.

Change-Id: Ibf696f6b5ef1bfdbe846b571fb7e9ded96693351
2024-03-14 11:59:34 +01:00
Jon Robson dcb513eb0e Move reference previews to Cite extension
The ext.cite.referencePreviews module will transparently replace the
ext.popups.referencePreviews module after this patch.  Configuration
stays in Popups for now, we can migrate it in later work.

CSS classes may be renamed in the future but this will be handled
separately since it could be a breaking change for on-wiki
customizations.

A lot of fancy footwork happens in this patch to emulate a soft
dependency on Popups.  This mechanism doesn't exist explicitly in
either ResourceLoader or QUnit, so lots of workarounds are used, to
conditionally load the module and to dynamically skip dependent tests.

renderer.test.js is fully skipped for now, but can be wired up in
later work.

Bug: T355194
Change-Id: I0dc47abb59a40d4e41e7dda0eb7b415a2e1ae508
2024-03-12 17:43:51 +01:00
thiemowmde 644597c402 Move Parsoid-specific CSS into a subdirectory
The first two files have been added to the root modules/ directory
via I487095d in 2015. No problem.

Many, many more files have been added via I000b453 in 2022. It's
really hard to tell what is what since then.

I'm not absolutely sure what the naming convention for this folder
should be. Could as well be "localized-styles/" or just "Parsoid/".

Bug: T156350
Change-Id: Ibcf8c7a6db5400ed8a9811244a070e03ff372a39
2024-03-11 12:42:25 +00:00
thiemowmde 2b32f15c8c Allow visualeditor-cite-tool-name-… messages to be missing
The information read from the …cite-tool-definition.json files is
effectively user input, even if only interface administrators can
edit it. Usually we carefully validate user input. But as of now
this code starts failing with all kinds of uncatched errors.
* An entry with no name, an empty name, or a name that's not a
  string will cause all kinds of undefined behavior.
* An entry with an empty title results in an invisible button.
* A missing message results in a technical <…> placeholder, even if
  the name is usually a sensible fallback.

Note that hard-coding titles as plain text strings in the ….json
file was already possible.

Change-Id: Iddcedbe859e86ac4c3f79a53d36237daff86c0db
2024-03-06 14:21:08 +01:00
thiemowmde c02595bb97 Drop obscure error message about an unused group
The message was part of the original patch that introduced the group
feature in 2009, see https://phabricator.wikimedia.org/rECIT75004e33.
Notice how there was never a test scenario for this message. A test
was added in 2020 via I07738cc.

The message appears only in a rare edge-case when a group is entirely
unused in the text, and only when the group is not empty. The shortest
possible example is:

 <references group=g>
 <ref group=g name=a>a</ref>
 </references>

Just adding something unrelated like `<ref group=g>x</ref>` to the
text changes the error message. Now the group is "used". But this
notion is confusing to begin with. References can be part of a group,
and we can use references, but we can't use groups as if they are a
separate entity.

A better error message already exists.

Notice how this special error message doesn't appear anywhere in the
Parsoid code path. That was already using the other, more generic
error message.

Bug: T269531
Change-Id: I63f663d76e45e6c3d664f145d9a564ee00ff53cd
2024-03-04 13:04:36 +01:00
thiemowmde e5eee2d049 Fix confusing error message with empty group
This is about the error message that currently says:

 »Cite error: <ref> tag with name "a" defined in <references> has
  group attribute "" which does not appear in prior text.«

This is a special error message that appears only when a group name
does not appear anywhere in the text. In all other cases a simpler
error message is shown:

 »Cite error: <ref> tag with name "a" defined in <references> is
  not used in prior text.«

While the first error message is not wrong in the edge-case
described in T269531, it's very confusing for a multitude of
reasons. For example:
* There is no group attribute in the wikitext.
* Just adding something completely unrelated like `<ref>x</ref>` to
  the text shows the other error message.

The reason for this behavior is that the assumed default is an empty
`group=""`. The error message changes the moment any other <ref> in
the same group appears in the text vs. when the group is entirely
unused.

We can probably remove this error message entirely, but should at
least not use it when there is no group.

Notice how the Parsoid code path was already using the other error
message.

Bug: T269531
Change-Id: Ifa2e97254f4cda72233a057d8760fb1116143552
2024-03-04 12:43:26 +01:00
Umherirrender 8580b73320 build: Resolve MediaWikiNoEmptyIfDefined suppression
Follow-Up: I21a5260f36c4fa0d767ec6bba86fcfa35ff0a369
Change-Id: Ib97140baf68bfcc1fae4d9506f73503307f41719
2024-02-16 10:07:13 +00:00
jenkins-bot 2d714727f2 Merge "Remove dead variable from ReferenceListFormatter" 2024-02-10 10:41:38 +00:00
thiemowmde 1eb84d0a8c Remove obsolete rtrim hack added in 2006
This was added in 2006 via commit eb3a3f78, see
https://phabricator.wikimedia.org/rECITeb3a3f78
Hard to tell what happened back then. It's obviously not needed any
more, as proven by the tests. I mean, even if there would be an
extra newline character, it would be irrelevant at the end of an
<ol>…</ol>.

Change-Id: I5715cd9f31ac7ef86c1ea227642336ae71684291
2024-02-08 11:02:12 +01:00
thiemowmde d96e4e97bb Remove dead variable from ReferenceListFormatter
This is unused since Id10db40 from 2019. Looks like we just forgot
it.

Change-Id: I3164f4841572ff19faa584fa2e6ed5aeb7aa2c30
2024-02-08 10:44:16 +01:00
thiemowmde 7c75d44b8a Rename ReferencesFormatter to ReferenceListFormatter
I always found the name a little ambiguous. The fact that it outputs
an actual HTML list and not just some "references" – whatever that
means – is relevant, in my opinion.

Change-Id: I0d169455c8d2b42d62da4dccb8376c09fb6902bc
2024-02-07 18:20:02 +01:00
thiemowmde 802b495160 Make it possible to disable the "cite_error" wrapper message
… as well as "cite_warning". Both are extremely trivial and don't
really do anything by default. All they do is to add the prefix
"Cite error:" or "Cite warning:" to all error messages.

This patch will make it possible to disable both messages by
default, i.e. replace their default in en.json with "-" without
breaking anything. That's part of the plan outlined in T353695.
Local on-wiki overrides will continue to work.

Bug: T353695
Change-Id: I374800d0d0b837cd17ed3a1fdde20b70325b06de
2024-02-01 17:36:27 +01:00
C. Scott Ananian 234da84418 Hook up Parsoid implementation of Cite
This commit also moves certain parser tests involving <ref> from
the Parsoid repo to citeParserTests.txt in this repo.

Bug: T354215
Change-Id: Ie5b211d2af01a56684473723c68a9ab2775542e3
2024-01-19 11:57:11 -05:00
C. Scott Ananian b1cc92b9a1 Add DOM stubs; change namespace of imported Parsoid code to Cite\Parsoid
The namespace change avoids a conflict with the existing Parsoid
implementation in Wikimedia\Parsoid\Ext\Cite and matches the current
Cite codebase better.  We also need to add some phan stubs to allow
Cite to use Parsoid's generic DOM implementation classes, and some
type assertions to satisfy phan.

Bug: T354215
Change-Id: Ic904601b29555c9485a804f131061f207970ddd4
2024-01-17 16:04:30 -05:00
C. Scott Ananian 4f2f6a31e6 Delint imported Parsoid code
Parsoid's phpcs configuration is slightly different from the one in
this repository; this commit just keeps CI happy with the imported
code.

Change-Id: I9ce2993e8a9416f331b5157dfcfb01fb6e31baaf
2024-01-17 15:48:21 -05:00
C. Scott Ananian 8ef2a4a635 Migrate Parsoid implementation of Cite extension to src/Parsoid
Further commits will be necessary to complete the migration, but
this merge commit imports all of the existing history of the Cite
extension.  It was generated using the following command on a checkout
of Parsoid:

  git filter-repo --path src/Ext/Cite --path src/lib/ext/Cite \
    --path lib/ext/Cite --path lib/ext/Cite.js --path lib/ext.Cite.js \
    --path js/lib/ext.Cite.js --path modules/parser/ext.Cite.js \
    --tag-rename '':'parsoid-' \
    --path-rename src/Ext/Cite:src/Parsoid \
    --path-rename src/lib/ext/Cite:src/Parsoid

And then, in the Cite repository:

  git remote add parsoid ../path/to/parsoid/checkout
  git merge parsoid/master --allow-unrelated-histories

Bug: T354215
Change-Id: I54edd9cf7951ca024c66fe357e8777eed85ab13b
2024-01-17 15:47:33 -05:00
thiemowmde 6c1de9de24 Show warning when dir="…" don't match
Same as I294b59f in the Cite codebase.

An additional, necessary change is that we need to track all dir="…"
values in the ReferencesData object, even if we aren't going to use
the value from a <ref name="…" dir="…" /> reuse without content.
This is the same what's done in the ReferenceStack in the Cite
codebase.

Bug: T202593
Depends-On: I294b59f989f553932b40d08308906dd72d92d2cd
Change-Id: Ida38ae6a41e8550089cf7a37a549080d17943521
2024-01-16 20:59:31 +00:00
Adam Wight fa4a3c9405 Only increment sequence when we really mean it
I prefer this to having a mix of roll forward/roll back logic.

Change-Id: I9ac7c2ecf302c4d245a8fda7ed57f14cfb26757c
2024-01-10 13:40:41 +01:00
Adam Wight 9d64d837f1 Simplify "follow" block
There was no need to create a new variable here.

Change-Id: I034bc47c07be3ee716c9a1019addb93f0fb2910a
2024-01-10 13:14:06 +01:00
jenkins-bot 04e1dc66f7 Merge "Move extendsCount into parent ref item" 2024-01-10 11:53:11 +00:00
Adam Wight c36d1b90a5 Move extendsCount into parent ref item
Eliminates a complex shared structure in favor of more encapsulation.

Change-Id: I70efabf0ee263ac578472e16dc35047b0601b7ff
2024-01-10 11:55:16 +01:00
jenkins-bot 58b7f2e16f Merge "Test explicitly for parent ref existence" 2024-01-10 10:35:54 +00:00
Adam Wight 12cd4b979e Test explicitly for parent ref existence
This test was obscured by testing for a field on the parent, but that
would exist if and only if the parent also existed.  Clarify the
guard condition and introduce a named local variable for the parent.

Change-Id: I03079f45cf5ba00d54642c89ac4232a944b2f353
2024-01-09 17:30:10 +01:00
thiemowmde 9f6dd63ef4 Don't search for [[MediaWiki:cite_link_label_group-]]
Such a message shouldn't exist, and doesn't:
https://global-search.toolforge.org/?q=.&regex=1&namespaces=8&title=Cite+link+label+group-

Additional notes:
* Rename the method to make it more obvious that it's not a cheap
  getter, but doing something slightly more expensive.
* Use more appropriate array_key_exists to check if a cache entry
  already exists.
* Also add a bit more documentation.

Bug: T297430
Bug: T353227
Change-Id: Ia5827bbf6fd700b87a749aac17320796428f0688
2024-01-09 17:00:07 +01:00
Adam Wight 89099e93d6 Rename internal variables
We can be more specific than "value".

Bug: T353451
Change-Id: I8f958a04d7fb6f5a0f10f3c3974b38257ab86f16
2024-01-09 09:53:21 +00:00
Adam Wight 0e01e39061 Encapsulate ref: groupRefs returned as objects
This patch only affects the consumers of groupRefs.

Bug: T353451
Change-Id: I1eff735dbc26dda07aa8ac7af9ea4ddc0906f5a4
2024-01-09 10:22:04 +01:00
Adam Wight f148c65078 Encapsulate ref: pushRef returns an object
This patch affects a few methods which use the output of pushRef.

Bug: T353451
Change-Id: I10b3fe89406c11cdaede92f18a4b96586ecaf5a0
2024-01-09 10:18:57 +01:00
Adam Wight 262fbe24eb Encapsulate ref object: limited to ReferenceStack
This encapsulation gives us field name, type validation and code
documentation.

This patch only affects ReferenceStack and continues to return
approximately the same array outputs to callers.  Some additional
information is included and the placeholder column has a new name.

Bug: T353451
Change-Id: I405fe7ac241f6991fd4c526bfbb58fbc34f2e147
2024-01-09 09:59:16 +01:00
Adam Wight c23b824c34 Reorder conditions
The placeholder field will only be set if the ref exists, so we can
put these in a more logical order.

Change-Id: I2ddfb501fcc3aca936bb45c0d40e4f68c5d2b192
2024-01-09 09:57:03 +01:00
Adam Wight 1434dc5ca6 Switch to a 1-based "count"
The previous patch deprecated the last conditional depending on magic
meanings of 0 and -1, so now we're free to let "count" take on a more
natural meaning: the number of times a footnote mark appears in
article text.

Includes a small hack to avoid changing parser output, by
artificially decrementing the count by one during rendering.  The
hack can be removed and test output updated in a separate patch.

Bug: T353227
Change-Id: I6f76c50357b274ff97321533e52f435798048268
2024-01-08 11:45:36 +01:00
Adam Wight 86edddc8c2 Use semantic field to test ref type
Stop relying on the magic number distinction between "count" = 0 and -1,
by explicitly testing the "name" field instead.

Bug: T353227
Change-Id: I9dce16b01814e19f508d45b927de570049f0e0f5
2024-01-06 16:46:56 +01:00
Adam Wight 6b0ebb3066 Reduce deeply nested variables
These can be hard to read so this patch introduces named, temporary
variables.

PHP reference assignment is helpful here, and has the nice property
of responding correctly to `isset` as if it were called on the
referenced variable.  However, we're prevented from using this trick
in more places in the code because of an unfortunate side-effect that
PHP will store `null` under the referenced array key.  In some cases
(the ones here), this is harmless because we always test using
`isset` and null behaves the same as an unset value.  In other cases
such as arrays that are iterated over, the spurious key and null
value would be more of a nuisance.

Bug: T353227
Change-Id: Ie43592a2f10677ba19842e92fa29eb4bf3be240c
2024-01-06 16:39:46 +01:00
jenkins-bot 0071289f79 Merge "Inline constant for "placeholder" key" 2024-01-05 11:53:19 +00:00
jenkins-bot 0f4c90cc54 Merge "Store group in ref items" 2024-01-05 11:53:17 +00:00
jenkins-bot 9b770fba99 Merge "Include more information in missing parent placeholder" 2024-01-05 11:50:06 +00:00
jenkins-bot 1f7d6527a4 Merge "Render list-defined parent without a backlink" 2024-01-05 11:40:37 +00:00
Adam Wight a6cb979d88 Inline constant for "placeholder" key
Minor refactoring of an internal field, which can be treated like the
other columns.

Change-Id: I255578694c5ab9f2ad3cbe232217af3cea60669c
2024-01-05 11:22:30 +00:00
Adam Wight fd648aec98 Store group in ref items
Encapsulate all information about a ref inside of the internal
structure, rather than relying on the container to be organized by
group.

Bug: T353451
Change-Id: I4c91e8089638b7655bf120402a4a5fcbd1b35452
2024-01-05 11:22:12 +00:00
Adam Wight 76e6e870d4 Include more information in missing parent placeholder
This allows the subreferences to be collected together under a heading.

Bug: T353451
Change-Id: Ibf28f0baca14de8140c87b03ad4aa86d2f81a20d
2024-01-05 11:21:12 +00:00
Adam Wight 5a69c54900 Render list-defined parent without a backlink
In this case, there was never a ref with this name in the article so
no backlinks should be rendered.

TODO:
* test case with empty parent backlink and LDR parent

Bug: T353451
Change-Id: I8a7abd05a48ce83da3beb92b15e894d53252bd33
2024-01-05 12:07:22 +01:00
thiemowmde b01b420199 Track errors in a status object instead of an array
This is another improvement after I7390b68. Status objects are made
to keep track of multiple errors. The only difference is: The merge
method skips duplicates when the message and all parameters are
identical. This causes a minor user-facing change. One of the
shortest possible examples is:

 <references>
 <ref />
 <ref />
 </references>

This showed two identical, indistinguishable error messages before,
but will only show one now. We argue this is fine. The duplicates
are confusing and of (almost) no value to the user. In case the
information is relevant the correct solution is to make the error
messages distinguishable, or introduce a message like "multiple
<ref> tags defined in <references> have the same error". This is
something for a later patch, if needed.

Bug: T353266
Change-Id: I444105462ed24d5ba37b057622b4dc847b40f8d8
2024-01-05 10:49:08 +01:00
jenkins-bot 4d14f9c701 Merge "Merge two code paths about <references> sections" 2024-01-04 16:06:06 +00:00
jenkins-bot 733824005a Merge "Drop unused cite_reference(s)_link_prefix messages" 2024-01-04 16:04:34 +00:00
jenkins-bot ab20cb3cdf Merge "Rename appendText() to resolveFollow()" 2024-01-04 15:29:44 +00:00
thiemowmde ddda536792 Drop unused cite_reference(s)_link_prefix messages
Same as Icfa8215 where we removed the …_suffix messages.

This patch is not blocked on anything according to CodeSearch:
https://codesearch.wmcloud.org/search/?q=cite_references%3F_link_prefix

According to GlobalSearch there are 2 usages we need to talk about:
https://global-search.toolforge.org/?q=.&regex=1&namespaces=8&title=Cite.references%3F.link.prefix.*

zh.wiktionary replaces "cite_ref-" with "_ref-", and "cite_note-"
with "_note-", i.e. they did nothing but remove the word "cite". This
happened in 2006, with no explanation.

ka.wikibooks and ka.wikiquote replace "cite_note-" with "_შენიშვნა-",
which translates back to "_note-". One user did this in 2007,
16 seconds apart.

It appears like both are attempts to localize what can be localized,
no matter if it's really necessary or not.
https://zh.wiktionary.org/wiki/Special:Contributions/Shibo77?offset=20060510
https://ka.wikiquote.org/wiki/Special:Contributions/Trulala?offset=20070219
Note how one user experimented with an "a" in some of the edits to
see what effect the change might have, to imediatelly revert it.

The modifications don't really have an effect on anything, except on
the anchors in the resulting <a href="#_ref-5"> and <sup id="_ref-5">
HTML. It might also be briefly visible in the browser's address bar
when such a link is clicked. We can only assume the two users did this
to make the URL appear shorter (?). A discussion apparently never
happened. Bot users are inactive.

Both pieces of HTML are generated in the Cite code. Removing the
messages will change all places the same time. All links will
continue to work. The only possible effect is that hard-coded
weblinks to an individual reference will link to the top of the
article instead. But:
a) This is extremely unlikely to happen. There is no reason to link
   to a reference from outside of the article.
b) Such links are not guaranteed to work anyway as they can break
   for a multitude of other reasons, e.g. the <ref> being renamed,
   removed, or replaced.
c) Even if such a link breaks, it still links to the correct article.

There is also no on-wiki code on zh.wiktionary that would do anything
with the shortened prefix:
https://zh.wiktionary.org/w/index.php?search=insource%3A%2F_%28ref%7Cnote%29-%2F&title=Special%3A%E6%90%9C%E7%B4%A2&profile=advanced&fulltext=1&ns2=1&ns4=1&ns8=1&ns10=1&ns12=1&ns828=1&ns2300=1

I argue this is safe to remove, even without contacting the mentioned
communities first.

Bug: T321217
Change-Id: I160a119710dc35679dbdc2f39ddf453dbd5a5dfa
2024-01-04 13:17:42 +01:00
thiemowmde ca3203699c Capitalized dir="RTL" should not trigger any error
This fixes a minor issue introduced in I294b59f. Two identical
dir="…" with different capitalizations should not be reported as an
error.

Turns out the implementation in the Cite extension doesn't care
about this capitalization at all. That's why I suggest to do the
normalization as early as possible. This is slightly different in
the Parsoid implementation.

Bug: T202593
Change-Id: I96b4a281d6020d61d1f36ec027cf833bbb244f03
2024-01-03 16:30:16 +00:00
thiemowmde 8b86a4adac Give a different error from too_many_keys when 'follow' attribute conflicts
* Same as Ie64f4ab in the Cite codebase.
* Mark the changed tests as standalone since this Parsoid code isn't yet
  released to vendor and integrated tests run with vendor.

Bug: T299280
Depends-On: Ie64f4ab4831966f66f812ea67cc244718f818afb
Change-Id: I0ea1bc3f57576d215ba4060a0e886e588ffda0b3
2024-01-02 21:27:56 +00:00
Ed Sanders 71c52196f0 Document generated CSS classes
Change-Id: Id4bc9bfbffb278372f2e234861ad521db6d43643
2024-01-02 21:05:40 +00:00
jenkins-bot d8711ce3ed Merge "Show warning when dir="…" don't match" 2023-12-21 21:24:13 +00:00
Adam Wight 4ed93908a3 Ref sequence vs. key
Internal ref key is always an int, but another string `key` is
created in the formatters.  This patch makes the typing explicit.  We
can distinguish between these two different values in a later patch.

Bug: T353451
Change-Id: Id5e40517705961f4d54622e91264430d9f62008d
2023-12-21 10:03:18 +01:00
thiemowmde 6321074484 Remove redundant PHPDoc blocks that are identical to the code
Thanks to strict types and a recent MediaWiki CodeSniffer update a
lot of the PHPDoc comments in this codebase became redundant. Only
very few comments in this codebase contain additional information.
Such comments don't add any new information to what the code alone
already says. We started removing them in many other codebases
already.

In case someone wants to add more documentation to a method the
basic PHPDoc block can usually automatically be generated with a
button press in the IDE.

The only additional change in this patch is that I occasionally
add a missing `void` return type. This is necessary to be able to
remove the comment.

Change-Id: Id7d6d6a437175a9d017f564daf7ed16e76f09158
2023-12-19 17:10:23 +01:00
thiemowmde 3a128dce4b Use correct Sanitizer method for id/fragment escaping
Same as Idf50dad in the Cite extension.

Bug: T298278
Change-Id: Idde6c21e083f42642c3dd2fe64bbd3c4d2b63847
2023-12-19 15:38:36 +00:00
thiemowmde 8094a0ebf5 Move property initializations from constructor to property
This is doing the same as before, in pretty much the same execution
order. The only difference is the syntax.

In JavaScript it's relevant to not do array initializations to early.
Otherwise different instances share the same array. But this doesn't
happen in PHP.

Change-Id: I56363ccadf29f2b806f765ab8f54a3c1863fc10f
2023-12-19 14:57:12 +00:00
thiemowmde bb01b0d74b Merge two code paths about <references> sections
I'm not sure how much this helps. But this merges two code paths
that are both about "we are in the middle of a <references> section
right now.

Nothing changes, as proven by the tests.

Bug: T353266
Change-Id: I446e224b81d35c47736a437d78527c0cc8636f77
2023-12-19 15:04:08 +01:00
thiemowmde d73a76dce6 Rename appendText() to resolveFollow()
There is only 1 user left after Icf16965.

Bug: T353266
Change-Id: I4cafdcbe0a23dd7950613a385cb552e7a84e7f26
2023-12-19 14:49:52 +01:00
thiemowmde b181614ba1 Show warning when dir="…" don't match
This classifies as a "warning" because we still show everything,
just with an error message appended.

Disabling the Parsoid tests right away hopefully makes it easier to
do the same change in Parsoid.

Bug: T202593
Depends-On: If14acd1070617ca8c4d15be6b1759bd47ead4926
Change-Id: I294b59f989f553932b40d08308906dd72d92d2cd
2023-12-19 14:17:30 +01:00
thiemowmde eeb8e28e52 Remove obsolete comment about Sanitizer::safeEncodeAttribute
By now I'm sure this really doesn't belong here. The code in the Cite
extension is doing this because it generates HTML by concatenating
plain strings. In such a context the necessary HTML entity encoding
(&quot; and such) must be done manually. Here in the Parsoid context
this is not needed.

This is split from I7249bd0. See the discussion over there.

Change-Id: I5589e5c2147bfc9f205a0ff80d8bdd247ab49c63
2023-12-18 21:02:36 +01:00
jenkins-bot d79ae15bd0 Merge "Add basic class-level documentation to more classes" 2023-12-18 12:35:58 +00:00
jenkins-bot a10437ad1d Merge "Avoid the term "book referencing" in a few places" 2023-12-18 11:58:33 +00:00
Subramanya Sastry 0754bc9ffd Fix encoding of non-breaking spaces if found in ref names
* This partly replicates the fixes in I9435a2d and Ia01f2fd. More
  to be done in later patches.
* Updated html/parsoid test output (which matches the change in the
  html/php section).

Depends-On: I401656265253a429691cc76adc5db5b129cff2cc
Change-Id: I7249bd03a7942ff7725a20178a051300b777e3a8
2023-12-15 16:38:24 +00:00
thiemowmde 54ac87e5ce Separate ReferenceStack::appendText() from setText()
This moves one more error situation into the stack class, together
with other error situations that are already there.

Bug: T353266
Change-Id: Icf169650f67f64e6d29d175c3b47cf558b8de3d4
2023-12-15 16:41:05 +01:00
thiemowmde 742a9ffbf5 Track warnings separately in ReferenceStack
Check out how this gets rid of so many "to do" as well as
"deprecated" comments.

Next qustion: The elements in the stack become more and more
complicated. It's probably worth converting them from arrays into
first-class objects. But this is for another patch.

Bug: T353266
Change-Id: If14acd1070617ca8c4d15be6b1759bd47ead4926
2023-12-15 16:41:04 +01:00
thiemowmde 13138d4ed0 Avoid the term "book referencing" in a few places
We are discussing this for a long time and finally renamed the tag
on Phabricator: https://phabricator.wikimedia.org/tag/cite-extends

This patch updates only places where it can't have any negative
consequences.

This is also a direct follow-up to Ic73f1b7 where this class was
created.

Bug: T353269
Change-Id: I644fe41d3386b9bf02b83366654301633efd535f
2023-12-15 15:49:04 +01:00
xiplus f7a181ed42 Give a different error from too_many_keys when 'follow' attribute conflicts
Add message "cite_error_ref_follow_conflicts" for tags with
conflicting parameters.

Bug: T299280
Change-Id: Ie64f4ab4831966f66f812ea67cc244718f818afb
2023-12-15 15:23:53 +01:00
thiemowmde e7dea09216 Add basic class-level documentation to more classes
Bug: T353227
Change-Id: I3953543a111121cc49f6ea89988351b80b03e828
2023-12-15 14:27:58 +01:00
thiemowmde 4377f0923d More simple and consistent @covers and @license tags
Same arguments as in Iafa2412. The one reason to use more detailled
per-method @covers annotations is to avoid "accidental coverage"
where code is marked as being covered by tests that don't assert
anything that would be meaningful for this code. This is especially a
problem with older, bigger classes with lots of side effects.

But all the new classes we introduced over the years are small, with
predictable, local effects.

That's also why we keep the more detailled @covers annotations for
the original Cite class.

Bug: T353227
Bug: T353269
Change-Id: I69850f4d740d8ad5a7c2368b9068dc91e47cc797
2023-12-15 12:12:16 +01:00
thiemowmde d0d5fbbee6 Add temporary ErrorReporter::firstError helper function
I hope this makes other refactorings a little easier.

Bug: T353266
Change-Id: Ib574d4d54ba2c8bc1310822539336ad71c4309ef
2023-12-14 17:16:49 +01:00
jenkins-bot 6fc8ee7fec Merge "Get rid of "guarded <references>" terminology" 2023-12-14 14:25:57 +00:00
jenkins-bot 9ec01ef894 Merge "Introduce named constant for "__placeholder__" string" 2023-12-14 14:20:14 +00:00
thiemowmde cb71e87b0e Introduce named constant for "__placeholder__" string
This is a concept that's only relevant when a sub-reference (formerly
known as BookReferencing) appears before the parent reference it
belongs to. Let the name reflect this.

Bug: T353227
Change-Id: Iabf259e72942ea70cb1cc1e0ca5a5d8cf15d7225
2023-12-14 09:45:06 +01:00
thiemowmde 12c7ad7504 Get rid of "guarded <references>" terminology
This patch only moves existing code around without changing any
behavior. What I basically did was merging the old "guardedReferences"
method into "references", and then splitting the resulting code in
other ways. Now we see a few other concepts emerging. But the idea
something would be "guarded" (how?) is gone.

The most critical detail in this patch are the new method names, and
how the code is split. The names should tell a story, and the methods
should do exactly what the name says. Suggestions?

Bug: T353266
Change-Id: I8b7921ce24487e9657e4193ea6a2e3e7d7b0b1c3
2023-12-14 08:44:40 +01:00
thiemowmde a6a0f66130 Extract validation to a separate class
This removes almost 200 lines from the main class.

This patch intentionally doesn't make any changes to the code but
only moves it around. Further improvements are for later patches.

Bug: T353269
Change-Id: Ic73f1b7458b3f7b7b89806a88a1111161e3cf094
2023-12-14 07:43:29 +00:00
jenkins-bot bf53249893 Merge "Move a bit of code out of Cite::guardedReferences" 2023-12-14 02:10:06 +00:00
jenkins-bot 61850c091a Merge "Remove PHPDocs that just repeat what the code already says" 2023-12-14 02:04:15 +00:00
Sam Wilson b03dd1bba8 Load WikiEditor ref toolbar button on other content types
Allow other extensions to provide lists of page content
models for which they want to load the Cite toolbar button.
This will, for example, make it possible for ProofreadPage
to have the button on Page pages.

Bug: T348403
Change-Id: Id28cb0b6cb8a2b86a66b17232575afe513969c54
2023-12-13 21:45:19 +00:00
thiemowmde 04208b5fd1 Remove PHPDocs that just repeat what the code already says
We removed a bunch of now redundant docs already, see e.g. Ie0692fa.

Change-Id: I55c62d935bdec8bedaebc2666fca3eb17309b0c7
2023-12-13 12:44:41 +01:00
jenkins-bot c3aa27f2a1 Merge "Avoid a few isset() in favor of more recent syntax" 2023-12-12 23:58:45 +00:00