mediawiki-extensions-Cite/src/AnchorFormatter.php
thiemowmde ddda536792 Drop unused cite_reference(s)_link_prefix messages
Same as Icfa8215 where we removed the …_suffix messages.

This patch is not blocked on anything according to CodeSearch:
https://codesearch.wmcloud.org/search/?q=cite_references%3F_link_prefix

According to GlobalSearch there are 2 usages we need to talk about:
https://global-search.toolforge.org/?q=.&regex=1&namespaces=8&title=Cite.references%3F.link.prefix.*

zh.wiktionary replaces "cite_ref-" with "_ref-", and "cite_note-"
with "_note-", i.e. they did nothing but remove the word "cite". This
happened in 2006, with no explanation.

ka.wikibooks and ka.wikiquote replace "cite_note-" with "_შენიშვნა-",
which translates back to "_note-". One user did this in 2007,
16 seconds apart.

It appears that both are attempts to localize whatever can be
localized, whether or not it is really necessary.
https://zh.wiktionary.org/wiki/Special:Contributions/Shibo77?offset=20060510
https://ka.wikiquote.org/wiki/Special:Contributions/Trulala?offset=20070219
Note how one user experimented with an "a" in some of the edits to
see what effect the change might have, only to immediately revert it.

The modifications don't really have an effect on anything, except on
the anchors in the resulting <a href="#_ref-5"> and <sup id="_ref-5">
HTML. It might also be briefly visible in the browser's address bar
when such a link is clicked. We can only assume the two users did this
to make the URL appear shorter (?). A discussion apparently never
happened. Both users are inactive.

Both pieces of HTML are generated in the Cite code. Removing the
messages will change all places at the same time. All links will
continue to work. The only possible effect is that hard-coded
weblinks to an individual reference will link to the top of the
article instead. But:
a) This is extremely unlikely to happen. There is no reason to link
   to a reference from outside of the article.
b) Such links are not guaranteed to work anyway as they can break
   for a multitude of other reasons, e.g. the <ref> being renamed,
   removed, or replaced.
c) Even if such a link breaks, it still links to the correct article.

There is also no on-wiki code on zh.wiktionary that would do anything
with the shortened prefix:
https://zh.wiktionary.org/w/index.php?search=insource%3A%2F_%28ref%7Cnote%29-%2F&title=Special%3A%E6%90%9C%E7%B4%A2&profile=advanced&fulltext=1&ns2=1&ns4=1&ns8=1&ns10=1&ns12=1&ns828=1&ns2300=1

I argue this is safe to remove, even without contacting the mentioned
communities first.

Bug: T321217
Change-Id: I160a119710dc35679dbdc2f39ddf453dbd5a5dfa
2024-01-04 13:17:42 +01:00


<?php

namespace Cite;

use MediaWiki\Parser\Sanitizer;

/**
 * Compiles unique identifiers and formats them as anchors for use in `href="#…"` and `id="…"`
 * attributes.
 *
 * @license GPL-2.0-or-later
 */
class AnchorFormatter {

	/**
	 * Return an id for use in wikitext output based on a key and
	 * optionally the number of it, used in <references>, not <ref>
	 * (since otherwise it would link to itself)
	 *
	 * @param string|int $key
	 * @param string|null $num The number of the key
	 *
	 * @return string
	 */
	private function refKey( $key, ?string $num ): string {
		if ( $num !== null ) {
			$key = $key . '_' . $num;
		}
		return $this->normalizeKey( "cite_ref-$key" );
	}

	/**
	 * @param string|int $key
	 * @param string|null $num
	 * @return string Escaped to be used as part of a [[#…]] link
	 */
	public function backLink( $key, ?string $num = null ): string {
		$key = $this->refKey( $key, $num );
		// This does both URL encoding (e.g. %A0, which only makes sense in href="…") and HTML
		// entity encoding (e.g. &#xA0;). The browser will decode in reverse order.
		return Sanitizer::safeEncodeAttribute( Sanitizer::escapeIdForLink( $key ) );
	}

	/**
	 * @param string|int $key
	 * @param string|null $num
	 * @return string Already escaped to be used directly in an id="…" attribute
	 */
	public function backLinkTarget( $key, ?string $num ): string {
		$key = $this->refKey( $key, $num );
		return Sanitizer::safeEncodeAttribute( $key );
	}

	/**
	 * Return an id for use in wikitext output based on a key and
	 * optionally the number of it, used in <ref>, not <references>
	 * (since otherwise it would link to itself)
	 *
	 * @param string $key
	 *
	 * @return string
	 */
	private function getReferencesKey( string $key ): string {
		return $this->normalizeKey( "cite_note-$key" );
	}

	/**
	 * @param string $key
	 * @return string Escaped to be used as part of a [[#…]] link
	 */
	public function jumpLink( string $key ): string {
		$key = $this->getReferencesKey( $key );
		// This does both URL encoding (e.g. %A0, which only makes sense in href="…") and HTML
		// entity encoding (e.g. &#xA0;). The browser will decode in reverse order.
		return Sanitizer::safeEncodeAttribute( Sanitizer::escapeIdForLink( $key ) );
	}

	/**
	 * @param string $key
	 * @return string Already escaped to be used directly in an id="…" attribute
	 */
	public function jumpLinkTarget( string $key ): string {
		$key = $this->getReferencesKey( $key );
		return Sanitizer::safeEncodeAttribute( $key );
	}

	private function normalizeKey( string $key ): string {
		// MediaWiki normalizes spaces and underscores in [[#…]] links, but not in id="…"
		// attributes. To make them behave the same we normalize in advance.
		return preg_replace( '/[_\s]+/u', '_', $key );
	}

}
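
The sketch below is one way to see how the hard-coded "cite_ref-" prefix and the key normalization combine, without a MediaWiki installation. The function names sketchNormalizeKey() and sketchRefKey() are hypothetical stand-ins for the private normalizeKey() and refKey() methods. The Sanitizer escaping steps are deliberately left out, so the results are raw identifiers, not attribute-safe strings.

```php
<?php
// Hypothetical standalone helpers that mirror normalizeKey() and refKey().
// Assumption: no MediaWiki Sanitizer is available, so nothing is escaped here.

function sketchNormalizeKey( string $key ): string {
	// Collapse every run of underscores and/or whitespace into one underscore.
	return preg_replace( '/[_\s]+/u', '_', $key );
}

function sketchRefKey( $key, ?string $num = null ): string {
	if ( $num !== null ) {
		// Append the optional number, separated by an underscore.
		$key = $key . '_' . $num;
	}
	// The prefix is fixed in code; it is no longer read from a message.
	return sketchNormalizeKey( "cite_ref-$key" );
}

echo sketchRefKey( 'my  source', '2' ), "\n";
echo sketchNormalizeKey( 'cite_note-a _ b' ), "\n";
```

Because spaces and underscores collapse to a single underscore, keys that differ only in such runs map to the same identifier, which matches the normalization MediaWiki applies to [[#…]] links.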