mediawiki-extensions-Discus.../includes/CommentItem.php

199 lines
5.6 KiB
PHP
Raw Normal View History

<?php
namespace MediaWiki\Extension\DiscussionTools;
use DOMXPath;
use MWException;
use Title;
Don't refer directly to PHP `dom` extension classes; avoid nonstandard behavior These changes ensure that DiscussionTools is independent of DOM library choice, and will not break if/when Parsoid switches to an alternate (more standards-compliant) DOM library. We run `phan` against the Dodo standards-compliant DOM library, so this ends up flagging uses of non-standard PHP extensions to the DOM. These will be suppressed for now with a "Nonstandard DOM" comment that can be grepped for, since they will eventually will need to be rewritten or worked around. Most frequent issues: * Node::nodeValue and Node::textContent and Element::getAttribute() can return null in a spec-compliant implementation. Add `?? ''` to make spec-compliant results consistent w/ what PHP returns. * DOMXPath doesn't accept anything except DOMDocument. These uses should be replaced with DOMCompat::querySelectorAll() or similar (which end up using DOMXPath under the covers for DOMDocument any way, but are implemented more efficiently in a spec-compliant implementation). * A couple of times we have code like: `while ($node->firstChild!==null) { $node = $node->firstChild; }` and phan's analysis isn't strong enough to determine that $node is still non-null after the while. This same issue should appear with DOMDocument but phan doesn't complain for some reason. One apparently legit issue: * Node::insertBefore() is once called in a funny way which leans on the fact that the second option is optional in PHP. This seems to be a workaround for an ancient PHP bug, and can probably be safely removed. Bug: T287611 Bug: T217867 Change-Id: I3c4f41c3819770f85d68157c9f690d650b7266a3
2021-07-29 02:16:15 +00:00
use Wikimedia\Parsoid\DOM\DocumentFragment;
use Wikimedia\Parsoid\DOM\Text;
use Wikimedia\Parsoid\Utils\DOMCompat;
class CommentItem extends ThreadItem {
private $signatureRanges;
private $timestamp;
private $author;
/**
* @param int $level
* @param ImmutableRange $range
* @param ImmutableRange[] $signatureRanges Objects describing the extent of signatures (plus
* timestamps) for this comment. There is always at least one signature, but there may be
* multiple. The author and timestamp of the comment is determined from the first signature.
* The last node in every signature range is a node containing the timestamp.
* @param string $timestamp
* @param string $author Comment author's username
*/
public function __construct(
int $level, ImmutableRange $range,
array $signatureRanges, string $timestamp, string $author
) {
parent::__construct( 'comment', $level, $range );
$this->signatureRanges = $signatureRanges;
$this->timestamp = $timestamp;
$this->author = $author;
}
/**
* @return array JSON-serializable array
*/
public function jsonSerialize(): array {
return array_merge( parent::jsonSerialize(), [
'timestamp' => $this->timestamp,
'author' => $this->author,
] );
}
/**
* Get the HTML of this comment's body
*
* @param bool $stripTrailingSeparator Strip a trailing separator between the body and
* the signature which consists of whitespace and hyphens e.g. ' --'
Don't refer directly to PHP `dom` extension classes; avoid nonstandard behavior These changes ensure that DiscussionTools is independent of DOM library choice, and will not break if/when Parsoid switches to an alternate (more standards-compliant) DOM library. We run `phan` against the Dodo standards-compliant DOM library, so this ends up flagging uses of non-standard PHP extensions to the DOM. These will be suppressed for now with a "Nonstandard DOM" comment that can be grepped for, since they will eventually will need to be rewritten or worked around. Most frequent issues: * Node::nodeValue and Node::textContent and Element::getAttribute() can return null in a spec-compliant implementation. Add `?? ''` to make spec-compliant results consistent w/ what PHP returns. * DOMXPath doesn't accept anything except DOMDocument. These uses should be replaced with DOMCompat::querySelectorAll() or similar (which end up using DOMXPath under the covers for DOMDocument any way, but are implemented more efficiently in a spec-compliant implementation). * A couple of times we have code like: `while ($node->firstChild!==null) { $node = $node->firstChild; }` and phan's analysis isn't strong enough to determine that $node is still non-null after the while. This same issue should appear with DOMDocument but phan doesn't complain for some reason. One apparently legit issue: * Node::insertBefore() is once called in a funny way which leans on the fact that the second option is optional in PHP. This seems to be a workaround for an ancient PHP bug, and can probably be safely removed. Bug: T287611 Bug: T217867 Change-Id: I3c4f41c3819770f85d68157c9f690d650b7266a3
2021-07-29 02:16:15 +00:00
* @return DocumentFragment Cloned fragment of the body content
*/
Don't refer directly to PHP `dom` extension classes; avoid nonstandard behavior These changes ensure that DiscussionTools is independent of DOM library choice, and will not break if/when Parsoid switches to an alternate (more standards-compliant) DOM library. We run `phan` against the Dodo standards-compliant DOM library, so this ends up flagging uses of non-standard PHP extensions to the DOM. These will be suppressed for now with a "Nonstandard DOM" comment that can be grepped for, since they will eventually will need to be rewritten or worked around. Most frequent issues: * Node::nodeValue and Node::textContent and Element::getAttribute() can return null in a spec-compliant implementation. Add `?? ''` to make spec-compliant results consistent w/ what PHP returns. * DOMXPath doesn't accept anything except DOMDocument. These uses should be replaced with DOMCompat::querySelectorAll() or similar (which end up using DOMXPath under the covers for DOMDocument any way, but are implemented more efficiently in a spec-compliant implementation). * A couple of times we have code like: `while ($node->firstChild!==null) { $node = $node->firstChild; }` and phan's analysis isn't strong enough to determine that $node is still non-null after the while. This same issue should appear with DOMDocument but phan doesn't complain for some reason. One apparently legit issue: * Node::insertBefore() is once called in a funny way which leans on the fact that the second option is optional in PHP. This seems to be a workaround for an ancient PHP bug, and can probably be safely removed. Bug: T287611 Bug: T217867 Change-Id: I3c4f41c3819770f85d68157c9f690d650b7266a3
2021-07-29 02:16:15 +00:00
private function getBodyFragment( bool $stripTrailingSeparator = false ): DocumentFragment {
$fragment = $this->getBodyRange()->cloneContents();
CommentModifier::unwrapFragment( $fragment );
if ( $stripTrailingSeparator ) {
// Find a trailing text node
$lastChild = $fragment->lastChild;
while (
Don't refer directly to PHP `dom` extension classes; avoid nonstandard behavior These changes ensure that DiscussionTools is independent of DOM library choice, and will not break if/when Parsoid switches to an alternate (more standards-compliant) DOM library. We run `phan` against the Dodo standards-compliant DOM library, so this ends up flagging uses of non-standard PHP extensions to the DOM. These will be suppressed for now with a "Nonstandard DOM" comment that can be grepped for, since they will eventually will need to be rewritten or worked around. Most frequent issues: * Node::nodeValue and Node::textContent and Element::getAttribute() can return null in a spec-compliant implementation. Add `?? ''` to make spec-compliant results consistent w/ what PHP returns. * DOMXPath doesn't accept anything except DOMDocument. These uses should be replaced with DOMCompat::querySelectorAll() or similar (which end up using DOMXPath under the covers for DOMDocument any way, but are implemented more efficiently in a spec-compliant implementation). * A couple of times we have code like: `while ($node->firstChild!==null) { $node = $node->firstChild; }` and phan's analysis isn't strong enough to determine that $node is still non-null after the while. This same issue should appear with DOMDocument but phan doesn't complain for some reason. One apparently legit issue: * Node::insertBefore() is once called in a funny way which leans on the fact that the second option is optional in PHP. This seems to be a workaround for an ancient PHP bug, and can probably be safely removed. Bug: T287611 Bug: T217867 Change-Id: I3c4f41c3819770f85d68157c9f690d650b7266a3
2021-07-29 02:16:15 +00:00
!( $lastChild instanceof Text ) &&
$lastChild->lastChild
) {
$lastChild = $lastChild->lastChild;
}
if (
Don't refer directly to PHP `dom` extension classes; avoid nonstandard behavior These changes ensure that DiscussionTools is independent of DOM library choice, and will not break if/when Parsoid switches to an alternate (more standards-compliant) DOM library. We run `phan` against the Dodo standards-compliant DOM library, so this ends up flagging uses of non-standard PHP extensions to the DOM. These will be suppressed for now with a "Nonstandard DOM" comment that can be grepped for, since they will eventually will need to be rewritten or worked around. Most frequent issues: * Node::nodeValue and Node::textContent and Element::getAttribute() can return null in a spec-compliant implementation. Add `?? ''` to make spec-compliant results consistent w/ what PHP returns. * DOMXPath doesn't accept anything except DOMDocument. These uses should be replaced with DOMCompat::querySelectorAll() or similar (which end up using DOMXPath under the covers for DOMDocument any way, but are implemented more efficiently in a spec-compliant implementation). * A couple of times we have code like: `while ($node->firstChild!==null) { $node = $node->firstChild; }` and phan's analysis isn't strong enough to determine that $node is still non-null after the while. This same issue should appear with DOMDocument but phan doesn't complain for some reason. One apparently legit issue: * Node::insertBefore() is once called in a funny way which leans on the fact that the second option is optional in PHP. This seems to be a workaround for an ancient PHP bug, and can probably be safely removed. Bug: T287611 Bug: T217867 Change-Id: I3c4f41c3819770f85d68157c9f690d650b7266a3
2021-07-29 02:16:15 +00:00
$lastChild instanceof Text &&
preg_match( '/[\s\-~\x{2010}-\x{2015}\x{2043}\x{2060}]+$/u', $lastChild->nodeValue ?? '', $matches )
) {
$lastChild->nodeValue =
Don't refer directly to PHP `dom` extension classes; avoid nonstandard behavior These changes ensure that DiscussionTools is independent of DOM library choice, and will not break if/when Parsoid switches to an alternate (more standards-compliant) DOM library. We run `phan` against the Dodo standards-compliant DOM library, so this ends up flagging uses of non-standard PHP extensions to the DOM. These will be suppressed for now with a "Nonstandard DOM" comment that can be grepped for, since they will eventually will need to be rewritten or worked around. Most frequent issues: * Node::nodeValue and Node::textContent and Element::getAttribute() can return null in a spec-compliant implementation. Add `?? ''` to make spec-compliant results consistent w/ what PHP returns. * DOMXPath doesn't accept anything except DOMDocument. These uses should be replaced with DOMCompat::querySelectorAll() or similar (which end up using DOMXPath under the covers for DOMDocument any way, but are implemented more efficiently in a spec-compliant implementation). * A couple of times we have code like: `while ($node->firstChild!==null) { $node = $node->firstChild; }` and phan's analysis isn't strong enough to determine that $node is still non-null after the while. This same issue should appear with DOMDocument but phan doesn't complain for some reason. One apparently legit issue: * Node::insertBefore() is once called in a funny way which leans on the fact that the second option is optional in PHP. This seems to be a workaround for an ancient PHP bug, and can probably be safely removed. Bug: T287611 Bug: T217867 Change-Id: I3c4f41c3819770f85d68157c9f690d650b7266a3
2021-07-29 02:16:15 +00:00
substr( $lastChild->nodeValue ?? '', 0, -strlen( $matches[0] ) );
}
}
return $fragment;
}
/**
* Get the HTML of this comment's body
*
*
* @param bool $stripTrailingSeparator See getBodyFragment
* @return string HTML
*/
public function getBodyHTML( bool $stripTrailingSeparator = false ): string {
$fragment = $this->getBodyFragment( $stripTrailingSeparator );
$container = $fragment->ownerDocument->createElement( 'div' );
$container->appendChild( $fragment );
return DOMCompat::getInnerHTML( $container );
}
/**
* Get the text of this comment's body
*
* @param bool $stripTrailingSeparator See getBodyFragment
* @return string Text
*/
public function getBodyText( bool $stripTrailingSeparator = false ): string {
$fragment = $this->getBodyFragment( $stripTrailingSeparator );
Don't refer directly to PHP `dom` extension classes; avoid nonstandard behavior These changes ensure that DiscussionTools is independent of DOM library choice, and will not break if/when Parsoid switches to an alternate (more standards-compliant) DOM library. We run `phan` against the Dodo standards-compliant DOM library, so this ends up flagging uses of non-standard PHP extensions to the DOM. These will be suppressed for now with a "Nonstandard DOM" comment that can be grepped for, since they will eventually will need to be rewritten or worked around. Most frequent issues: * Node::nodeValue and Node::textContent and Element::getAttribute() can return null in a spec-compliant implementation. Add `?? ''` to make spec-compliant results consistent w/ what PHP returns. * DOMXPath doesn't accept anything except DOMDocument. These uses should be replaced with DOMCompat::querySelectorAll() or similar (which end up using DOMXPath under the covers for DOMDocument any way, but are implemented more efficiently in a spec-compliant implementation). * A couple of times we have code like: `while ($node->firstChild!==null) { $node = $node->firstChild; }` and phan's analysis isn't strong enough to determine that $node is still non-null after the while. This same issue should appear with DOMDocument but phan doesn't complain for some reason. One apparently legit issue: * Node::insertBefore() is once called in a funny way which leans on the fact that the second option is optional in PHP. This seems to be a workaround for an ancient PHP bug, and can probably be safely removed. Bug: T287611 Bug: T217867 Change-Id: I3c4f41c3819770f85d68157c9f690d650b7266a3
2021-07-29 02:16:15 +00:00
return $fragment->textContent ?? '';
}
/**
* Get a list of all users mentioned
*
* @return Title[] Title objects for mentioned user pages
*/
public function getMentions(): array {
$fragment = $this->getBodyRange()->cloneContents();
Don't refer directly to PHP `dom` extension classes; avoid nonstandard behavior These changes ensure that DiscussionTools is independent of DOM library choice, and will not break if/when Parsoid switches to an alternate (more standards-compliant) DOM library. We run `phan` against the Dodo standards-compliant DOM library, so this ends up flagging uses of non-standard PHP extensions to the DOM. These will be suppressed for now with a "Nonstandard DOM" comment that can be grepped for, since they will eventually will need to be rewritten or worked around. Most frequent issues: * Node::nodeValue and Node::textContent and Element::getAttribute() can return null in a spec-compliant implementation. Add `?? ''` to make spec-compliant results consistent w/ what PHP returns. * DOMXPath doesn't accept anything except DOMDocument. These uses should be replaced with DOMCompat::querySelectorAll() or similar (which end up using DOMXPath under the covers for DOMDocument any way, but are implemented more efficiently in a spec-compliant implementation). * A couple of times we have code like: `while ($node->firstChild!==null) { $node = $node->firstChild; }` and phan's analysis isn't strong enough to determine that $node is still non-null after the while. This same issue should appear with DOMDocument but phan doesn't complain for some reason. One apparently legit issue: * Node::insertBefore() is once called in a funny way which leans on the fact that the second option is optional in PHP. This seems to be a workaround for an ancient PHP bug, and can probably be safely removed. Bug: T287611 Bug: T217867 Change-Id: I3c4f41c3819770f85d68157c9f690d650b7266a3
2021-07-29 02:16:15 +00:00
// XXX use DOMCompat::querySelectorAll('a[href]') perhaps
// @phan-suppress-next-line PhanTypeMismatchArgumentInternal Nonstandard DOM
$xPath = new DOMXPath( $fragment->ownerDocument );
Don't refer directly to PHP `dom` extension classes; avoid nonstandard behavior These changes ensure that DiscussionTools is independent of DOM library choice, and will not break if/when Parsoid switches to an alternate (more standards-compliant) DOM library. We run `phan` against the Dodo standards-compliant DOM library, so this ends up flagging uses of non-standard PHP extensions to the DOM. These will be suppressed for now with a "Nonstandard DOM" comment that can be grepped for, since they will eventually will need to be rewritten or worked around. Most frequent issues: * Node::nodeValue and Node::textContent and Element::getAttribute() can return null in a spec-compliant implementation. Add `?? ''` to make spec-compliant results consistent w/ what PHP returns. * DOMXPath doesn't accept anything except DOMDocument. These uses should be replaced with DOMCompat::querySelectorAll() or similar (which end up using DOMXPath under the covers for DOMDocument any way, but are implemented more efficiently in a spec-compliant implementation). * A couple of times we have code like: `while ($node->firstChild!==null) { $node = $node->firstChild; }` and phan's analysis isn't strong enough to determine that $node is still non-null after the while. This same issue should appear with DOMDocument but phan doesn't complain for some reason. One apparently legit issue: * Node::insertBefore() is once called in a funny way which leans on the fact that the second option is optional in PHP. This seems to be a workaround for an ancient PHP bug, and can probably be safely removed. Bug: T287611 Bug: T217867 Change-Id: I3c4f41c3819770f85d68157c9f690d650b7266a3
2021-07-29 02:16:15 +00:00
// @phan-suppress-next-line PhanTypeMismatchArgumentInternal Nonstandard DOM
$links = $xPath->query( './/a', $fragment );
$users = [];
foreach ( $links as $link ) {
$title = CommentUtils::getTitleFromUrl( $link->getAttribute( 'href' ) );
if ( $title && $title->getNamespace() === NS_USER ) {
// TODO: Consider returning User objects
$users[] = $title;
}
}
return array_unique( $users );
}
/**
* @return ImmutableRange[] Comment signature ranges
*/
public function getSignatureRanges(): array {
return $this->signatureRanges;
}
/**
* @return ImmutableRange Range of the thread item's "body"
*/
public function getBodyRange(): ImmutableRange {
// Exclude last signature from body
$signatureRanges = $this->getSignatureRanges();
$lastSignature = end( $signatureRanges );
return $this->getRange()->setEnd( $lastSignature->startContainer, $lastSignature->startOffset );
}
/**
* @return string Comment timestamp
*/
public function getTimestamp(): string {
return $this->timestamp;
}
/**
* @return string Comment author
*/
public function getAuthor(): string {
return $this->author;
}
/**
* @return HeadingItem Closest ancestor which is a HeadingItem
*/
public function getHeading(): HeadingItem {
$parent = $this;
while ( $parent instanceof CommentItem ) {
$parent = $parent->getParent();
}
if ( !( $parent instanceof HeadingItem ) ) {
throw new MWException( 'heading parent not found' );
}
return $parent;
}
/**
* @param ImmutableRange $signatureRange Comment signature range to add
*/
public function addSignatureRange( ImmutableRange $signatureRange ): void {
$this->signatureRanges[] = $signatureRange;
}
/**
* @param ImmutableRange[] $signatureRanges Comment signature ranges
*/
public function setSignatureRanges( array $signatureRanges ): void {
$this->signatureRanges = $signatureRanges;
}
/**
* @param string $timestamp Comment timestamp
*/
public function setTimestamp( string $timestamp ): void {
$this->timestamp = $timestamp;
}
/**
* @param string $author Comment author
*/
public function setAuthor( string $author ): void {
$this->author = $author;
}
}