The existing comment IDs can't be used to find the same comment on
a different revision or page (when it's transcluded), because they
depend on the comment's parent and its position on the page.
Comment names depend only on the author and timestamp. The trade-off
is that they can't distinguish comments posted within the same minute,
or in the same edit, so we will still need the IDs sometimes.
Prefer using comment names when replying, if they're not ambiguous.
This fixes T273413 and T275821.
Heading names depend on the author and timestamp of the oldest comment.
This way we don't have to detect changes to the heading text, but we
can't distinguish headings without any comments.
Bug: T274685
Bug: T273413
Bug: T275821
Change-Id: Id85c50ba38d1e532cec106708c077b908a3fcd49
The code we're testing already produces a string of serialized HTML,
no need to parse and re-serialize it.
Also, we recently learned that the precise format matters here
(T274709), and now this test *actually* covers the fix for that bug.
Follow-up to 5b26e9664b.
As a downside, this test might now spuriously fail if the format of
the output of Parsoid's XMLSerializer changes. Hopefully that won't
happen too often.
Change-Id: I69b514f545e47dcb437fb39a83edb8e2f19ed99b
Now it detect signatures generated by en.wp's {{Undated}} template,
and signatures of people who do weird stuff to the timestamps.
Bug: T275938
Change-Id: I27b07f6786ca5433a3c02a5fe68e4716d41401bb
Documentation:
https://www.mediawiki.org/wiki/Manual:PHP_unit_testing/Writing_unit_tests_for_extensions#Two_types_of_tests
We can do this because the tested methods do not depend on any globals
or on MediaWiki being installed.
In addition to being the new hotness, MediaWikiUnitTestCase allows the
test classes that use it instead of MediaWikiTestCase to start up much
faster. In my testing, running this test case individually now takes
0.35s, compared to 1.1s before.
Try:
* With new code:
time php tests/phpunit/phpunit.php extensions/DiscussionTools/tests/phpunit/unit/CommentUtilsTest.php
* With old code:
time php tests/phpunit/phpunit.php extensions/DiscussionTools/tests/phpunit/CommentUtilsTest.php
Change-Id: I771b1f3d101a394ee869e42547d9ae7839397752
Add yet another tree walking utility: CommentUtils::linearWalk().
Unlike TreeWalker, it allows handling the beginnings and ends of nodes
separately – kind of like parsing a XML token stream, or kind of like
VisualEditor's linear model.
(Add unit tests for this utility. The simple.html test case is copied
from [VisualEditor/VisualEditor]/demos/ve/pages/simple.html.)
Use this utility to stop skipping when we reach either a closing or
opening block node tag. Previously we'd skip over such tags inside
nested "transparent" nodes (like <a>, <del>, or apparently <font>).
Bug: T271385
Change-Id: I201a942eb3a56335e84d94e150ec2c33f8b4f4e0
As CommentFormatter no longer needs HTMLFormatter, remove
the inheritance and make addReplyLinks a static method.
Testing locally this is marginally slower, going from 2.55s
to 2.9s for the CommentFormatterTest case.
Bug: T266317
Bug: T267973
Change-Id: If69749cae678a1647a138d782a32032189f55cec
After recent changes allowing ThreadItems to have IDs, they can now
also have warnings about duplicate IDs.
Bug: T267035
Change-Id: If3edfe34e6e29741e29fac8946a3c88badc4ab7f
We avoided fixing these because it causes changes in just about all of
the test data, which is annoying when reviewing or blaming changes.
But the previous several commits also caused changes in just about all
of the test data, so we might as well do this too.
Change-Id: I83b64d83b6f12c04dc06c0cadff7cdd89417e137
Our threads now also contain all replies to their sub-threads.
This is similar to how sections work in MediaWiki, where the parent
section also contains the content of all the lower-level sections.
We're going to need this for notifications about replies in a thread.
Bug: T264478
Change-Id: I241fc58e2088a7555942824b0f184ed21e3a8b6f
I haven't really reviewed the outputs, but at least a) they don't
crash b) they will fail if the output suddenly changes (which could
cause problems due to caching).
Bug: T252555
Change-Id: I1bbcbc5dd17ce1e24b3622062f5e8df4baf5f389
Also, add tests covering this and the previous bug fixes in this code
(T259818, T261706).
Note that the test data added in tests/cases/ doesn't exactly match
the entire configuration of the wiki, only the parts we want to cover.
This is unlike the data in tests/data/, which was literally copied
from the relevant wikis, and which is used as input for other tests.
Bug: T265500
Change-Id: I29a59a5952f6dc9fb5910434bb6bcc9dcdaa01a9
* Export parser data (date format, digits, timezone names, and
messages for weekday/month names) converted to language variants
* Update the parsers to try matching using every variant, in case
the page is displayed in non-default variant (and to avoid
problems with incomplete variant conversion)
Bug: T259818
Change-Id: I04d73992cd31ce06fa79f87df0c0a53d7efc3c58
* Remove 'wgMetaNamespace' and 'wgMetaNamespaceTalk', the same data
exists in 'wgFormattedNamespaces'.
* Rename 'wgContentLang' to 'wgContentLanguage', to match its real
name in JS config. MediaWiki doesn't use 'wgContentLang' anywhere,
although the related PHP global is called $wgContLang.
* Document how I made these files, previously only mentioned in the
commit message of e9c401e3aa.
Change-Id: I67f962812c155aedf41154e0d837e7feb5af972d
Causes page corruption, in a new way we haven't seen before.
* Revert "Move page updating logic to controller.js"
This reverts commit 54fdc6de06.
* Revert "ReplyWidget: Move clear methods from #teardown to #clear"
This reverts commit 9b811a94e0.
* Revert "ApiDiscussionToolsEdit: Do not pass 'basetimestamp'"
This reverts commit 7de5938a6f.
* Revert "Use DOMCompat::getOuterHTML instead of doc->saveHTML()"
This reverts commit 7b2448d2f0.
* Revert "CommentController: Remove remains of client-side edit conflict handling"
This reverts commit 2d038af705.
* Revert "Restore error message for when comment is deleted while replying"
This reverts commit 655c0526d6.
* Revert "Use transcluded from API to avoid ever fetching Parsoid DOM in client"
This reverts commit 9d0fc184fe.
* Revert "Create a 'transcludedfrom' API endpoint"
This reverts commit 5d8f3b9051.
* Revert "Edit API for replies"
This reverts commit 8829a1a412.
Bug: T259855
Change-Id: I6419408c6194ec0afa6b8ee604b12c1a24c6ac7b
Previously, parser would output offsets that don't exist in their
containers, because we were pretending that entities are parts of
their neighboring text nodes.
Turns out it's much easier to do it right when going backwards.
Change-Id: I9bccca2d403f1a976ae517449989170cdd99721e
Follow-up to ccd9e411d2.
* Fix variable name in CommentTestCase::overwriteHtmlFile()
* Overwrite before assertions, because they abort execution if they fail
Change-Id: I5bba016ba93f9dd1994325ae82c3105ba11cf033
The latter results in lots of extra HTML entity encoding.
The former is built by the Parsing team and appears to result
in no unexpected changes elsewhere in the document.
As Parsoid's selser relies on HTML fragments being byte-for-byte
equal, these changes were resulting in wikitext normalisations
in untouched parts of the document ("dirty diffs").
Bug: T259855
Change-Id: Ib3cb605911e690ec3e8c2f9df25fd1a2e2849d7e
This is similar to the code we already have in JS tests, but instead
of printing to the console where you have to copy-paste from, it just
overwrites the files.
Also, update all of the expected results by this method.
Changes in the expected outputs:
* In JSON files, the "warnings" are now always in the same place
regardless of the type of the warning.
* In all HTML files, self-closing tags now include the trailing slash,
some characters are no longer encoded as entities when not necessary,
and attributes may be single-quoted when that makes them shorter.
* In Parsoid HTML files, the header is no longer terribly mangled.
Other notes:
* CommentParserTest.php: Change the output of serializeComments()
to be in similar order as in JS, to reduce the diffs in this commit
and because it's a better order for humans.
* modifier.test.js: Remove some hacks that were working around small
inconsistencies between the previous expected outputs and the actual
outputs.
Change-Id: I9f764640dae823321c0ac35898fa4db03f1ca364
* Remove the existing approach for detecting signatures that only
worked in source mode; remove autoSignWikitext()
* Use the same approach for auto-signing in source mode as we have
already used in visual
* In both modes, detect whether the user has already typed a signature
at the end of their comment in the modifier, and if so, don't add a
signature
* Add test cases for the detection
Bug: T255738
Change-Id: I791d3035cb1ffc33ce3966d4617a25d08700c35b
* Pass rootNode to the constructor
* Rename getters to match CommentItem/HeadingItem/ThreadItem
value classes.
* Always build the thread tree so CommentItem's always have
and ID and replies/parent.
Change-Id: I508be9534de59016ff806e3d84edcbb1c76cb0c6
Rather than the <body> node, we were passing <body>'s first child.
Current implementation of CommentParser::getComments() doesn't fail
the tests in spite of this because the XPath query incorrectly returns
results relative to the document's real root node, but these tests
would start failing after I2441f33e6e7bad753ac830d277e6a2e81ee8c93d.
Follow-up to 3e6ab2c4d2.
Change-Id: Ic26e0a1ee4443987e215c5f26ef1f084ccd0b40b
The expected HTML was wrong, a '<br />' tag inside 'data-mw' was
somehow turned into '<br ></span>'. No idea how that happened.
Something must be wrong with the HTML parsing in JS tests, which
were used to generate this file.
Change-Id: I69caa68fe70e706df81e8adf29889254704f601e
This method shouldn't be required on the server. Leave comments
relating to it in addListItem so JS & PHP can be kept in sync.
Change-Id: I849fac660faf6e750272c20776f96b9250f96b1b
In JS, strings are internally encoded as UTF-16, and properties like
.length return values in UTF-16 code units.
In PHP, strings are internally encoded as UTF-8, and we have the
option of using methods that return bytes like strlen() or UTF-8 code
units like mb_strlen().
However, the values produced by preg_match( …, PREG_OFFSET_CAPTURE )
are in bytes, and there's nothing we can do about that. So let's use
bytes throughout, mixing the two types results in meaningless numbers.
Then in the test code, we have to calculate UTF-16 code units offsets
based on the UTF-8 byte offsets.
We also have to copy the entire workaround for mw:Entity nodes… Maybe
the parser should be fixed to return the real nodes for ranges' ends
in this case.
Change-Id: I05804489d7de0d60be6e9f84e6a49a885e9fb870
In JS tests, we load the documents via mw.template, which apparently
causes the <html>, <head> and <body> tags to disappear, resulting
in the ranges not matching in PHP tests (and the real document).
Put in a big hack that makes them match, and update the JSON files.
Change-Id: I8194752cd5f82c3716c99e76a37226af5d4a0ec1
Profiling reveals that >87% of the run time of our test suite is spent
in this tiny method. Apparently, DOMNodeList::item() is extremely slow
(possibly it's linear time instead of constant time?).
Profiled using XDebug and KCacheGrind:
https://phabricator.wikimedia.org/F31815264
We can calculate the child's index in its parent by counting its
precending siblings instead, which turns out to be much faster.
Before:
1. 275444ms to run DiscussionToolsCommentParserTest:testGetComments with data set #2
2. 12668ms to run DiscussionToolsCommentParserTest:testGetComments with data set #3
...
After:
1. 9545ms to run DiscussionToolsCommentParserTest:testGetComments with data set #2
2. 5549ms to run DiscussionToolsCommentParserTest:testGetComments with data set #3
...
That's still kind of slow but now it's bearable to run the test suite.
Change-Id: I49155f7aa2e231a9a20bf282cf6aaa28fc902e0b
* Not to be confused with the Parsing Team's
"Great Parser JS to PHP port of 2019"
Gasp as OR hacks are changed to null coalescing operators.
Applaud as variable declarations are dropped.
Cheer as parameters and return values are type-hinted.
Shudder as DomNodeLists have no indexOf method.
Moving discussion parsing to the server should allow
us to implement much cleaner APIs for commenting.
Bug: T252252
Co-authored-by: Ed Sanders <esanders@wikimedia.org>
Change-Id: Ic1438d516e223db462cb227f6668e856672f538c