Commit graph

163 commits

Author SHA1 Message Date
Bartosz Dziewoński ba8434e2e0 Add legacy IDs as of wmf/1.36.0-wmf.14
Bug: T264478
Change-Id: I099e1068fedc25d671cd1245ac8b32941dca7232
2020-10-22 22:52:59 +02:00
Bartosz Dziewoński 0ddc171c8a Add oldest timestamp in the thread to heading IDs
To avoid old threads re-appearing on popular pages when someone
uses a vague title (e.g. dozens of threads titled "question" on
[[Wikipedia:Help desk]]: https://w.wiki/fbN), include the oldest
timestamp in the thread (i.e. date the thread was started) in the
heading ID.

Bug: T264478
Change-Id: If918bfd5e025248923d1939bc86916697ead95a0
2020-10-22 02:19:21 +02:00
Bartosz Dziewoński b09bbfe668 Disambiguate comments by parent ID, rather than sequential numbers
Sequential numbers aren't great because they change when an earlier
comment is archived. Parent comment/heading IDs should change less
often.

This also makes much more sense for disambiguating subsections,
e.g. a dozen identical ===Votes=== sections for a dozen proposals.

Bug: T264478
Change-Id: I466454984fd919ebef35f2b37ddb5d86dc842996
2020-10-22 02:19:21 +02:00
Bartosz Dziewoński 3137d76f40 Connect sub-threads to their parent threads
Our threads now also contain all replies to their sub-threads.
This is similar to how sections work in MediaWiki, where the parent
section also contains the content of all the lower-level sections.
We're going to need this for notifications about replies in a thread.

Bug: T264478
Change-Id: I241fc58e2088a7555942824b0f184ed21e3a8b6f
2020-10-22 02:05:02 +02:00
Bartosz Dziewoński 9ee0fd69f5 Allow headings to have IDs
Previously, only comments could have IDs, because we only needed IDs
for replying. But we might also use them for notifications soon.

Bug: T264478
Change-Id: I1bcad02bf17ab54bc5028a959543c10f0430836b
2020-10-22 02:04:28 +02:00
Bartosz Dziewoński 6719d17364 Handle cached "legacy" IDs (and other JSON-serialized data)
The output of CommentFormatter::addReplyLinks() and consequently
ThreadItem::jsonSerialize() can end up in the HTTP cache (Varnish) on
Wikimedia wikis. We need to consider that when changing that code.

Introduce a concept of legacy ID (generated by the older algorithm
after it changes), add some placeholder code that will generate them
in the future, and update some code to find comments by either normal
or legacy IDs.

Add dire comments in a bunch of places (as if that ever helps).

Bug: T264478
Change-Id: I4368f366800ab21b8b184b09378037614fdecd33
2020-10-22 00:53:06 +02:00
Bartosz Dziewoński 3b8d63467e CommentParser: Remove confused comments about references and objects
"This modifies the original objects…" – I feel like this is obvious
now, but maybe it wasn't so obvious when this code was structured
differently before a2431fe006. Also,
it refers to a variable that doesn't exist.

"FIXME this will clone the reply…" – No, actually, it will not.
It would if replies were associative arrays, but they are objects,
and have always been, ever since the PHP parser was merged in
7b7a2cd69c. Maybe they were arrays
once in Roan's mind before he pushed that for review.

Change-Id: I1348e111699fdbde99cd1f9ef45d8f465f7391b0
2020-10-21 21:01:27 +02:00
Bartosz Dziewoński 1ad6389292 ImmutableRange: Optimize parent check in computePosition()
We can check whether a node is a child of another node directly,
without iterating over all its children.

Change-Id: I3a26df89365bf765348d96b477c983ec9c4e43fe
2020-10-20 11:14:00 +02:00
Ed Sanders d0bcec6196 Enable DT server side via a cookie to preserve user enable hack
Bug: T265499
Change-Id: Ied330c633732651d1c4e136646afd676ceb570c7
2020-10-19 21:42:58 +01:00
Bartosz Dziewoński 88b5be11fd CommentParser: Avoid unnecessary reference in foreach()
This is not necessary, and never has been. This variable contains an
object and it's never assigned to.

Instead, the reference creates hard-to-debug bugs (I've just spent
an hour debugging one). When the variable name is reused later in
the same function as the loop variable of another foreach() loop
(such as in If918bfd5e0), the result is overwriting of the last entry
in $this->threadItems with the last entry from the other array.
I was questioning everything I know about variables until I noticed.

Change-Id: Ibb57f915b39dd4d6d2e744903f9ecadd67b1f52d
2020-10-15 17:58:06 +02:00
jenkins-bot a2cf9cc978 Merge "Correctly generate timezone abbreviations for parsing" 2020-10-15 15:24:14 +00:00
Bartosz Dziewoński a1dc3a4896 Correctly generate timezone abbreviations for parsing
Also, add tests covering this and the previous bug fixes in this code
(T259818, T261706).

Note that the test data added in tests/cases/ doesn't exactly match
the entire configuration of the wiki, only the parts we want to cover.
This is unlike the data in tests/data/, which was literally copied
from the relevant wikis, and which is used as input for other tests.

Bug: T265500
Change-Id: I29a59a5952f6dc9fb5910434bb6bcc9dcdaa01a9
2020-10-15 12:11:25 +00:00
Bartosz Dziewoński fe520ab175 ImmutableRange: Remove unused variable
The only use was removed in code review, but we missed this.

https://gerrit.wikimedia.org/r/c/mediawiki/extensions/DiscussionTools/+/596813/70..71/includes/ImmutableRange.php

Change-Id: Ia761a6a350711c32f0bd4c8af48c7c1a35a4afb4
2020-10-15 11:23:43 +00:00
Ed Sanders 64392af485 Add reply links on the server
Bug: T252555
Change-Id: I4e60fdbc098c1a74757d6e60fec6bcf8e5db37c1
2020-10-08 13:27:08 +01:00
Bartosz Dziewoński ed17f640b6 Ignore other empty-ish things at the beginning of comments
Follow-up to 432a959436.

Bug: T264116
Change-Id: I0685cafab70c7e9d22f504f1a1309c9a28d6f2e1
2020-09-30 23:42:47 +02:00
jenkins-bot 91e87eb09a Merge "Fix detecting username from the wrong links sometimes" 2020-09-30 18:18:02 +00:00
jenkins-bot f7c7ba3c44 Merge "Ignore empty paragraphs at the beginning of comments" 2020-09-30 18:16:36 +00:00
Bartosz Dziewoński 17b7a481a2 Fix detecting username from the wrong links sometimes
When a timestamp directly followed a `<div>…</div>` tag (or perhaps
some other wrapper containing lots of content), we would detect the
username from the earliest links in the wrapper (furthest from the
timestamp), rather than the latest links (closest to the timestamp).

Bug: T262573
Change-Id: Id16449a86a731b13dc79846bb30ecf6554e26f1d
2020-09-29 22:31:24 +02:00
Bartosz Dziewoński 432a959436 Ignore empty paragraphs at the beginning of comments
The wikitext parser outputs `<p><br></p>` for empty paragraphs, so we
need to ignore `<br>` tags when searching for an "interesting" node
that marks the beginning of a comment. Otherwise the empty paragraphs
mess up the detection of indentation levels.

Bug: T264116
Change-Id: I84a97ab577baa7336b78935ccdc48041ecfc231a
2020-09-29 22:22:35 +02:00
Ed Sanders 6b8312e610 Ignore HTML comments which are more than two lines from a reply
Bug: T264026
Change-Id: I989132d7599a7fa156dba46d87a9ed4b76322c0c
2020-09-29 11:30:03 +01:00
Ed Sanders 2a03b6e420 Add JSON serialize methods
Change-Id: Iaa24c39842cbe76370aaeba01eab2c991d290b8c
2020-09-23 20:35:57 +01:00
jenkins-bot b1804fbb8b Merge "Add content getters to Thread/Comment items" 2020-09-21 22:35:24 +00:00
jenkins-bot 101dda4200 Merge "Implement cloneContents on ImmutableRange" 2020-09-21 22:35:23 +00:00
jenkins-bot 636ca06e7e Merge "Fix parsing links in Parsoid documents without short URLs" 2020-09-17 20:40:40 +00:00
Bartosz Dziewoński 329df8c953 Parsing discussions converted to language variants
* Export parser data (date format, digits, timezone names, and
  messages for weekday/month names) converted to language variants
* Update the parsers to try matching using every variant, in case
  the page is displayed in non-default variant (and to avoid
  problems with incomplete variant conversion)

Bug: T259818
Change-Id: I04d73992cd31ce06fa79f87df0c0a53d7efc3c58
2020-09-16 22:07:07 +00:00
jenkins-bot abfbc3cf95 Merge "ApiDiscussionToolsEdit: Use PARAM_TYPE => 'string' for single-line textfields" 2020-09-16 19:11:41 +00:00
jenkins-bot 55a4e376fe Merge "Documentation fix" 2020-09-14 23:46:08 +00:00
Ed Sanders 92a8ca3469 Documentation fix
Change-Id: Ic37bb713bed8af7390a3e8be7ea0203b4687ce0e
2020-09-15 01:38:54 +02:00
Bartosz Dziewoński 069ea8952d ApiDiscussionToolsEdit: Use PARAM_TYPE => 'string' for single-line textfields
This only affects the interface on Special:ApiSandbox.

Change-Id: I6e3524cf59b7b6044a4efb66d5833636e5c60adc
2020-09-15 00:44:22 +02:00
jenkins-bot c02f54a848 Merge "Edit summary in advanced mode" 2020-09-14 22:29:59 +00:00
Ed Sanders ebaa132ebf Edit summary in advanced mode
Bug: T249391
Change-Id: Ibfd1cf77293d30b8e0d9869bbdc9aae2aa830a06
2020-09-14 22:29:54 +01:00
jenkins-bot 97e8935ce6 Merge "Create preference for turning off reply tool once out of beta" 2020-09-12 19:18:26 +00:00
Ed Sanders 5ced26b718 Add content getters to Thread/Comment items
Change-Id: I29cd36eb63bfa00e333efb3c6d7d7e1cdc58cf65
2020-09-12 13:19:36 +01:00
Ed Sanders e329027bf3 Implement cloneContents on ImmutableRange
Change-Id: Ife6bdbe8d7af2da6662814459f0e8726482034f4
2020-09-12 13:19:35 +01:00
Ed Sanders 38b5a9c0fe Create preference for turning off reply tool once out of beta
Bug: T259943
Depends-On: I75d38560a649794effbfc07cab1de0cc2e18f57b
Change-Id: Iec472bddd92e821d924a18db1734e8da6188b3dd
2020-09-11 22:16:57 +00:00
Ed Sanders d4f67918b2 Skip over whitespace when looking for trailing comments
Bug: T257651
Change-Id: Icce377f1833b80bd066622d6be3e711a18c58eea
2020-09-11 15:37:09 +01:00
Ed Sanders f96678d562 Factor out availability check
Change-Id: Ibc3e43ea50b7a38cf36f3ee2e479701d4489182e
2020-09-10 15:25:49 +01:00
C. Scott Ananian 965f317015 Use Language::formatNumNoSeparators where appropriate
Avoids using the deprecated $noSeparators parameter to Language::formatNum
in favor of Language::formatNumNoSeparators, which has been around since
MW 1.21.

Change-Id: I012434d5f6c749fec45a6c160e8d5d03686192e9
2020-09-09 18:15:08 -04:00
Ed Sanders 2cc39f2b84 Add topic API
Bug: T259563
Change-Id: I93bb93c272f9b30df81f7f978a3952ca1d027fec
2020-09-09 20:26:41 +00:00
Bartosz Dziewoński 14fb013515 Match handling of "signature scan limit" between JS and PHP
PHP was counting UTF-8 bytes, JS was counting UTF-16 bytes.
Both should have been counting codepoints (although it doesn't
really matter as long as they both count the same things).

I noticed the issue after adding some tests using the Cyrillic
script, when one case had different results in PHP and JS:
Id25b537fecd789640c209ff7f30e777455a3aece.

Change-Id: Ic31240678f71ba48e6ec202126bf490cea12bb66
2020-09-08 03:27:01 +02:00
Bartosz Dziewoński 10899af666 Fix parsing links in Parsoid documents without short URLs
Move the code so that we check for "?title=" query parameter first,
because we don't handle this right in the other code path.

Use parse_url() instead of wfParseUrl() because the latter doesn't
accept relative URLs, and we don't care about the other differences.

Bug: T261711
Depends-On: I4da952876e1c3d1a41d06b51f7e26015ff5e34d7
Change-Id: I70fac2b41befd782b0a47a4f726ae748dc0f775d
2020-09-02 23:42:37 +02:00
Bartosz Dziewoński 2d3fe47ac1 Fix parsing localised digits in PHP discussion parser
The PHP code incorrectly assumed that the digits are single-byte in
UTF-8, which is never the case (except for 0-9).

The JS code worked correctly because it uses UTF-16 strings, so the
bug would only affect non-BMP digits there. This was noted in a TODO
comment, but we overlooked it when reimplementing in PHP.

Instead of a string of 10 characters, use an array of 10
single-character strings.

Bug: T261706
Change-Id: Ic5421382474c88f003424799c53ff473d99cce92
2020-09-01 01:50:33 +02:00
jenkins-bot f167bfc5b1 Merge "Don't trust RESTBase to give us the correct revision" 2020-08-26 20:27:22 +00:00
jenkins-bot bc80b51fcd Merge "Fix typo in API help message key" 2020-08-25 13:28:55 +00:00
Ed Sanders d3205b877f Fix typo in API help message key
Change-Id: Idb4e973f4adb3affa857f99680707e35676810fa
2020-08-25 13:24:31 +01:00
Ed Sanders 17ba57edd6 Don't trust RESTBase to give us the correct revision
As we do in VE, extract the revid from the document.

Unlike in VE we don't need to throw an error if there is
a mis-match, as we will likely be able to make the edit anyway.
Just use the ID we got from the document.

Log a warning if there is ever an ID mis-match so we
can evaluate if this check is actually needed.

Change-Id: I94c37980524a9faabac49495903a5262387af562
2020-08-24 17:24:50 +01:00
Bartosz Dziewoński e36dc8e78a Skip to the end of the paragraph in the parser, not modifier
When a comment ended before the end of a paragraph, the next
comment would begin right there in the middle of the paragraph.
This could result in the detected indentation level of that
comment being incorrect, and replies being inserted in wrong
places, as seen in the 'signatures-funny' test case.

The code moved to the parser was previously repeated twice in
addListItem() and addReplyLink(), which should have been a hint
that something isn't quite right.

Also, fix the code guarding against overlapping signatures,
now that signatures may not be at the end of a comment.

Bug: T260855
Change-Id: Ic26a87642f8a15d5de2f7073d4d8176b299c7f94
2020-08-20 19:35:55 +00:00
Bartosz Dziewoński 84cb9d1dca parser: Code quality tweaks
Do things in a more intuitive order, avoid some repetition,
rename a vaguely named variable.

Change-Id: Ic1a0bb54134682eaf126231e04eb67847d6a5da6
2020-08-20 20:52:42 +02:00
Bartosz Dziewoński b706eac8bc Re-apply new reply API patches (again)
This reverts commit 4d7c98b97c.

Change-Id: I4100521efb687ec324d25e273a9c986fd5dac0d0
2020-08-19 20:05:42 +00:00
Bartosz Dziewoński 4d7c98b97c Revert new reply API (again)
Causes page corruption, in a new way we haven't seen before.

* Revert "Move page updating logic to controller.js"
  This reverts commit 54fdc6de06.
* Revert "ReplyWidget: Move clear methods from #teardown to #clear"
  This reverts commit 9b811a94e0.
* Revert "ApiDiscussionToolsEdit: Do not pass 'basetimestamp'"
  This reverts commit 7de5938a6f.
* Revert "Use DOMCompat::getOuterHTML instead of doc->saveHTML()"
  This reverts commit 7b2448d2f0.
* Revert "CommentController: Remove remains of client-side edit conflict handling"
  This reverts commit 2d038af705.
* Revert "Restore error message for when comment is deleted while replying"
  This reverts commit 655c0526d6.
* Revert "Use transcluded from API to avoid ever fetching Parsoid DOM in client"
  This reverts commit 9d0fc184fe.
* Revert "Create a 'transcludedfrom' API endpoint"
  This reverts commit 5d8f3b9051.
* Revert "Edit API for replies"
  This reverts commit 8829a1a412.

Bug: T259855
Change-Id: I6419408c6194ec0afa6b8ee604b12c1a24c6ac7b
2020-08-13 20:19:29 +02:00