Commit graph

138 commits

Author SHA1 Message Date
jenkins-bot 636ca06e7e Merge "Fix parsing links in Parsoid documents without short URLs" 2020-09-17 20:40:40 +00:00
Bartosz Dziewoński 329df8c953 Parsing discussions converted to language variants
* Export parser data (date format, digits, timezone names, and
  messages for weekday/month names) converted to language variants
* Update the parsers to try matching using every variant, in case
  the page is displayed in non-default variant (and to avoid
  problems with incomplete variant conversion)

Bug: T259818
Change-Id: I04d73992cd31ce06fa79f87df0c0a53d7efc3c58
2020-09-16 22:07:07 +00:00
jenkins-bot abfbc3cf95 Merge "ApiDiscussionToolsEdit: Use PARAM_TYPE => 'string' for single-line textfields" 2020-09-16 19:11:41 +00:00
jenkins-bot 55a4e376fe Merge "Documentation fix" 2020-09-14 23:46:08 +00:00
Ed Sanders 92a8ca3469 Documentation fix
Change-Id: Ic37bb713bed8af7390a3e8be7ea0203b4687ce0e
2020-09-15 01:38:54 +02:00
Bartosz Dziewoński 069ea8952d ApiDiscussionToolsEdit: Use PARAM_TYPE => 'string' for single-line textfields
This only affects the interface on Special:ApiSandbox.

Change-Id: I6e3524cf59b7b6044a4efb66d5833636e5c60adc
2020-09-15 00:44:22 +02:00
jenkins-bot c02f54a848 Merge "Edit summary in advanced mode" 2020-09-14 22:29:59 +00:00
Ed Sanders ebaa132ebf Edit summary in advanced mode
Bug: T249391
Change-Id: Ibfd1cf77293d30b8e0d9869bbdc9aae2aa830a06
2020-09-14 22:29:54 +01:00
jenkins-bot 97e8935ce6 Merge "Create preference for turning off reply tool once out of beta" 2020-09-12 19:18:26 +00:00
Ed Sanders 38b5a9c0fe Create preference for turning off reply tool once out of beta
Bug: T259943
Depends-On: I75d38560a649794effbfc07cab1de0cc2e18f57b
Change-Id: Iec472bddd92e821d924a18db1734e8da6188b3dd
2020-09-11 22:16:57 +00:00
Ed Sanders d4f67918b2 Skip over whitespace when looking for trailing comments
Bug: T257651
Change-Id: Icce377f1833b80bd066622d6be3e711a18c58eea
2020-09-11 15:37:09 +01:00
Ed Sanders f96678d562 Factor out availability check
Change-Id: Ibc3e43ea50b7a38cf36f3ee2e479701d4489182e
2020-09-10 15:25:49 +01:00
C. Scott Ananian 965f317015 Use Language::formatNumNoSeparators where appropriate
Avoids using the deprecated $noSeparators parameter to Language::formatNum
in favor of Language::formatNumNoSeparators, which has been around since
MW 1.21.

Change-Id: I012434d5f6c749fec45a6c160e8d5d03686192e9
2020-09-09 18:15:08 -04:00
Ed Sanders 2cc39f2b84 Add topic API
Bug: T259563
Change-Id: I93bb93c272f9b30df81f7f978a3952ca1d027fec
2020-09-09 20:26:41 +00:00
Bartosz Dziewoński 14fb013515 Match handling of "signature scan limit" between JS and PHP
PHP was counting UTF-8 bytes, JS was counting UTF-16 bytes.
Both should have been counting codepoints (although it doesn't
really matter as long as they both count the same things).

I noticed the issue after adding some tests using the Cyrillic
script, when one case had different results in PHP and JS:
Id25b537fecd789640c209ff7f30e777455a3aece.

Change-Id: Ic31240678f71ba48e6ec202126bf490cea12bb66
2020-09-08 03:27:01 +02:00
Bartosz Dziewoński 10899af666 Fix parsing links in Parsoid documents without short URLs
Move the code so that we check for "?title=" query parameter first,
because we don't handle this right in the other code path.

Use parse_url() instead of wfParseUrl() because the latter doesn't
accept relative URLs, and we don't care about the other differences.

Bug: T261711
Depends-On: I4da952876e1c3d1a41d06b51f7e26015ff5e34d7
Change-Id: I70fac2b41befd782b0a47a4f726ae748dc0f775d
2020-09-02 23:42:37 +02:00
Bartosz Dziewoński 2d3fe47ac1 Fix parsing localised digits in PHP discussion parser
The PHP code incorrectly assumed that the digits are single-byte in
UTF-8, which is never the case (except for 0-9).

The JS code worked correctly because it uses UTF-16 strings, so the
bug would only affect non-BMP digits there. This was noted in a TODO
comment, but we overlooked it when reimplementing in PHP.

Instead of a string of 10 characters, use an array of 10
single-character strings.

Bug: T261706
Change-Id: Ic5421382474c88f003424799c53ff473d99cce92
2020-09-01 01:50:33 +02:00
jenkins-bot f167bfc5b1 Merge "Don't trust RESTBase to give us the correct revision" 2020-08-26 20:27:22 +00:00
jenkins-bot bc80b51fcd Merge "Fix typo in API help message key" 2020-08-25 13:28:55 +00:00
Ed Sanders d3205b877f Fix typo in API help message key
Change-Id: Idb4e973f4adb3affa857f99680707e35676810fa
2020-08-25 13:24:31 +01:00
Ed Sanders 17ba57edd6 Don't trust RESTBase to give us the correct revision
As we do in VE, extract the revid from the document.

Unlike in VE we don't need to throw an error if there is
a mis-match, as we will likely be able to make the edit anyway.
Just use the ID we got from the document.

Log a warning if there is ever an ID mis-match so we
can evaluate if this check is actually needed.

Change-Id: I94c37980524a9faabac49495903a5262387af562
2020-08-24 17:24:50 +01:00
Bartosz Dziewoński e36dc8e78a Skip to the end of the paragraph in the parser, not modifier
When a comment ended before the end of a paragraph, the next
comment would begin right there in the middle of the paragraph.
This could result in the detected indentation level of that
comment being incorrect, and replies being inserted in wrong
places, as seen in the 'signatures-funny' test case.

The code moved to the parser was previously repeated twice in
addListItem() and addReplyLink(), which should have been a hint
that something isn't quite right.

Also, fix the code guarding against overlapping signatures,
now that signatures may not be at the end of a comment.

Bug: T260855
Change-Id: Ic26a87642f8a15d5de2f7073d4d8176b299c7f94
2020-08-20 19:35:55 +00:00
Bartosz Dziewoński 84cb9d1dca parser: Code quality tweaks
Do things in a more intuitive order, avoid some repetition,
rename a vaguely named variable.

Change-Id: Ic1a0bb54134682eaf126231e04eb67847d6a5da6
2020-08-20 20:52:42 +02:00
Bartosz Dziewoński b706eac8bc Re-apply new reply API patches (again)
This reverts commit 4d7c98b97c.

Change-Id: I4100521efb687ec324d25e273a9c986fd5dac0d0
2020-08-19 20:05:42 +00:00
Bartosz Dziewoński 4d7c98b97c Revert new reply API (again)
Causes page corruption, in a new way we haven't seen before.

* Revert "Move page updating logic to controller.js"
  This reverts commit 54fdc6de06.
* Revert "ReplyWidget: Move clear methods from #teardown to #clear"
  This reverts commit 9b811a94e0.
* Revert "ApiDiscussionToolsEdit: Do not pass 'basetimestamp'"
  This reverts commit 7de5938a6f.
* Revert "Use DOMCompat::getOuterHTML instead of doc->saveHTML()"
  This reverts commit 7b2448d2f0.
* Revert "CommentController: Remove remains of client-side edit conflict handling"
  This reverts commit 2d038af705.
* Revert "Restore error message for when comment is deleted while replying"
  This reverts commit 655c0526d6.
* Revert "Use transcluded from API to avoid ever fetching Parsoid DOM in client"
  This reverts commit 9d0fc184fe.
* Revert "Create a 'transcludedfrom' API endpoint"
  This reverts commit 5d8f3b9051.
* Revert "Edit API for replies"
  This reverts commit 8829a1a412.

Bug: T259855
Change-Id: I6419408c6194ec0afa6b8ee604b12c1a24c6ac7b
2020-08-13 20:19:29 +02:00
Bartosz Dziewoński 375bfe028e parser: Fix comment ranges when timestamp has entities
Previously, parser would output offsets that don't exist in their
containers, because we were pretending that entities are parts of
their neighboring text nodes.

Turns out it's much easier to do it right when going backwards.

Change-Id: I9bccca2d403f1a976ae517449989170cdd99721e
2020-08-11 20:41:06 +02:00
Bartosz Dziewoński 7cd370615f Reindent CommentParser::findTimestamp()
Something terrible has happened to this function… It seems that I have
brutalized it when rebasing 092cfd6075.

Change-Id: I12d75c69d15645112563a7bc345209b23b54cb3e
2020-08-11 06:45:45 +02:00
jenkins-bot 4d4722a6ab Merge "Fix indentation level when replying to comments with mixed indentation" 2020-08-10 22:27:44 +00:00
jenkins-bot 7a18cc8902 Merge "Always use ':' (<dl><dd>) for indentation of replies" 2020-08-10 22:27:42 +00:00
Bartosz Dziewoński 7de5938a6f ApiDiscussionToolsEdit: Do not pass 'basetimestamp'
Only 'baserevid' should be required. That's what we used before commit
8829a1a412, since switching from
'basetimestamp' in commit 4e135c7f07 in
order to better handle edit conflicts with yourself. That fix seems to
have regressed, so let's try this and see if it helps.

Bug: T252558
Change-Id: Iff5911384f3320b6e7f97a1fa34e82ecd4b44fb3
2020-08-10 20:29:55 +02:00
jenkins-bot 2727abd36a Merge "ThreadItem: Fix "Notice: Undefined index: href"" 2020-08-07 20:25:33 +00:00
Ed Sanders 7b2448d2f0 Use DOMCompat::getOuterHTML instead of doc->saveHTML()
The latter results in lots of extra HTML entity encoding.

The former is built by the Parsing team and appears to result
in no unexpected changes elsewhere in the document.

As Parsoid's selser relies on HTML fragments being byte-for-byte
equal, these changes were resulting in wikitext normalisations
in untouched parts of the document ("dirty diffs").

Bug: T259855
Change-Id: Ib3cb605911e690ec3e8c2f9df25fd1a2e2849d7e
2020-08-07 21:31:38 +02:00
Esanders c26ca107db Re-apply new reply API patches
This reverts commit 96953647c3.

* Re-apply "Edit API for replies"
  This applies commit 8829a1a412.
* Re-apply "Create a 'transcludedfrom' API endpoint"
  This applies commit 5d8f3b9051.
* Re-apply "Use transcluded from API to avoid ever fetching Parsoid DOM in client"
  This applies commit 9d0fc184fe.
* Re-apply "Restore error message for when comment is deleted while replying"
  This applies commit 655c0526d6.

Change-Id: Id20d21899f87464636022aa0683f8c03e0060117
2020-08-07 21:31:38 +02:00
Bartosz Dziewoński 96953647c3 Revert new reply API
Causes page corruption.

* Revert "Restore error message for when comment is deleted while replying"
  This reverts commit 655c0526d6.
* Revert "Use transcluded from API to avoid ever fetching Parsoid DOM in client"
  This reverts commit 9d0fc184fe.
* Revert "Create a 'transcludedfrom' API endpoint"
  This reverts commit 5d8f3b9051.
* Revert "Edit API for replies"
  This reverts commit 8829a1a412.

Bug: T259855
Change-Id: I98036f14dd900b51f20e98696e31b9b618eceee1
2020-08-07 18:18:21 +02:00
Bartosz Dziewoński e7115b8b27 ThreadItem: Fix "Notice: Undefined index: href"
Bug: T259856
Change-Id: I10b16b7d79dc88d7259b64c7d5d7e4cc6860d0fc
2020-08-07 17:41:41 +02:00
Bartosz Dziewoński 31b26a5bec Fix indentation level when replying to comments with mixed indentation
When adding a reply, we take a node at the end of the previous comment,
compare that comment's indentation level to the expected indentation level
of the reply, and add (or remove) that number of wrapper lists.

The existing code did not consider that comments may have lists within
them, and so the indentation of that node may not match the indentation
of the comment.

Bug: T252702
Change-Id: Icc5ff19783d2b213bff99f283cb0599a8b5c1ab4
2020-08-06 01:25:33 +02:00
Bartosz Dziewoński a4ffdd37de Always use ':' (<dl><dd>) for indentation of replies
Previously we preferred that, but used '*' (<ul><li>) when the parent
comment or the previous reply also used it.

Bug: T252708
Change-Id: I3abf606da6693905764f1be745fad999fdf57fbe
2020-08-04 23:37:00 +02:00
Ed Sanders 5d8f3b9051 Create a 'transcludedfrom' API endpoint
Bug: T252558
Change-Id: I2220f26dc2b744f03feeee9124f6936858dda3eb
2020-07-27 21:20:01 +01:00
Ed Sanders 8829a1a412 Edit API for replies
Bug: T252558
Change-Id: Iac43b5bb0315ceaad7698fb2b708b7c3cde403f8
2020-07-27 21:19:58 +01:00
Bartosz Dziewoński 31e371e944 Better handle HTML comments following replies
Bug: T257651
Change-Id: I07e995beca4f031be062958ff7d75727afa8e606
2020-07-23 18:18:21 +02:00
jenkins-bot 765a1d27bc Merge "Improve detecting template-generated multi-line comments" 2020-07-22 15:04:00 +00:00
jenkins-bot 889de1bcdf Merge "Improve detecting typed signatures" 2020-07-22 01:43:40 +00:00
Bartosz Dziewoński 80e52e1155 Improve detecting typed signatures
* Remove the existing approach for detecting signatures that only
  worked in source mode; remove autoSignWikitext()

* Use the same approach for auto-signing in source mode as we have
  already used in visual

* In both modes, detect whether the user has already typed a signature
  at the end of their comment in the modifier, and if so, don't add a
  signature

* Add test cases for the detection

Bug: T255738
Change-Id: I791d3035cb1ffc33ce3966d4617a25d08700c35b
2020-07-22 00:00:53 +02:00
Bartosz Dziewoński 569db3603c Improve detecting template-generated multi-line comments
Bug: T252058
Change-Id: Ic010b8aeff9b177031184f02f92fcdea5280dc36
2020-07-21 22:26:45 +01:00
Ed Sanders a2431fe006 Refactor CommentParser
* Pass rootNode to the constructor
* Rename getters to match CommentItem/HeadingItem/ThreadItem
  value classes.
* Always build the thread tree so CommentItem's always have
  and ID and replies/parent.

Change-Id: I508be9534de59016ff806e3d84edcbb1c76cb0c6
2020-07-20 23:38:10 +01:00
Ed Sanders a4636d39fc Move #getTranscludedFrom from parser to ThreadItem
Also requires moving getTitleFromUrl to CommentUtils

Change-Id: I9cb83a3fdd456eba66899433b866ce7a7f00eeb5
2020-07-20 15:56:48 +01:00
Ed Sanders 7ae5bbf384 Move #getAuthors from parser to ThreadItem
Change-Id: I16e513000e5366b3044b17a99da07d8d0f47a61f
2020-07-20 15:13:59 +01:00
Ed Sanders b32f991913 Documentation fixes
Change-Id: I2c7ccecbf8a50bd4d658b0f17f4a21fe90a3c399
2020-07-20 13:34:08 +01:00
Ed Sanders 092cfd6075 Parser: Replace findTimestamps with findTimestamp
Instead of doing a separate tree walk and finding all timestamps
separately, make it part of the getComments tree walk, and find
timestamps one at a time.

Change-Id: I47f466eaf228504faa189fd99e07493bc7f022cd
2020-07-15 21:34:22 +02:00
Bartosz Dziewoński 308c2747b0 CommentParser.php: Use tree walking instead of XPath
This is similar to what the JS version does.

The TreeWalker and NodeFilter classes are adapted from
https://github.com/Krinkle/dom-TreeWalker-polyfill
(MIT license).

This makes #getComments twice as fast on en-big-oldparser.html

Change-Id: I2441f33e6e7bad753ac830d277e6a2e81ee8c93d
2020-07-15 16:40:50 +00:00