wikimedia/mediawiki-extensions-DiscussionTools - fanwikis.org Git Server

mirror of https://gerrit.wikimedia.org/r/mediawiki/extensions/DiscussionTools synced 2024-12-01 03:26:28 +00:00

Author	SHA1	Message	Date
jenkins-bot	048d5364e2	Merge "Replace preg_replace_callback with strtr in CommentParser"	2023-10-31 13:35:19 +00:00
thiemowmde	10dcd1f847	Replace preg_replace_callback with strtr in CommentParser It does the same as before. I think performance is not a concern here, and wasn't my motivation either. But I hope this makes the code easier to read and to reason about. I added a pure unit test case (without involving an actual Language object) to cover the previously uncovered digits feature. Change-Id: I6a0fc86035817eabb42b55e58183ae094c052aa6	2023-10-31 08:55:40 +01:00
thiemowmde	1491b47b12	Improve performance of CommentParser::getUsernameFromLink I was curious why running the CommentParserTest takes so long. I found this is one of the bottlenecks because it's called so often, but many link titles that are parsed as user names turn out to be something else. This little hack speeds up the test by 15% and probably has a similar impact in production scenarios. Change-Id: I5a0b3a49ba5793c8a345baaa7118fed500c082b6	2023-10-30 17:59:46 +01:00
Theodore Dubois	4ca17b8c33	Support ISO 8601 timestamps in the parser https://wikipesija.org is currently using ISO 8601 as the default date format. The format is xnY-xnm-xnd"T"xnH:xni:xns and 'xn', 'm', and 's' need support added. Change-Id: I235098a578eb92ddd23ea47fa23d60df4b28f590	2023-06-17 11:36:43 -07:00
Ed Sanders	92f5cfd821	Support suppressing comment detection in pages or sections This can be done within sections using CSS: * mw-notalk Or at a page level using a magic word: * __NOTALK__ "notalk" suppresses all comment detection, treating the content as not containing any comments even if there are signatures present. Bug: T295553 Bug: T249293 Change-Id: Ic1d7294bafcf7071e16838e70684ecadd7bc6fd3	2023-04-03 18:36:34 +02:00
Ed Sanders	2fcc505d50	Parser: Store timestamp ranges Change-Id: Ifcbe22011f11f4374f38b7aa346da5a96cac968c	2023-03-28 23:51:17 +00:00
Ed Sanders	b82af45735	CommentParser: Output display name if different to username The only normalisation we apply for comparison is lowercasing. Change-Id: Id3d57c2066429fcedc7dcc091e74ed46e17060f1	2023-02-23 23:03:32 +00:00
Bartosz Dziewoński	3a9997d6ea	Improve handling for comment separators * Detect comment separators at the end of comments too * Consider TemplateStyles associated with ignored templates This unexpectedly improves a lot of cases other than T313097 too, mostly where <br> or {{outdent}} was used within a paragraph: splitting comments that were previously jumbled together, or restoring content that was previously ignored for apps / notifications. Bug: T313097 Change-Id: I9b2ef6b760f2ffd97141ad7000f70919aeab7803	2023-01-10 01:59:52 +00:00
Ed Sanders	e24550fae9	Refactor thread summary getters Replace getThreadSummary with individual getters that call calculateThreadSummary once. Change-Id: Ie8a8b4d7cb5121847b78dbc20bca2c8d48c7d857	2022-09-06 23:19:13 +02:00
Ed Sanders	664d5d041a	Fix fetching of oldest comment in a thread The implementation in Parser doesn't descend into sub-threads. Re-use the getThreadSummary method in ThreadItem and traverse the thread properly. Bug: T298617 Change-Id: I318d9012eb83f37ccbe463923524ef2e9f995ced	2022-09-01 21:22:09 +00:00
Ed Sanders	0ad9b4c6b2	Move placeholder heading level (99) to a constant Change the HeadingItem constructor to take a 'null' headingLevel and store this internally with the constant. Change the JSON serializer to convert this back to null. Change-Id: I27508eed75d94b99c5189548919309f8da7deb75	2022-06-14 22:51:49 +01:00
Bartosz Dziewoński	6a59149132	Ignore LRM and RLM in more places in the timestamp We previously ignored them before timezone indicator (`e9c401e3aa`), but they can end up in other places too, e.g. after the time. Now we ignore them after every token. This is way overkill, but it shouldn't hurt. Bug: T308448 Change-Id: I20f7aaa34dba23f2a2faf1be258c1aea32ab770f	2022-05-17 02:00:22 +02:00
Ed Sanders	579b8bb1d4	Implement getTimestampString on CommentItem Change-Id: I1768e9993debe904d6a228942ad0188486d65c0b	2022-03-24 16:49:35 +00:00
Bartosz Dziewoński	c7723baf72	CommentParser: Replace uses of Title with TitleValue Another small step towards removing the reliance on global state. Change-Id: Ifb4a5bcbef6606d02f1c7aa7385d72822cb0bad0	2022-03-18 18:24:34 +00:00
jenkins-bot	32d9ef573a	Merge "CommentParser: Avoid using a dynamic undeclared property"	2022-03-10 00:22:16 +00:00
jenkins-bot	76478dda26	Merge "Move signatureScanLimit to a constant in JS"	2022-03-10 00:22:14 +00:00
Bartosz Dziewoński	4c29304484	CommentParser: Avoid using a dynamic undeclared property Change-Id: Iefa8dea83bc0d31b9c6b3509189eeaa652dd9ea0	2022-03-08 23:30:11 +00:00
Bartosz Dziewoński	eb1fe7a8fb	CommentParser: Fix redundant uses of getHeadlineNodeAndOffset() We call CommentUtils::getHeadlineNodeAndOffset() before constructing the HeadingItem in CommentParser, so the range's startContainer is always the headline node. Change-Id: I2afb6ba9100e785cd91f31d82f4cea59fa8b5443	2022-03-08 23:29:34 +00:00
Bartosz Dziewoński	8a2715bdd5	Move signatureScanLimit to a constant in JS Change-Id: Ieb60c148fd060ab62e4a493e2d0dff6c051f945c	2022-02-21 22:42:14 +01:00
Bartosz Dziewoński	4244418e56	Don't detect comments within references Bug: T301213 Change-Id: Ifd5198651c8ed0ce53379fb5e35938089cd54a09	2022-02-21 19:57:44 +00:00
Bartosz Dziewoński	8e44b43df0	Split off ThreadItemSet from CommentParser Goal: ----- Finishing the work from Iadb7757debe000025e52770ca51ebcf24ca8ee66 by changing CommentParser::parse() to return a data object, instead of the whole parser. Changes: -------- ThreadItemSet.php: ThreadItemSet.js: * New data class to access the results of parsing a discussion. Most methods and properties are moved from CommentParser with no changes. CommentParser.php: Parser.js: * parse() returns a new ThreadItemSet. * Remove methods moved to ThreadItemSet. * Placeholder headings are generated slightly differently, as we process things in a different order. * Grouping threads and computing IDs/names is no longer lazy. We always needed IDs/names anyway. * computeId() explicitly uses a ThreadItemSet to check the existing IDs when de-duplicating. controller.js: * Move the code for turning some nodes annotated by CommentFormatter into a ThreadItemSet (previously a Parser) from controller#init to ThreadItemSet.static.newFromAnnotatedNodes, and rewrite it to handle assigning parents/replies and recalculating legacy IDs more nicely. * mw.dt.pageThreads is now a ThreadItemSet. Change-Id: I49bfe019aa460651447fd383f73eafa9d7180a92	2022-02-21 16:22:32 +00:00
Bartosz Dziewoński	4613ae78e7	Change CommentParser into a service Goal: ----- To have a method like CommentParser::parse(), which just takes a node to parse and a title and returns plain data, so that we don't need to keep track of the config to construct a CommentParser object (the required config like content language is provided by services) and we don't need to keep that object around after parsing. Changes: -------- CommentParser.php: * …is now a service. Constructor only takes services as arguments. The node and title are passed to a new parse() method. * parse() should return plain data, but I split this part to a separate patch for ease of review: I49bfe019aa460651447fd383f73eafa9d7180a92. * CommentParser still cheats and accesses global state in a few places, e.g. calling Title::makeTitleSafe or CommentUtils::getTitleFromUrl, so we can't turn its tests into true unit tests. This work is left for future commits. LanguageData.php: * …is now a service, instead of a static class. Parser.js: * …is not a real service, but it's changed to behave in a similar way. Constructor takes only the required config as argument, and node and title are instead passed to a new parse() method. CommentParserTest.php: parser.test.js: * Can be simplified, now that we don't need a useless node and title to test internal methods that don't use them. testUtils.js: * Can be simplified, now that we don't need to override internal ResourceLoader stuff just to change the parser config. Change-Id: Iadb7757debe000025e52770ca51ebcf24ca8ee66	2022-02-19 19:51:57 +01:00
Bartosz Dziewoński	99b5de8038	Split Data class into ResourceLoaderData and LanguageData The Data class contained utilities for two unrelated purposes. Split each half to a separate class. Notably, this improves the signature of the getLocalData() function. Change-Id: Icde615fb9d483fee1f352c34909b37f8ffde8081	2022-02-19 19:37:34 +01:00
Bartosz Dziewoński	ae9f26a9e5	Various code quality tweaks (suggested by PhpStorm) composer.json: * Document required PHP extensions Parser.js: * Remove incorrect param documentation * Fix some typos in comments (missing parentheses) CommentParser.php: * Fix some typos in comments (missing parentheses) ImmutableRange.php: * Remove unused property * Add a `throw` to indicate that code path is unreachable SubscribedNewCommentPresentationModel.php: * Add missing `return false` CommentParserTest.php: * Remove unnecessary pass-by-reference CommentModifierTest.php: * Remove unused variable CommentParserTest.php: * Don't construct Element objects directly. PHP's DOMElement allows it, but Parsoid/Dodo's doesn't, and we use the latter for static analysis. This generates all kinds of confusing warnings. Change-Id: Ia9598ebea0e99830dd485296e94a9d96acc4b258	2022-02-19 19:36:52 +01:00
Bartosz Dziewoński	13ab1db6da	Don't count leading/trailing whitespace against signature scan limit It's an arbitrary limit, it seems harmless to relax it to support the use case in the task, even if it's weird. Bug: T300949 Change-Id: I7c895c7019726758bbae3183b9c3ecbd9eabcf38	2022-02-04 19:35:29 +00:00
Ed Sanders	0b42aea276	CommentParser: Cache variables in getUsernameFromLink Change-Id: I625e6ded3badd75a7a658c8d000576d0d165a18b	2022-02-04 19:35:18 +00:00
Ed Sanders	8ad1df7dc8	CommentParser: Name parts of return value from findSignature Change-Id: I3a5ad36df0afdedc0aa9a15e5d83c5426b03b790	2022-02-04 19:34:18 +00:00
Ed Sanders	f80ff74fc6	Handle selflinks by returning the current page's title Bug: T287818 Change-Id: I67f10ac9976581279d1e6a477e90d55875ebab20	2022-01-12 21:18:04 +00:00
Ed Sanders	34011b7a07	Parser: Pass in title of page being parsed Will be used to parse selflinks in the future. Change-Id: I2bc29d1c5c69cb6309f582f162f9af7d96ce8913	2022-01-12 21:17:59 +00:00
Ed Sanders	2e1241289c	Better document {Object} types Change-Id: Ibfaf2ded443301c68552dbf98a1897a50bda9ef5	2021-12-20 17:25:54 +00:00
Bartosz Dziewoński	ef7274d69e	Move some helpers from CommentParser to CommentUtils Change-Id: I0e323d3b75f47459a5548a13e9684f4c6ff4ba0c	2021-12-13 17:13:41 +01:00
Ed Sanders	7c3e583bec	build: Update eslint-config-wikimedia to 0.21.0 Change-Id: I72de463d5a878e555eeed0e7ce2772e1d3a46f06	2021-11-08 19:03:40 +00:00
Ed Sanders	f4c12e120a	Define documentable types in eslintrc instead of inline These types can be passed as parameters to any file without creating a dependency, so it makes more sense to allow them globally. Change-Id: I5504465fd997b46547642e7046993b370b85586e	2021-10-17 14:38:39 +01:00
Bartosz Dziewoński	f35bf487ef	Take over extra links to add a new topic added by gadgets/templates * Move getTitleFromUrl() from parser to utils. It's a generic method, the PHP equivalent is already in utils. Bug: T277371 Change-Id: Id960e5f60af02bdeb0a3a68f43b7a695eb035139	2021-06-30 18:06:39 +02:00
Ed Sanders	1893405635	Code style: Move var declarations inline Change-Id: I1686603388b050ba4ec22eff23e4806cdf262b87	2021-04-22 17:43:46 +00:00
Bartosz Dziewoński	42ce942c86	Introduce comment "names" to identify comments across revisions/pages The existing comment IDs can't be used to find the same comment on a different revision or page (when it's transcluded), because they depend on the comment's parent and its position on the page. Comment names depend only on the author and timestamp. The trade-off is that they can't distinguish comments posted within the same minute, or in the same edit, so we will still need the IDs sometimes. Prefer using comment names when replying, if they're not ambiguous. This fixes T273413 and T275821. Heading names depend on the author and timestamp of the oldest comment. This way we don't have to detect changes to the heading text, but we can't distinguish headings without any comments. Bug: T274685 Bug: T273413 Bug: T275821 Change-Id: Id85c50ba38d1e532cec106708c077b908a3fcd49	2021-03-23 16:08:42 +00:00
Ed Sanders	4a0802065c	Make IDs (to be used as URL hashes) wikitext safe * Use hyphens instead of pipes as separators * Use underscores for spaces in usernames Change-Id: I6efd9739fc73e45002e50e64c43ce0de1c2f1239	2021-03-18 20:45:21 +01:00
Bartosz Dziewoński	f5059e6ea6	Don't detect comments within 'cite' elements too Follow-up to `024a978ffd`. Bug: T275881 Change-Id: I53448ad22cd0531e7fd4aa0ea5d15782879cce14	2021-03-01 21:40:43 +01:00
jenkins-bot	0eb37a87df	Merge "Don't detect comments within quotes"	2021-02-28 22:56:20 +00:00
Bartosz Dziewoński	024a978ffd	Don't detect comments within quotes Bug: T275881 Change-Id: I8f7a4279837bd95ebf5b604ff350c0a3f29c2c05	2021-02-28 22:49:48 +00:00
Bartosz Dziewoński	efe95494a8	Improve signature detection to handle formatting on the timestamp Now it detects signatures generated by en.wp's {{Undated}} template, and signatures of people who do weird stuff to the timestamps. Bug: T275938 Change-Id: I27b07f6786ca5433a3c02a5fe68e4716d41401bb	2021-02-27 02:33:30 +01:00
Bartosz Dziewoński	af082908a5	Improve merging multiple comments on one paragraph The horrendous 11-line if() condition did not correctly handle signatures wrapped in inline formatting markup, like <small>. Instead, implement this logic in the code for skipping to the end of a paragraph, which didn't exist yet when that condition was added, but seems like a much better place to check this now. Bug: T275934 Change-Id: I5cccff889b5e15b5f8fde0538bf4bccb22e762cf	2021-02-27 02:21:36 +01:00
Bartosz Dziewoński	35738b1f9b	CommentParser: Replace getThreadStartTimestamp with getThreadStartComment Change-Id: Ia8d878594306b5ce4039ca06d6dcec753e5dea28	2021-02-24 12:26:58 +00:00
Bartosz Dziewoński	1998c983f1	computeId() can't return null It used to return null for headings, but now it doesn't. Simplify some code checking for that. Change-Id: I28131c4aee89b901879b4c49953d6b15ed91b5e7	2021-02-13 00:08:15 +01:00
Ed Sanders	d05109b24d	Truncate user generated part of IDs to 80 characters This ensures that IDs fit in a 255 character database field. Bug: T273658 Change-Id: I3cfe4fce6a865b4343f0f01121cd696aa5f98b22	2021-02-03 15:04:58 +00:00
Ed Sanders	6c3dd3aaa9	Move Hooks to HookUtils Now that all the real hooks have been separated out Change-Id: Ibdb42f98614fc551068f8f8e5297dcc99251ab46	2021-02-01 22:35:11 +00:00
Bartosz Dziewoński	c781b127c9	Handle category links at ends of comments affecting indentation * Ignore rendering-transparent nodes between discussion comments. * Improve isRenderingTransparentNode() so that <link> nodes representing TemplateStyles are not considered transparent, otherwise this would undo `ae920b831f`. Using a regexp from Parsoid. Bug: T272746 Change-Id: I0b3c3251156ba6c4826abf5ba44ea93f80ebc01d	2021-01-26 04:55:03 +01:00
Bartosz Dziewoński	8f42c74985	Fix skipping to the end of paragraph, now it considers nested tags Add yet another tree walking utility: CommentUtils::linearWalk(). Unlike TreeWalker, it allows handling the beginnings and ends of nodes separately – kind of like parsing an XML token stream, or kind of like VisualEditor's linear model. (Add unit tests for this utility. The simple.html test case is copied from [VisualEditor/VisualEditor]/demos/ve/pages/simple.html.) Use this utility to stop skipping when we reach either a closing or opening block node tag. Previously we'd skip over such tags inside nested "transparent" nodes (like <a>, <del>, or apparently <font>). Bug: T271385 Change-Id: I201a942eb3a56335e84d94e150ec2c33f8b4f4e0	2021-01-18 18:20:20 +00:00
Bartosz Dziewoński	50ad5bb2b4	Ignore outdent templates at the beginning of comments Bug: T264116 Change-Id: Iae9dbb30a1aead897cc274f655d3ecff4b297dbd	2020-12-14 21:35:56 +01:00
Bartosz Dziewoński	ae920b831f	Change which nodes are ignored at the beginning of comments again While working on T270009, I noticed that <style> and <link> nodes are treated differently, which seemed weird. Rewrite this again, hopefully this is the last time. The changed test cases also involve <area> and <input> nodes, and the new results make more sense to me. Bug: T264116 Change-Id: I3af90c84768a4b3dc53446927f4dba6f72175a2f	2020-12-14 21:33:50 +01:00
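
Two of the commits listed above (4a0802065c and d05109b24d) describe how comment IDs are made wikitext safe and kept short: hyphens replace pipes as separators, spaces in usernames become underscores, and the user-generated part is truncated to 80 characters. A minimal sketch of that idea follows. The function names, the constant name, the "c" prefix, and the choice to count code points are all assumptions made for illustration; none of this is taken from the extension's source.

```javascript
// Illustrative sketch only: every name here is hypothetical,
// not the extension's actual API.

// Limit described in commit d05109b24d: the user-generated part of an ID
// is cut to 80 characters.
const MAX_USER_PART_LENGTH = 80;

// Truncate by code points so a surrogate pair is never split.
// (Assumption: the real code may count bytes or UTF-16 units instead.)
function truncateForId( text ) {
	return Array.from( text ).slice( 0, MAX_USER_PART_LENGTH ).join( '' );
}

// Build an ID from an author name and a timestamp string.
function makeCommentId( author, timestamp ) {
	// Underscores for spaces in usernames (commit 4a0802065c).
	const safeAuthor = truncateForId( author.replace( / /g, '_' ) );
	// Hyphens, not pipes, as separators: a pipe has special meaning
	// in wikitext links and templates.
	return [ 'c', safeAuthor, timestamp ].join( '-' );
}

console.log( makeCommentId( 'Example User', '2021-03-18T19:45:00.000Z' ) );
// prints "c-Example_User-2021-03-18T19:45:00.000Z"
```

Truncating only the username, rather than the whole ID, keeps the fixed-format timestamp intact; the commit message gives fitting IDs into a 255 character database field as the motivation for the limit.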