mediawiki-extensions-Cite

mirror of https://gerrit.wikimedia.org/r/mediawiki/extensions/Cite synced 2024-12-18 09:40:49 +00:00

Author	SHA1	Message	Date
Arlo Breault	4310b6a243	Mark up cite errors in embedded content It's a feature of named refs that we only know at the time of inserting the references list whether they have content or not, and are therefore in err. The strategy of `4438a72` was to keep pointers to all named ref nodes so that if an error does occur, we can mark them up. The problem with embedded content is that, at the time when we find out about the errors, it's been serialized and stored, and so any pointers we might have kept around are no longer live or relevant. We need to go back and process all that embedded content again to find where the refs with errors are hiding. This patch slightly optimizes that by keeping a map of all the errors for refs in embedded content so that only one pass is necessary, rather than for each references list. Also note that, in the common case, this pass won't run since we won't have any errors in embedded content. Bug: T266356 Change-Id: I32e7bfa796cd4382c43b3b1d17b925dc97ce9f7f	2020-11-06 18:31:26 -05:00
Arlo Breault	c675396445	Fix adding 'cite_error_group_refs_without_references' to unnamed refs Follow up to `02fb17d`, which was only iterating over named refs. Bug: T51538 Change-Id: I1a1ce39029c2e9e6e29e768675bcde266ccf3247	2020-11-06 13:14:03 -05:00
Arlo Breault	049735ba0e	Clean up signatures of ref group accessors No need to hedge on null. Change-Id: I2afb7619a113d784741bd7d29eccf4d8368fe56f	2020-11-06 17:45:18 +00:00
Arlo Breault	0254f138ab	Suppress linkbacks for all refs in embedded content Not just for refs in references content, since they'll be equally inaccessible everywhere. Change-Id: Id0a2361b41d9b8103e011ff4f809fa0809169bb3	2020-11-06 17:45:16 +00:00
Arlo Breault	1dda4cdc8a	Consolidate adding ref errors at references insertion Change-Id: I01ce55989fb7b822320c63ddad19c2edf7e03bf9	2020-10-29 15:54:30 -04:00
Arlo Breault	8e237b4e34	Make $inEmbeddedContent an explicit stack Change-Id: I48ff2f7be352fdec72b2c5e0eeee843330ec3872	2020-10-23 11:42:45 -04:00
Arlo Breault	6bd0594f28	Don't keep pointers to nodes from embedded content Since the fragment they're subtrees of goes out of scope. Follow up to `2f09cdb` Previous to that patch this wasn't an issue because we were creating a whole document which is retained by the environment. Fixes the warnings from, "PHP Warning: DOMElement::getAttribute(): Couldn't fetch DOMElement" https://logstash.wikimedia.org/app/kibana#/doc/logstash-*/logstash-2020.10.02/parsoid-tests?id=AXTqaLL12lgCwKx7fVYz&_g=h@a06543d Tested on scandium with, node bin/roundtrip-test.js --proxyURL http://scandium.eqiad.wmnet:80 --parsoidURL http://DOMAIN/w/rest.php --domain vi.wikipedia.org "Vua_Việt_Nam" Change-Id: I74bc7de79b18054e19b77af25e978d3ab3a505e4	2020-10-02 15:57:33 -04:00
Arlo Breault	2f09cdb732	One document to rule them all The description in T179082 suggests that by using one document for the entire parse, we'd probably see some performance gains from not having to import nodes when we get to the top level pipeline and we'd avoid the validation errors from 19a9c3c. However, the spec seems to suggest creating a new document when parsing an HTML fragment, https://html.spec.whatwg.org/#html-fragment-parsing-algorithm And, indeed, domino implements it that way, `12a5f67136/lib/htmlelts.js (L84-L96)` So, the request in T217705 may be a little misguided. What then is this patch good for? In T221790 the ask is that sub-pipelines produce DocumentFragment which make for cleaner interfaces and less confusion when migrating children. The general outline here is that a document is created when the environment is constructed that gives us the 1-1 correspondence. Sub-pipelines do create their own documents for the purpose of tree building, as in the fragment parsing algorithm, but are then immediately imported to DocumentFragments to be used for the rest of the post-processing passes. Bug: T221790 Bug: T179082 Bug: T217705 Change-Id: Idf856d4e071d742ca38486c8ab402e39b3c8949f	2020-09-29 22:36:33 +00:00
Arlo Breault	bb34d30839	Follow up to "follow" functionality for Cite These refs get a `style="display: none;"` since they're not intended to be user visible. Follow refs with errors conform to the proposed spec in T251842 Bug: T51538 Change-Id: Ie4ea28e7f9afde24614874bb4b8e07c5cabafa12	2020-09-10 12:41:06 -04:00
sbailey	467b82701b	Adding "follow" functionality to the Cite extension * Interim state commit with experimental code. * Updates to citeParserTests.txt to check now valid follow functionality and newly passing tests. * Added to follow refs, <sup style="display: none;" about=... to suppress display of hidden sups needed for VE to use in editing follow refs. * Added code to implement follow functionality and catch invalid usage. Bug: T51538 Change-Id: Ic3ac8237fd2c490cfaf2fe799759742f72f10686	2020-09-09 19:25:14 -04:00
Arlo Breault	d6bcc0ef14	Prefer nullable types in comments This was done with a custom sniff in, MediaWiki/Sniffs/Commenting/FunctionCommentSniff.php `$singleType === 'null' && count( $explodedType ) === 2` since there's some ambiguity with, `what|type|null` but also a case like the following is left out, `string[]|null` Change-Id: I1bd50a4486d7ef4974280b476fd03d3ee53232b3	2020-07-29 14:24:32 -04:00
sbailey	4438a72297	Adding error handling for cite refs with name but no content * Detects grouped and named refs that fail to define content. * Uses group and name ref list tracking info to back patch 'mw:Error' and i18n error key string into the data-mw section of all instances of named refs that all fail to define content. * The failures for test References: 7b are because selser is arguably smarter than wt2wt. The newline before the references list has been randomly deleted but selser manages to restore it from source. wt2wt doesn't put the references tag on a line by itself, even though it asks for block format, because it isn't a new list - (these comments are from Arlo's review) * Added test: "References: 7b. Multiple references tags some with errors..." to ensure that refs with and without content errors grouped and named do not cross references section boundaries. Bug: T51538 Change-Id: I884fc337165506c5abbef18bcd5a5fca015786d2	2020-06-25 14:58:08 -04:00
Subramanya Sastry	0cc3ca1b98	Move DomSourceRange to Core; ParsoidExtensionApi to Ext * At this point, DSR is a first-class Parsoid concept and extensions will need to use this as well. So, make it part of the Core/ namespace to capture high-level concepts that might be used outside Parsoid itself. * Move ParsoidExtensionApi to the Ext directory since that is where it best belongs. Change-Id: If824c4af9e2f8d658f1cb726cbd837222b60790d	2020-03-16 15:52:08 +00:00
Subramanya Sastry	14d9ed27f0	Remove direct access to Sanitizer from extension code * Proxy all accesses to the sanitizer via appropriately named methods in the ParsoidExtensionApi interface Bug: T242746 Change-Id: I9d3d98639bb98b4abe404139786517591323d61d	2020-02-20 23:23:22 -06:00
Subramanya Sastry	d0a9c42c98	Cite: Remove more Parsoid internals knowledge * Remove use of $env from ReferencesData and RefGroup by providing high-level helpers in ParsoidExtensionAPI. - Given a fragment id, provide helpers to fetch fragment DOM or fragment HTML - Fetch the URI for the current page (being parsed) * There is still a lot of subtle knowledge Cite has about how data-parsoid and data-mw attributes are held off to the side in a bag and all the pp* and load/store manipulation of those attributes. It would be an interesting exercise to purge this implementation of those notions OR figure out high-level concepts that we document as being part of Parsoid reality that we'll forever support. Bug: T242746 Change-Id: I29ff154f2f17123b9756dfd2f3b422f0b30222b1	2020-02-11 19:47:28 +00:00
C. Scott Ananian	5d200e0bf0	Move all code from Parsoid to Wikimedia\Parsoid namespace This matches core conventions. Bug: T240054 Change-Id: I5feb8a6b41503accd01a740195256e9092609272	2020-02-03 21:34:49 +00:00
Arlo Breault	e6204a1561	Test against ref name length instead of coercing to bool Since "0" is falsy in php. Couple tests now pass. Change-Id: I9b62b9f78680de6e1d5c31723af7212a58a535f3	2019-08-14 18:59:28 -04:00
Pavel Astakhov	005176a355	Port Cite extension * All wt2wt, html2wt, and all but one html2html tests pass in hybrid mode when entire html2wt code is run in PHP Set "Serializer: true" in the html2wt section of phpconfig.yaml * The single failing html2html test is a <gallery> test which is presumably related to the unported <gallery> extension code, but not sure. Not investigating it now. * Update Parsoid Extension API to provide access to extension source without exposing internals. Change-Id: I6d6e21ad2324acfc4306b32c9055d6c088708c48	2019-06-21 16:23:42 -05:00
C. Scott Ananian	320d045ee8	Update automatically-generated PHP files w/ latest js2php Mostly comment formatting improvements, some significant code changes to the JS side. Change-Id: I7a8f2105173df74dc09f2024d68268f5dc6fa632	2019-06-05 17:13:34 -04:00
Arlo Breault	05cb13ddf9	Make extensions with post-processors return constructors This allows us to finish the cleanup started in 0b3bb10 and inline setupProcessors. Change-Id: Ia7840091607e9a75153031b5db7600d5a0018da6	2019-04-03 18:44:21 +00:00
Arlo Breault	20c627e3f4	Convert cite extension to es6 class structure Also, runs js2php on these files. Change-Id: Id8ee13ad536d75f63e0045a21fdfdb667a0df65d	2019-04-03 12:20:41 -04:00

21 commits
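
Several of the commits above (4438a72297, 4310b6a243, e6204a1561) deal with the same idea: whether a named ref has content is only known once the references list is inserted, so every use of that name has to be marked up after the fact. The sketch below illustrates that back-patching step in Python rather than the extension's PHP. It is not the Cite implementation: the `Ref` class, the `mark_refs_without_content` function, and the `example_error_no_content` key are hypothetical names chosen for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Ref:
    """Hypothetical stand-in for one <ref> use on a page."""
    group: str
    name: str
    content: str = ""
    errors: list = field(default_factory=list)


def mark_refs_without_content(refs):
    """Flag every use of a named ref whose (group, name) never defines content."""
    by_key = {}
    for ref in refs:
        # Test the name's length rather than its truthiness, in the spirit of
        # e6204a1561 (in PHP the string "0" is falsy; in Python it is not).
        if len(ref.name) > 0:
            by_key.setdefault((ref.group, ref.name), []).append(ref)

    # Only at this point, with all uses collected, do we know which names
    # never received content, so we go back and mark each use.
    for uses in by_key.values():
        if not any(use.content for use in uses):
            for use in uses:
                use.errors.append("example_error_no_content")  # hypothetical key
    return refs
```

Grouping by `(group, name)` keeps a name defined in one group from masking a missing definition in another. Unnamed refs are skipped because they cannot be reused, so they have no later definition to wait for.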