Commit graph

4870 commits

Author SHA1 Message Date
Tim Starling 0cc211d675 Make DataParsoid be a real class
Use @property to provide the types of undeclared variables to Phan and
PHPStorm, as in my NodeData patch. Declare $dp->tmp since it is
commonly used and does not affect the JSON serialized output since it is
always stripped.

I omitted the constructor, instead of following the suggestion in the
massageLoadedDataParsoid doc comment which proposed ingesting a
JSON-like data structure in the constructor. I thought it would be more
efficient to have the initial property assignments inline in the calling
code. This means breaking up many object cast expressions into
individual assignments.

In IncludeOnly, the null coalescing operator was only handling the case
where $start->dataAttribs was unset, which seems unlikely. I made it so
that it checks whether $start->dataAttribs->tsr is unset.

I added strongly typed clone() methods, to preserve type information for
static analysis.

DataParsoid is the type of the data in both the DOM and in tokens. To
simplify the changes to the Token hierarchy, I removed the duplicate
definitions of the public properties $attribs and $dataAttribs.

Change-Id: I16172083e7e9bcb94601d1d6862d1d202a7e3660
2021-10-13 10:20:15 +11:00
C. Scott Ananian 30cfb7c05a Rename deprecated usage of ParserOutput::{get,set}Property()
Bug: T287216
Depends-On: Ie963eea5aa0f0e984ced7c4dfa0fd65d57313cfa
Change-Id: Id4581c6c45f9fc4690900a30d8172951bc461a1b
2021-10-08 10:12:05 -04:00
Translation updater bot 4c55bcbd94 Localisation updates from https://translatewiki.net.
Change-Id: I15bdf21a6913116a26f0db2d35c41bff9189b8d2
2021-10-06 09:28:06 +02:00
libraryupgrader 33e2ff79b5 build: Updating npm dependencies
* @wdio/mocha-framework: 7.4.6 → 7.13.2
  * https://npmjs.com/advisories/5197 (CVE-2021-3807)
* ansi-regex: 5.0.0 → 5.0.1
  * https://npmjs.com/advisories/5197 (CVE-2021-3807)

Additional changes:
* composer.json: Updated phpcs command in composer test (T280592).
* composer.json: Added phpcs command to scripts (T280592).

Change-Id: I9aa26cf3664857fac671dc15718e5341798625d2
2021-10-04 13:21:21 +00:00
Translation updater bot 00c1a28db6 Localisation updates from https://translatewiki.net.
Change-Id: I1e3932d60abf27cdab341530ce2f948b9dc65f98
2021-10-04 08:55:28 +02:00
Translation updater bot ca7e992234 Localisation updates from https://translatewiki.net.
Change-Id: Id817e900b7a78973f3b3f99bbd366668d8ec3513
2021-10-01 09:29:40 +02:00
Subramanya Sastry ed59e2ac38 Sync up with Parsoid citeParserTests.txt
This now aligns with Parsoid commit 29f8e7051529ecbb62fc52bff6726a4df8bf20c2

Change-Id: I165ee24e1b78bdf181fa45430fdec1549310c359
2021-09-30 15:00:56 -05:00
WMDE-Fisch a8ea95de72 Fix class doc block for VE action
Seems to be a copy&paste leftover.

Change-Id: Ie5c06dd0d880663a0a4f1bfeb082a106811f03e7
2021-09-30 10:13:39 +00:00
Translation updater bot 79fc5b898e Localisation updates from https://translatewiki.net.
Change-Id: Id4eae6b5d47b9657088d5766c23b09ef534462b2
2021-09-30 08:57:12 +02:00
Translation updater bot 8ad00930a0 Localisation updates from https://translatewiki.net.
Change-Id: I44760079ed1dc6e51c65ccae2454f75c7643f629
2021-09-29 09:01:33 +02:00
jenkins-bot ecf603a5fd Merge "Make citation dialog behave more like VE" 2021-09-28 09:14:05 +00:00
Translation updater bot c62d362b61 Localisation updates from https://translatewiki.net.
Change-Id: I5e52490d75ef5ad89f7ea670716009ac0015eb88
2021-09-28 08:18:31 +02:00
Thiemo Kreuz 8902cea828 Use .containsValuableData() method from transclusion model
This is even more reliable because it also considers auto-values,
for example.

Depends-On: I522b888e366f066b28983a18041a8728d11623df
Change-Id: If83b9da65be9a759a82e8512ae171f802da9f597
2021-09-27 11:19:28 +02:00
Translation updater bot 6d512c7667 Localisation updates from https://translatewiki.net.
Change-Id: I4c470c517972cbe8322cddf93d99ba1df24f689d
2021-09-27 08:34:25 +02:00
Translation updater bot 3d14044c4e Localisation updates from https://translatewiki.net.
Change-Id: Ie8333fbd6f24dfb6422229a0648788f9ddc241f8
2021-09-23 12:27:21 +02:00
Umherirrender c0ace40aaa build: Updating mediawiki/mediawiki-phan-config to 0.11.0
Change-Id: Ifb2eec4e791fd0de0a50d8ef85e0947ab9a891e7
2021-09-21 12:13:36 -05:00
Subramanya Sastry f7bc278673 DOMUtils: Get rid of isElt, isText, isComment helpers
* Most of these are remnants from the Parsoid/JS codebase.
* This change follows the pattern we've been using everywhere
  since the port from JS->PHP.
* Also reduces instruction count by about 0.2%.

Change-Id: Ibf21104f6722c34299f03e303dc3401bf053a751
2021-09-20 22:39:38 +00:00
Adam Wight ca2ffa4853 Make citation dialog behave more like VE
Applies new sidebar features to the Cite dialog, according to the VE
feature flags.

Bug: T291241
Change-Id: I1b7c191ae8fd1fa01808ea1e84ba72551f3d2331
2021-09-20 11:44:51 +02:00
Translation updater bot 408786d3ca Localisation updates from https://translatewiki.net.
Change-Id: I10a1fb9928638e0f554b45f850a6ec67912ca5e8
2021-09-17 08:45:09 +02:00
Translation updater bot 01acd4f775 Localisation updates from https://translatewiki.net.
Change-Id: I8ad6037c5b00e14e4a825db085752a1ff42c6968
2021-09-16 08:11:25 +02:00
Translation updater bot 0938867c6d Localisation updates from https://translatewiki.net.
Change-Id: Iff6f56cb0848ea3690131c6385e4735e4c0298a4
2021-09-14 08:32:28 +02:00
Tim Starling 8f3369b090 Avoid using regexes
Review regex usage, and use an alternative where possible, to improve
performance.

* Add PHPUtils::stripPrefix() and PHPUtils::stripSuffix(). Benchmark in
  doc comment.
* /foo/ -> str_contains()
* /^foo/ -> str_starts_with()
* /^f/ -> ($s[0] ?? '') === 'f'
* /foo$/ -> str_ends_with()
* /^(foo|bar)$/ -> in_array(), benchmark suggests 10x improvement
* preg_replace(/foo/) -> str_replace()
* preg_replace(/^[abc]/) -> strspn(), benchmark suggests 3x improvement.
  Curiously, it is faster without a limit for short input strings,
  although a limit presumably adds robustness.
* preg_replace(/[abc]+$/) -> rtrim()
* preg_match_all() -> substr_count()
* In DOMUtils::hasTypeOf(), use explode() instead of a regex. Validated
  by a benchmark.
* In DOMUtils::addTypeOf(), stop normalizing adjacent spaces. This
  allows us to use implode(explode()) without a filtering loop. The
  patch to Ext/Cite/References.php was to remove spaces added by this
  change. The parserTests.txt changes were a consequence of the
  References.php change.
* In LinkHandlerUtils::getHref() I allowed a single bare slash to be
  counted as a path-absolute URL since I think that was the intention of
  the original code.
* In LinkHandlerUtils::getLinkRoundTripData() I captured the portion of
  interest from the previous regex instead of running a
  second regex.
* LinkHandlerUtils::linkHandler() had the regex
  /^mw:WikiLink|mw:MediaLink$/ which I think was a bug, missing
  parentheses. I fixed the bug.
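
The alternation bug described in the last item can be reproduced in any
regex engine with the same precedence rules. The sketch below uses Python's
re module instead of PHP's preg_match(), purely as an illustration. The
function names and the sample values are hypothetical and are not taken from
the Parsoid code.

```python
import re

# Without parentheses, "|" splits the whole pattern, so each anchor applies
# to only one alternative: (^mw:WikiLink) OR (mw:MediaLink$).
BUGGY = re.compile(r"^mw:WikiLink|mw:MediaLink$")

# Grouping the alternatives makes both anchors apply to both of them.
FIXED = re.compile(r"^(mw:WikiLink|mw:MediaLink)$")


def is_link_rel_buggy(rel: str) -> bool:
    """Check with the unparenthesized pattern (hypothetical helper)."""
    return BUGGY.search(rel) is not None


def is_link_rel_fixed(rel: str) -> bool:
    """Check with the parenthesized pattern (hypothetical helper)."""
    return FIXED.search(rel) is not None


def is_link_rel_no_regex(rel: str) -> bool:
    """Regex-free membership test, in the spirit of /^(foo|bar)$/ -> in_array()."""
    return rel in ("mw:WikiLink", "mw:MediaLink")


# Illustrative inputs: two exact values, one with extra text after the
# first alternative, one with extra text before the second.
for rel in ("mw:WikiLink", "mw:MediaLink", "mw:WikiLink/Interwiki", "x-mw:MediaLink"):
    print(rel, is_link_rel_buggy(rel), is_link_rel_fixed(rel), is_link_rel_no_regex(rel))
```

The unparenthesized pattern accepts the last two inputs because only one
anchor constrains each alternative. The grouped pattern and the membership
test both reject them.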

The margins are pretty tight for a lot of these. Using polyfills for
str_contains() etc. might change the conclusion.
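
As a rough cross-check of the pattern-to-function rewrites in the list
above, the following sketch pairs each regex with a regex-free test and
reports any input where the two disagree. It is a Python analogue, not the
PHP code from this change: the Python string operations stand in for
str_contains(), str_starts_with(), str_ends_with() and in_array(), and the
sample strings are made up.

```python
import re

# Each entry: (description, regex-based check, regex-free check).
EQUIVALENTS = [
    ("/foo/ -> substring test",
     lambda s: re.search(r"foo", s) is not None,
     lambda s: "foo" in s),
    ("/^foo/ -> prefix test",
     lambda s: re.search(r"^foo", s) is not None,
     lambda s: s.startswith("foo")),
    ("/^f/ -> first-character test",
     lambda s: re.search(r"^f", s) is not None,
     lambda s: s[:1] == "f"),  # s[:1] is "" for an empty string, like $s[0] ?? ''
    ("/foo$/ -> suffix test",
     # \Z anchors at the very end; "$" would also match before a final newline.
     lambda s: re.search(r"foo\Z", s) is not None,
     lambda s: s.endswith("foo")),
    ("/^(foo|bar)$/ -> membership test",
     lambda s: re.fullmatch(r"foo|bar", s) is not None,
     lambda s: s in ("foo", "bar")),
]

# Made-up sample inputs, including the empty string.
SAMPLES = ["", "f", "foo", "bar", "foobar", "barfoo", "xfoox"]


def disagreements():
    """Return (description, sample) pairs where the two checks differ."""
    return [
        (description, sample)
        for description, with_regex, without_regex in EQUIVALENTS
        for sample in SAMPLES
        if with_regex(sample) != without_regex(sample)
    ]


print(disagreements())
```

An empty result only shows that the two forms agree on these samples. It
says nothing about relative speed, which is what the benchmarks mentioned in
this commit measured for PHP.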

Also:

* In DOMUtils::matchTypeOf(), avoid calling hasAttribute().
  getAttribute() is documented as returning an empty string if the
  attribute does not exist.

Change-Id: I8d7bdf1bccc869b4dc17058a5822ef34968471e6
2021-09-13 23:01:45 +00:00
Translation updater bot c72760da66 Localisation updates from https://translatewiki.net.
Change-Id: I6a165a1d30035e3c8e00c5147d86bb34442c7dd7
2021-09-10 08:15:19 +02:00
Translation updater bot 40aa379b43 Localisation updates from https://translatewiki.net.
Change-Id: I2a537a3cb9be4d7f09889404634115e561845d56
2021-09-09 08:17:17 +02:00
libraryupgrader c4872e1e8f build: Updating composer dependencies
* mediawiki/mediawiki-phan-config: 0.10.6 → 0.11.0
* php-parallel-lint/php-parallel-lint: 1.3.0 → 1.3.1

Change-Id: I89eb4438dacb2d0ab36f4986c0675eec5cb61731
2021-09-08 19:30:00 +00:00
Translation updater bot efd8590420 Localisation updates from https://translatewiki.net.
Change-Id: I38cfab8051a01449caab2253b4b3fc9ef3ff484e
2021-09-06 08:20:27 +02:00
libraryupgrader 13677c06a6 build: Updating stylelint-config-wikimedia to 0.11.1
Change-Id: I0e1764c11a81c67c40428004bfd0e5ce873e8f69
2021-09-04 18:50:51 +00:00
Translation updater bot bfb49e4204 Localisation updates from https://translatewiki.net.
Change-Id: I298a4e0a11fe8f0f9f45626ae3dc06e56aec794c
2021-09-03 08:14:47 +02:00
Translation updater bot a4e65b96ba Localisation updates from https://translatewiki.net.
Change-Id: I3dbbd6824a2ca5b0449574ce95e02267b8e60c6f
2021-09-02 08:26:01 +02:00
jenkins-bot 6f2c6d4ef8 Merge "Show empty reflist message on initial load and after switching too" 2021-09-01 16:48:10 +00:00
Translation updater bot 7dae290d78 Localisation updates from https://translatewiki.net.
Change-Id: I2edb0e08ab9339b2f55513b2c0c5de03371a7e7f
2021-08-31 08:15:29 +02:00
Arlo Breault 66355c1ddc Migrate out valid follow contents after processing refs
Follow up to 47dd898

Also renames a variable to be consistent in the two places we get
contents for the ref.

Change-Id: I13e61b8911ff16549fbb0888b9c3313ed5e7701e
2021-08-27 15:00:54 -04:00
Arlo Breault 5c7c37e0c9 Reserialize processed refs if content differs
Follow up to 47dd898

Fixes the test case found in rt,
php bin/parse.php --domain ceb.wikipedia.org --pageName "Martin Van Buren" --offsetType ucs2 < /dev/null

The offsetType is necessary so that the ConvertOffsets pass runs.  The
crasher here is because the embedded html still contains the sealed ref
fragments because we've stored the unprocessed html.

Change-Id: Ic1e1c3e54433bf6d7574420c2eade1349261de0b
2021-08-27 15:00:37 -04:00
Subramanya Sastry 0d26fd19d5 Cite: Rename functions pushing/popping embedded content flags
Change-Id: Ie8736fcc139caba467209b7ba57daaa8f53bc18a
2021-08-26 11:43:52 -05:00
Arlo Breault 47dd8989a7 Don't process ref-in-ref as embedded, unless content differs
Restores linkbacks for ref-in-ref.

Follow up to 568034a where it's noted that it's fine to maintain
linkbacks for ref-in-ref, as long as the ref isn't a named ref that's
trying to redefine the contents for that name, in which case we embed
the contents.

A test case for this can be,

```
<ref name="hiho">off to work</ref>
{{#tag:ref|<i>we go <ref name="ohno">ohno</ref></i>|name="hiho"}}
{{#tag:ref|<i>we go <ref name="ohno2">ohno2</ref></i>|name="test"}}
```

The linkback to #cite_ref-ohno2_3-0 is present while continuing to
suppress the dangling linkback to #cite_ref-ohno_2-0, since that's in
embedded content.  On master, both linkbacks are unnecessarily
suppressed.

Bug: T289331
Change-Id: Ifcf7464e86a4408f5dd9e2fd6d3aa47a0670ca49
2021-08-26 16:41:02 +00:00
Arlo Breault d0e1637d22 Move content differ check up higher
This will be helpful in a subsequent patch where we make use of that
data while processing refs in refs.  Content differing implies that
we'll be embedding it for roundtripping, rather than putting in the dom.

Change-Id: I7bd1d4c503fc58f862960bec82ca514fc29d7eff
2021-08-26 16:38:58 +00:00
Arlo Breault 50dfe518cc Only call ReferencesData::add when adding
This moves determining if we already have a reference created for a
named ref outside of that function, which is helpful for making use of
the cached html for that ref earlier.

Change-Id: Ie416bd95b980f9f95111d7e420945f40e2ada747
2021-08-26 16:37:36 +00:00
Translation updater bot b16f0d9e68 Localisation updates from https://translatewiki.net.
Change-Id: I99a39c2d7f759d03aaa890fa1365f7deab5c0d6f
2021-08-26 08:35:40 +02:00
Bartosz Dziewoński 28a8739ce5 Show empty reflist message on initial load and after switching too
The message was only shown when a new reflist was inserted, or when
any references were changed.

Bug: T284472
Change-Id: I7c1e981c93bf7e163f9fb747aad30a24e9a497f1
2021-08-24 12:24:07 +02:00
Translation updater bot 3705438aa0 Localisation updates from https://translatewiki.net.
Change-Id: I3d7ab035b924f402c627da045790cd98c03f66d2
2021-08-23 09:03:41 +02:00
Translation updater bot 66125caae0 Localisation updates from https://translatewiki.net.
Change-Id: I633634a17f619b770828aebe952a8b4b922fbeb7
2021-08-19 08:13:13 +02:00
vladshapik 71c4bd1f98 Adjust Parser related tests to DeprecationHelper
A pending patch (I4297aea3489bb66c98c664da2332584c27793bfa) will add the
DeprecationHelper trait to the Parser class in order to deprecate the
public Parser::mUser property. The DeprecationHelper trait has magic
methods that support dynamic properties. To avoid mocking those methods,
as createMock() would, getMockBuilder() and onlyMethods() are used
instead. onlyMethods() specifies which methods need to be mocked.
Now we can use dynamic properties in Parser-related tests of the Cite
extension.

Bug: T285713
Change-Id: Ie75c9cd66d296ce7cf15432e2093817e18004443
2021-08-17 14:14:55 +00:00
Translation updater bot ace7aff1db Localisation updates from https://translatewiki.net.
Change-Id: Id674072833a1dff7c4f15876a96109eb6c232992
2021-08-16 08:16:00 +02:00
Translation updater bot bdba8e61f7 Localisation updates from https://translatewiki.net.
Change-Id: I23d8eb3a038fe69608728244de6c0b20e1fd82d6
2021-08-12 08:09:25 +02:00
Translation updater bot 095d3bf6a9 Localisation updates from https://translatewiki.net.
Change-Id: I56527e34a9a4bf2ef75f5e26a0b075d1253d31b9
2021-08-11 08:19:06 +02:00
Translation updater bot 60b3124734 Localisation updates from https://translatewiki.net.
Change-Id: I43fbe0ee2818205c821ecb892a86e0f019e90891
2021-08-09 08:23:37 +02:00
Translation updater bot 4c6ef3c88d Localisation updates from https://translatewiki.net.
Change-Id: I5d351fce9a72cbc86af5b886c11f4db907e30fe7
2021-08-05 08:21:55 +02:00
Translation updater bot ba7bfbd2a5 Localisation updates from https://translatewiki.net.
Change-Id: Ia05fa0aa665dbf29f72480f844c62264bd56029c
2021-08-04 08:14:41 +02:00
C. Scott Ananian 187de4b769 The ::querySelectorAll() and ::getElementsBy* helpers don't always return array
The standard type for these returns is NodeList and HTMLCollection, which
are almost *but not quite* the same as an array.  In two places we got a
little complacent and assumed our non-standard DOMCompat workarounds would
always return arrays.  Tweaked the types of DOMCompat to report that they
return an `iterable`, which is a PHP7.1 "pseudo-type" that unifies
arrays and \Traversable types like HTMLCollection/NodeList.  This
allows phan to catch places where we slip up and assume an array type
return.

It does introduce a new wrinkle, though, since there is no simple way
to turn an iterable into an array.  We're using a simple
`iterable_to_array` helper function for this.

Change-Id: I35bdeb3afa30ef5182e71733a0a606aadcafb435
2021-07-31 03:50:07 +00:00
C. Scott Ananian a1d0fdd776 Allow Node::getAttribute() to return null
In PHP's DOM extension, one of the legacy bugs is that
DOMNode::getAttribute() can never return `null` (to indicate that the
attribute is missing), instead it returns an empty string in that
case.  This isn't (modern) spec-compliant behavior (it's a leftover
from ancient times) and we had to watch this carefully when porting
from JS.

In the time since the port, we've written new code that embeds the
assumption that DOMNode::getAttribute() will never return null.
Fix this.  Always use `getAttribute(...)
?? ''` (unless we're just doing an equality test against a non-empty
string, or the code is preceded by a `hasAttribute` test) so that our
code will work whether or not getAttribute returns null for a missing
attribute.
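
The defensive pattern can be sketched outside PHP as well. Python's
xml.dom.minidom happens to share the legacy behaviour (getAttribute()
returns an empty string for a missing attribute), so it stands in for PHP's
DOM extension here. SpecCompliantElement and attribute_or_empty are
hypothetical names made up for this illustration, and attribute_or_empty is
only an approximation of `getAttribute(...) ?? ''`.

```python
from xml.dom import minidom


class SpecCompliantElement:
    """Hypothetical stand-in for a DOM whose getAttribute() returns None
    when the attribute is missing."""

    def __init__(self, attrs):
        self._attrs = dict(attrs)

    def getAttribute(self, name):
        return self._attrs.get(name)

    def hasAttribute(self, name):
        return name in self._attrs


def attribute_or_empty(element, name):
    """Python spelling of `$node->getAttribute($name) ?? ''`."""
    value = element.getAttribute(name)
    return "" if value is None else value


# Legacy-style element: a missing attribute already comes back as "".
legacy = minidom.parseString('<span typeof="mw:Transclusion"/>').documentElement
# Spec-style element: a missing attribute comes back as None.
modern = SpecCompliantElement({"typeof": "mw:Transclusion"})

for element in (legacy, modern):
    # The helper gives the same answers for both kinds of element.
    print(repr(attribute_or_empty(element, "typeof")),
          repr(attribute_or_empty(element, "about")))
```

Code written against the helper behaves the same under both
implementations, which is the property this change is after.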

Change-Id: If33200e1053b2dd79abb5dfb3808c05ff3a0bbba
2021-07-30 20:34:47 +00:00