Previously we checked that the elements were reference-equal,
which is fragile and breaks when linear data freezing is
enabled in debug mode.
Change-Id: Ifb0ba3caf8d3e5a67c9694358cac12cc412fe723
We added this line in order to make the sidebar show/hide behavior
the same as the new VisualEditor template dialog, but it should have
only affected this one button. When additional actions are added,
such as Citoid's "change reference type" button, these should still
be available.
Bug: T293280
Change-Id: I6b2c716fff991781e36ba21b541ea2ff918cfeb3
Use @property to provide the types of undeclared variables to Phan and
PHPStorm, as in my NodeData patch. Declare $dp->tmp since it is
commonly used and does not affect the JSON serialized output since it is
always stripped.
I omitted the constructor, instead of following the suggestion in the
massageLoadedDataParsoid doc comment which proposed ingesting a
JSON-like data structure in the constructor. I thought it would be more
efficient to have the initial property assignments inline in the calling
code. This means breaking up many object cast expressions into
individual assignments.
In IncludeOnly, the null coalescing operator was only handling the case
where $start->dataAttribs was unset, which seems unlikely. I made it so
that it checks whether $start->dataAttribs->tsr is unset.
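A minimal sketch of the new check, assuming a plain stdClass stands in for the token and its data object (the helper name tsrOrNullSketch is hypothetical, and the original expression is not reproduced here):

```php
<?php
// Hypothetical helper: returns the tsr value, or null when either
// dataAttribs or dataAttribs->tsr is unset. The ?? operator applies an
// isset()-style check to the whole property chain, so no warning is
// raised for a missing intermediate property.
function tsrOrNullSketch( stdClass $start ): ?array {
	return $start->dataAttribs->tsr ?? null;
}

$withTsr = new stdClass();
$withTsr->dataAttribs = new stdClass();
$withTsr->dataAttribs->tsr = [ 0, 5 ];

$withoutTsr = new stdClass();
$withoutTsr->dataAttribs = new stdClass(); // dataAttribs set, tsr unset

$bare = new stdClass(); // dataAttribs itself unset
```

All three shapes are handled by the same expression, which is why checking the innermost property is the more useful form.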
I added strongly typed clone() methods, to preserve type information for
static analysis.
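To illustrate the @property annotations and the typed clone() described above, here is a trimmed-down sketch. The class name DataParsoidSketch and the property types are assumptions for illustration, not the real Parsoid class:

```php
<?php
/**
 * Hypothetical sketch; the property list and types are assumptions.
 *
 * @property array|null $tsr Undeclared property, typed for Phan/PHPStorm
 * @property stdClass|null $tmp Scratch data, stripped before serialization
 */
#[\AllowDynamicProperties]
class DataParsoidSketch {
	/**
	 * Typed wrapper around the clone operator, so static analysis sees
	 * DataParsoidSketch as the result type. This is a shallow copy.
	 */
	public function clone(): self {
		return clone $this;
	}
}

// Inline property assignments in the calling code, instead of passing a
// JSON-like structure to a constructor or casting an array to an object.
$dp = new DataParsoidSketch();
$dp->tsr = [ 0, 5 ];

$copy = $dp->clone();
$copy->tsr = [ 1, 2 ];
```

Because arrays are copied by value, changing $copy->tsr leaves $dp->tsr untouched; nested objects such as $tmp would still be shared by a shallow copy like this one.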
DataParsoid is the type of the data in both the DOM and in tokens. To
simplify the changes to the Token hierarchy, I removed the duplicate
definitions of the public properties $attribs and $dataAttribs.
Change-Id: I16172083e7e9bcb94601d1d6862d1d202a7e3660
This is even more reliable because it also considers auto-values,
for example.
Depends-On: I522b888e366f066b28983a18041a8728d11623df
Change-Id: If83b9da65be9a759a82e8512ae171f802da9f597
* Most of these are remnants from the Parsoid/JS codebase.
* This change follows the pattern we've been using everywhere
since the port from JS->PHP.
* Also reduces instruction count by about 0.2%.
Change-Id: Ibf21104f6722c34299f03e303dc3401bf053a751
Review regex usage, and use an alternative where possible, to improve
performance.
* Add PHPUtils::stripPrefix() and PHPUtils::stripSuffix(). Benchmark in
doc comment.
* /foo/ -> str_contains()
* /^foo/ -> str_starts_with()
* /^f/ -> ($s[0] ?? '') === 'f'
* /foo$/ -> str_ends_with()
* /^(foo|bar)$/ -> in_array(), benchmark suggests 10x improvement
* preg_replace(/foo/) -> str_replace()
* preg_replace(/^[abc]/) -> strspn(), benchmark suggests 3x improvement.
Curiously, it is faster without a limit for short input strings,
although a limit presumably adds robustness.
* preg_replace(/[abc]+$/) -> rtrim()
* preg_match_all() -> substr_count()
* In DOMUtils::hasTypeOf(), use explode() instead of a regex. Validated
by a benchmark.
* In DOMUtils::addTypeOf(), stop normalizing adjacent spaces. This
allows us to use implode(explode()) without a filtering loop. The
patch to Ext/Cite/References.php was to remove spaces added by this
change. The parserTests.txt changes were a consequence of the
References.php change.
* In LinkHandlerUtils::getHref() I allowed a single bare slash to be
counted as a path-absolute URL since I think that was the intention of
the original code.
* In LinkHandlerUtils::getLinkRoundTripData() I captured the portion of
interest from the previous regex instead of running a second regex.
* LinkHandlerUtils::linkHandler() had the regex
/^mw:WikiLink|mw:MediaLink$/ which I think was a bug, missing
parentheses. I fixed the bug.
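The effect of the missing parentheses can be sketched as follows. The function names are hypothetical, and the in_array() form is only one possible fix; the message does not say which form the fix took:

```php
<?php
// Without a group, ^ binds only to the first alternative and $ only to
// the second, so this matches "starts with mw:WikiLink" OR
// "ends with mw:MediaLink".
function isLinkRelBuggySketch( string $rel ): bool {
	return preg_match( '/^mw:WikiLink|mw:MediaLink$/', $rel ) === 1;
}

// Grouping the alternatives anchors both ends of both alternatives.
function isLinkRelGroupedSketch( string $rel ): bool {
	return preg_match( '/^(?:mw:WikiLink|mw:MediaLink)$/', $rel ) === 1;
}

// Assumed alternative: an exact membership test without any regex.
function isLinkRelExactSketch( string $rel ): bool {
	return in_array( $rel, [ 'mw:WikiLink', 'mw:MediaLink' ], true );
}
```

With these definitions, a value such as 'mw:WikiLink/Interwiki' is accepted by the ungrouped pattern but rejected by the other two.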
The margins are pretty tight for a lot of these. Using polyfills for
str_contains() etc. might change the conclusion.
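A few of the replacements listed above can be sketched without PHP 8 functions. All function names here are hypothetical; in particular, stripPrefixSketch() only approximates the idea of PHPUtils::stripPrefix(), whose real signature is not shown in this message:

```php
<?php
// Hypothetical stand-in for a prefix-stripping helper: removes $prefix
// from the start of $subject if present, otherwise returns it unchanged.
function stripPrefixSketch( string $subject, string $prefix ): string {
	$len = strlen( $prefix );
	if ( $len > 0 && strncmp( $subject, $prefix, $len ) === 0 ) {
		return substr( $subject, $len );
	}
	return $subject;
}

// preg_replace( '/^[abc]/', '', $s ): drop at most one leading byte
// from the set. The length argument 1 is the "limit" variant.
function stripOneLeadingSketch( string $s, string $chars ): string {
	return strspn( $s, $chars, 0, 1 ) === 1 ? substr( $s, 1 ) : $s;
}

// /^(foo|bar)$/: exact membership test instead of an anchored alternation.
function isFooOrBarSketch( string $s ): bool {
	return in_array( $s, [ 'foo', 'bar' ], true );
}

// hasTypeOf-style check: split the space-separated attribute value
// and look for an exact token.
function hasTypeSketch( string $typeof, string $type ): bool {
	return in_array( $type, explode( ' ', $typeof ), true );
}
```

These are not strictly identical to the regex forms in every edge case: PCRE's `$` also matches before a trailing newline, so an input such as "foo\n" matches /^(foo|bar)$/ but fails the in_array() test.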
Also:
* In DOMUtils::matchTypeOf(), avoid calling hasAttribute().
getAttribute() is documented as returning an empty string if the
attribute does not exist.
Change-Id: I8d7bdf1bccc869b4dc17058a5822ef34968471e6
Follow up to 47dd898
Also renames a variable to be consistent in the two places we get
contents for the ref.
Change-Id: I13e61b8911ff16549fbb0888b9c3313ed5e7701e
Follow up to 47dd898
Fixes the test case found in rt,
php bin/parse.php --domain ceb.wikipedia.org --pageName "Martin Van Buren" --offsetType ucs2 < /dev/null
The offsetType is necessary so that the ConvertOffsets pass runs. The
crash occurs because we stored the unprocessed html, so the embedded
html still contains the sealed ref fragments.
Change-Id: Ic1e1c3e54433bf6d7574420c2eade1349261de0b
Restores linkbacks for ref-in-ref.
Follow up to 568034a where it's noted that it's fine to maintain
linkbacks for ref-in-ref, as long as the ref isn't a named ref that's
trying to redefine the contents for that name, in which case we embed
the contents.
A test case for this can be,
```
<ref name="hiho">off to work</ref>
{{#tag:ref|<i>we go <ref name="ohno">ohno</ref></i>|name="hiho"}}
{{#tag:ref|<i>we go <ref name="ohno2">ohno2</ref></i>|name="test"}}
```
The linkback to #cite_ref-ohno2_3-0 is present while continuing to
suppress the dangling linkback to #cite_ref-ohno_2-0, since that's in
embedded content. On master, both linkbacks are unnecessarily
suppressed.
Bug: T289331
Change-Id: Ifcf7464e86a4408f5dd9e2fd6d3aa47a0670ca49
This will be helpful in a subsequent patch where we make use of that
data while processing refs in refs. Differing content implies that
we'll be embedding it for roundtripping, rather than putting it in the
DOM.
Change-Id: I7bd1d4c503fc58f862960bec82ca514fc29d7eff
This moves determining if we already have a reference created for a
named ref outside of that function, which is helpful for making use of
the cached html for that ref earlier.
Change-Id: Ie416bd95b980f9f95111d7e420945f40e2ada747
The message was only shown when a new reflist was inserted, or when
any references were changed.
Bug: T284472
Change-Id: I7c1e981c93bf7e163f9fb747aad30a24e9a497f1