Commit graph

102 commits

Author SHA1 Message Date
Reedy 47172479bb ExtractFormatter: Update for HtmlFormatter 4.0.0
Bug: T330528
Depends-On: Id785cfd2e00762ca6c1ea80200dd4a3d197640f2
Change-Id: I496d84fab3ee8feee5891ca984f37e90837d857c
2023-08-22 20:50:03 +00:00
gerritbot 71e31619f2 Replace some moved Title class uses, now MediaWiki\Title\Title
Bug: T321681
Change-Id: I09a9c9f6294c9988feb3c2eba46ffde73996b3d3
2023-08-19 18:08:08 +00:00
Umherirrender ce0bcb5c82 Use HookHandlers for core hooks
The use of "HookHandlers" attribute in extension.json makes it possible
to inject services into hook handler classes in a future patch.

Bug: T271032
Change-Id: I612c09264b830fe5588aafdad80a9eebaa66d71b
2023-08-14 19:49:09 +02:00
Umherirrender cd565f856a i18n: Split apihelp for parameter prop=extracts&exsectionformat=
Easier to translate
There is no visible change on Special:ApiHelp/query+extracts

Bug: T285545
Change-Id: Ide7650148ea3bbf9fa85fb052090f3b13f1b42c5
2023-08-05 02:30:49 +02:00
gerritbot bdd77c5c06 Update moved class FauxRequest
See T321882. Moved in I832b133aaf61ee

Bug: T321681
Change-Id: I6ce2acc3961db1b9663efd62ebdcf57c905caaf0
2023-05-19 10:25:20 +00:00
jenkins-bot 74baaa7ce1 Merge "Skip <h2> in TOC when extracting first section" 2023-03-20 09:23:31 +00:00
Thiemo Kreuz 60e1c5ad83 Skip <h2> in TOC when extracting first section
This piece of code is only relevant in case when:
- the intro section is requested (either in plaintext or html);
- the parse result for the full page is available in the parser cache;
- the full extract is not available in the TextExtracts WAN cache;
- the intro is also not available in the TextExtracts WAN cache.

In this case getFirstSection() is called with the parser output,
which is different from the convertText() output it is called
with in other code paths, and still contains <h*> tags. A quick
regex is used to extract the first section. This stops at any <h2>.
A TOC also contains a <h2> (which will be removed later via
$wgExtractsRemoveClasses). This one needs to be ignored in case
the TOC is placed before the first section using e.g. the __TOC__
keyword.

The patch changes the regex so it ignores a h2 with
id="mw-toc-heading", but keeps working in plaintext mode when <h*>
tags are not present (the code path when the intro section is
requested, and the full extract is available in the TextExtracts
WAN cache but the intro extract isn't).

Bug: T269967
Change-Id: I0a495d06cf1725744e556e81f17047fb53f53521
2023-03-20 07:40:07 +00:00
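
Illustration for the commit above: a minimal sketch of a "first section" regex that stops at the first <h2> unless that heading carries id="mw-toc-heading". This is written in Python for brevity and is not the extension's PHP code. The pattern and the name get_first_section_sketch are assumptions made for the example.

```python
import re

# Hypothetical pattern, not the one used by the extension:
# capture everything up to the first <h2> whose opening tag does
# NOT contain id="mw-toc-heading".
FIRST_SECTION = re.compile(
    r'^(.*?)(?=<h2\b(?![^>]*\bid="mw-toc-heading"))',
    re.DOTALL,
)


def get_first_section_sketch(text: str) -> str:
    """Return the text before the first non-TOC <h2>.

    If no such heading exists (for example in plaintext mode,
    where <h*> tags are absent), the input is returned unchanged.
    """
    match = FIRST_SECTION.search(text)
    return match.group(1) if match else text
```

With a TOC placed before the first section, the TOC heading is skipped and the extract ends at the first real section heading. Input without any <h2> passes through untouched.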
Umherirrender 35f096417f Use ParserOptions::newFromAnon instead of constructor
This prevents the user language from taking effect on the parse;
it reflects the anon part better than just "new User()".

Change-Id: Ic6a4a81074a16b85ac2f1c7952f27a03a0c76dec
2021-12-18 20:01:03 +01:00
Alexander Vorwerk abcb71cabb ApiQueryExtracts: inject WikiPageFactory
Bug: T297688
Change-Id: I8faa8a14efcd6cb6b247301aa4da0c1abac7c97b
2021-12-14 23:05:40 +01:00
Reedy 20c3f6d447 Replace use of deprecated MWTidy class
Bump required MW to >= 1.36.0

Change-Id: Ida40e6c1d84eec0e51e53f6aa98ac9f09fd52666
2021-10-20 19:33:39 +01:00
libraryupgrader e15f62f4c3 build: Updating dependencies
composer:
* mediawiki/mediawiki-codesniffer: 35.0.0 → 36.0.0
* php-parallel-lint/php-parallel-lint: 1.2.0 → 1.3.0

npm:
* grunt: 1.3.0 → 1.4.0
* lodash: 4.17.19 → 4.17.21
  * https://npmjs.com/advisories/1673 (CVE-2021-23337)

Change-Id: Ie5c8d9cfb856600a32c2bd50d518b4388bde1051
2021-05-14 05:25:02 +00:00
DannyS712 d7f93f5c17 ApiQueryExtracts: remove unneeded factory method
All services can be injected by bumping minimum
version of mediawiki (1.34 would be enough, but
since that is no longer supported require 1.35)

Change-Id: I8bb1573a02932ef5f2871606e94a41afe073fd00
2021-04-01 20:07:30 +00:00
jenkins-bot 40da2c16b9 Merge "Fix API adding ellipsis… when not needed" 2021-03-15 16:17:17 +00:00
jenkins-bot 0262d5409a Merge "Add test for ApiQueryExtracts::truncate()" 2021-03-11 17:57:42 +00:00
Daimona Eaytoy 8755419bfe Stop using deprecated Language methods
Change-Id: I8a4b7c470fdd7aca64667cf5ac93cd5619e7ca06
2021-02-27 15:23:35 +00:00
Thiemo Kreuz 29380b8d27 Fix API adding ellipsis… when not needed
When the text is short enough to be returned as it is, it's very
confusing to see it with an ellipsis added at the end. There is
no more text. It should not look like there is more text.

Change-Id: I7ef205fde6c358a1cbcbb41346a1c9e2a856d8fd
2021-01-08 14:40:06 +01:00
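
Illustration for the commit above: a minimal sketch of the behaviour described, where an ellipsis is appended only when text was actually cut off. It is written in Python and is not the extension's truncate() implementation. The function name, parameters and default ellipsis are assumptions.

```python
def truncate_sketch(text: str, max_chars: int, ellipsis: str = "...") -> str:
    """Cut text to max_chars characters.

    The ellipsis is appended only if something was removed, so a
    short text that fits is returned exactly as it is.
    """
    if len(text) <= max_chars:
        return text
    return text[:max_chars] + ellipsis
```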
Thiemo Kreuz 471fdd0f89 Add test for ApiQueryExtracts::truncate()
Change-Id: Ia39188fe3ff1b87d82b4c573f8d27629e75c0aa4
2021-01-08 09:03:51 +01:00
Thiemo Kreuz ee8d932de2 Fix minor deprecations and incomplete PHPDoc tags
Change-Id: I8c331d269bf5dcd177dd1ab9d5f6d1c83f53e40b
2021-01-08 08:36:47 +01:00
zoranzoki21 392183ccdc Fix all PHPCS excludes
Change-Id: I79b32c6438ec5b73909fe08c48e55eab8b411452
2020-11-05 17:48:53 +01:00
Max Semenik f225c911a7 Reduce the amount of annoying limit warnings
Don't warn if the limit is >1 but there's only 1 title to process
because there's nothing wrong with the current request.

Change-Id: Ia991e58420d31520cb83b24b1183d526fd79edb2
2020-06-26 19:16:37 +00:00
jenkins-bot 47f07178ee Merge "Tidy is no longer configurable in MW 1.35" 2020-06-15 12:31:10 +00:00
Reedy 51c8f66727 Fix PSR12.Properties.ConstantVisibility.NotFound
Bug: T253169
Change-Id: I6512139d76e1f6ae232e8e9484b79a795b520f1c
2020-05-30 01:35:27 +01:00
C. Scott Ananian c1397847c0 Tidy is no longer configurable in MW 1.35
Remove use of deprecated MWTidy::isEnabled() and internal
MWTidy::singleton() methods.  See I3584181070da7ed4888beaaf04e083114aca1eab
for context.

Bug: T198214
Change-Id: I511068cc7b2398773a837f66e08def206cbb5626
2020-05-02 01:31:21 -04:00
Timo Tijhof 9d3ee77a95 tests: Remove PHP 7.4 workaround
Follows-up 955e0bb. Some of the other test cases already did this,
so let's do it here as well.

Change-Id: Ib39b03a38ff0d444568980db39a4d9b1e54618b7
2020-03-13 22:54:52 +00:00
Aaron Schulz e9d466f398 Convert $wgMemc use to WANObjectCache
Bug: T160813
Change-Id: If298927d6b90e1b94e83485e723f13aa2bad0932
2020-03-13 21:07:36 +00:00
Thiemo Kreuz 955e0bb5bb Fix PHP 7.4 compatibility
The way this test is set up means the $this->params property is not
initialized. It is only initialized when execute() is, well, executed.
Since there is not really a guarantee this will always happen before
the failing method is called, I figured it's better to add this cheap
safety check in the production code.

Tagging with T233012 because I believe this extension is a gated one
for many other codebases.

Bug: T233012
Change-Id: Ie0060125cf4646d80f8c88eedd01551f66e3fb89
2020-03-13 20:59:58 +00:00
libraryupgrader 7025be47c1 build: Updating dependencies
composer:
* mediawiki/mediawiki-codesniffer: 28.0.0 → 29.0.0
  The following sniffs are failing and were disabled:
  * MediaWiki.Commenting.FunctionComment.MissingDocumentationPrivate

npm:
* eslint-config-wikimedia: 0.12.0 → 0.15.0
* grunt-eslint: 21.0.0 → 22.0.0

Additional changes:
* Also sorted "composer fix" command to run phpcbf last.
* Removing manual reportUnusedDisableDirectives for eslint.

Change-Id: I351f0a333fd5f06e47f0748aa25cb3fff63cc67f
2020-01-15 09:17:22 +00:00
libraryupgrader 5986441375 build: Updating mediawiki/mediawiki-phan-config to 0.9.0
Don't pass booleans to BagOStuff::makeKey()

Change-Id: I87e42cee60d0adefd94f4bdc7fbbfabc65b7c93e
2019-12-21 23:44:50 -08:00
DannyS712 4c94bec18f Use Special:MyLanguage in API help links
Bug: T231269
Change-Id: I242981f4f7ecbd31fe4052daee8652089f4c6694
2019-08-27 06:44:46 +00:00
Thiemo Kreuz 8de415c4fd Fix truncate code potentially removing whitespace from extract
By turning the (?:…) into (?=…) they become lookaheads and are not
part of the returned string in $tail any more. This is exactly what we
want here. All we want is to *know* if the dot, question or exclamation
mark is followed by a space. But we don't need the space captured.

Change-Id: I4be715c4c084165e5ab25da77609f12ffce4d385
2019-05-03 08:46:29 +02:00
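
Illustration for the commit above: a small sketch of the difference between a non-capturing group (?:…) and a lookahead (?=…). It is written in Python, whose regex syntax matches PCRE for these constructs. The patterns are simplified assumptions and not the ones used in the extension.

```python
import re

# Simplified, hypothetical sentence-end patterns.
CONSUMING = re.compile(r"[^.!?]*[.!?](?:\s)")  # whitespace is part of the match
LOOKAHEAD = re.compile(r"[^.!?]*[.!?](?=\s)")  # whitespace is only checked


def first_sentence(pattern: re.Pattern, text: str) -> str:
    """Return the leading sentence matched by pattern, or '' if none."""
    match = pattern.match(text)
    return match.group(0) if match else ""


with_group = first_sentence(CONSUMING, "One. Two")      # "One. "
with_lookahead = first_sentence(LOOKAHEAD, "One. Two")  # "One."
```

Both patterns require whitespace after the punctuation mark, but only the non-capturing group includes that whitespace in the matched string.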
Thiemo Kreuz 81fd92685a Move Tidy functionality to TextTruncator
I argue that the code fixing unclosed HTML tags is – even if optional –
an integral part of the code that potentially breaks these HTML tags in
the first place. Notice how much code disappears in the ApiQueryExtracts
class.

Additionally, the new approach uses an interface instead of a static
function call that is impossible to mock and hard to test.

Change-Id: Ic1a65995f4dba11d060a8738d642905cbfc79271
2019-05-03 08:46:27 +02:00
jenkins-bot 8567e067f2 Merge "Extract unrelated static code from ExtractFormatter" 2019-05-02 21:03:58 +00:00
Thiemo Kreuz 8d3ff14a93 Consistently mention the @license in all files
Note how only two files mentioned the license before. For consistency
it should be either all or none. Both solutions would be possible. Even
*not* mentioning the license anywhere in these files would be fine from
a legal perspective, as long as the relevant file COPYING is still
there in the root folder of this extension.

The overly long "deed" text does not serve much of a purpose. It's not a
complete, legally relevant license text. It's hard to read as the fact
this is "GPL2+" is surprisingly hard to find. The @license tag solves
these problems, and is recognized by documentation generators.

Change-Id: I7844be0c5f4f3d7562156cd9f34fe466552a9c9d
2019-04-24 18:26:53 +02:00
Thiemo Kreuz a0d37fcb51 Extract unrelated static code from ExtractFormatter
This is a straightforward baseline patch that does nothing but move
existing code around, without touching it. I'm not even trying to
remove the "static" keyword. The actual refactoring will be done in
the next patch. I hope with this the changes I do in the refactoring
become more visible and much easier to review.

Change-Id: Idba859ec0c24f3622ea8fb8d7a9b11843d1e3827
2019-03-21 12:38:13 +00:00
Thiemo Kreuz 6a082f1764 Inline nested callback functions
This gets rid of code that is reported as being unused, even if it is
used.

This also simplifies the regular expression a little bit. The .
automatically ends at the end of the line when the mode /s is *not* set,
which it isn't. The /m mode is not needed then because there is no ^ or
$ any more in the regular expression.

Note this code is sufficiently covered by a test (one I wrote just a
few days ago).

Change-Id: I8eb57e308bb2b281e0e72499b4d46f93a4dfa5f4
2019-03-19 18:50:03 +01:00
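
Illustration for the commit above: a short sketch of why "." stops at the end of the line when the /s mode is not set. It is written in Python, where re.DOTALL corresponds to PCRE's /s and re.MULTILINE to /m. The sample text and pattern are assumptions for the example.

```python
import re

SAMPLE = "a #one\nb #two\nc"

# Without DOTALL the dot does not match "\n", so each match ends at the
# end of its line even though the pattern has no ^ or $ anchors and
# MULTILINE is not needed.
per_line = re.findall(r"#.*", SAMPLE)

# With DOTALL the dot also matches "\n" and the first match runs on to
# the end of the string.
across_lines = re.findall(r"#.*", SAMPLE, re.DOTALL)
```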
Max Semenik b9a21cc865 Better way to detect if Tidy is on
Change-Id: Ie9723ef50d9a472605da92faabced3d852ec9387
2019-03-17 18:19:34 -07:00
Max Semenik 1bdb8410ac Get rid of useless ApiQueryExtracts->parserOptions
Change-Id: I084116998ba758c90f14a370c24d098cbd6cdc28
2019-03-17 18:02:11 -07:00
Max Semenik 0215eae3aa Make ExtractFormatter not depend on configuration
Change-Id: I4e9a0947bf50d062ea28004bde30d2e8b18788a4
2019-03-17 18:02:09 -07:00
Max Semenik 1017e3ab72 Remove compat with old MW
Change-Id: Ic5c44414b49e434a8c46ba3dca01eebe9e0f1d3c
2019-03-17 13:47:38 -07:00
Fomafix 375f6d3574 Use PHP7 syntax features
* Use the ?? operator.
* Use "\u{00A0}" instead of "\xC2\xA0".

Also increase the minimum required MediaWiki version from 1.30 to 1.31
because 1.31 requires PHP7.

Change-Id: Ic5c279976f50b381cec65e74b7cc821a210c2173
2019-02-02 21:58:54 +01:00
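
Illustration for the commit above: a quick check, in Python rather than PHP, that the code point U+00A0 (no-break space) is the two bytes C2 A0 in UTF-8. That is why the "\u{00A0}" escape can replace "\xC2\xA0" in a UTF-8 PHP string. The variable names are assumptions.

```python
# U+00A0 NO-BREAK SPACE written as a code point escape.
NBSP = "\u00a0"

# Its UTF-8 encoding is the byte pair the older literal spelled out.
nbsp_utf8 = NBSP.encode("utf-8")

# Decoding the raw bytes gives back the same single character.
round_trip = b"\xc2\xa0".decode("utf-8")
```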
Umherirrender 832d0ff745 Remove use of deprecated UsageException
Deprecated since 1.29, extension required 1.29

Change-Id: I2f550f0b94571afc289af616645b822d63fea4d3
2018-08-21 22:23:49 +02:00
Umherirrender add3e27461 User constructor does not take an argument
User object without argument is the anon default

Change-Id: I2c47c4865386d59f14eb6390b3e12fb9c5198ccd
2018-08-06 23:12:26 +02:00
Thiemo Kreuz 60cd40b975 Remove not needed count() and "return true" from hook handlers
This patch fixes two style issues I could not separate:

* Hook handler functions do not need to return true. This is the default
anyway, and meaningless.

* Counting is possibly expensive and not needed when all we need to know
is if an array is empty or not.

Change-Id: I460776c981638806a606d9bf88fc8579d6da8c0e
2018-06-28 20:45:20 +02:00
Umherirrender 618aef40a0 Remove @return from __construct
Change-Id: I2000dc076c869620533368431f6c55241fbc92e8
2018-04-05 12:18:59 +02:00
jdlrobson d69b35f4bc Adjust expectations for API consumers when using the TextExtracts API
Bug: T170617
Change-Id: I53e08db40e5319019c842869f992bac32b1dac97
2018-03-20 09:42:10 -07:00
Gergő Tisza be8a5d6ea3 Bump cache version due to 'unwrap' ParserOutput option
Bug: T186927
Change-Id: I078f71d99f3179af5f4f85892472932eb5635fe1
2018-02-09 14:15:30 -08:00
Brad Jorsch 63a1d82b4e Use 'unwrap' post-cache transform instead of setWrapOutputClass( false ), when available
To reduce parser cache fragmentation, core is deprecating
$parserOptions->setWrapOutputClass( false ) in favor of
$parserOutput->getText( [ 'unwrap' => true ] );

Change-Id: Ibc013a41f4a463f4014fbbce7ce27f8690161728
2017-12-22 13:43:44 -05:00
Pppery f6fd9273c5 Re-enable MediaWiki.Commenting.FunctionComment.MissingDocumentationPublic sniff
Bug: T170580
Change-Id: I0a0055f1de57f15a45c21e2f51ed275a2b249440
2017-11-30 15:31:55 -05:00
jenkins-bot 95dc34e4c7 Merge "Re-enable MediaWiki.WhiteSpace.SpaceBeforeSingleLineComment.NewLineComment sniff" 2017-11-30 19:11:09 +00:00
Pppery d05f289032 Re-enable MediaWiki.WhiteSpace.SpaceBeforeSingleLineComment.NewLineComment sniff
Bug: T170580
Change-Id: Ib5bcab3414f44013cf57c0d006b212dea175473a
2017-11-29 23:07:30 -05:00