Commit graph

112 commits

Author SHA1 Message Date
jenkins-bot 23791622e0 Merge "Use extension namespace for TextExtracts" 2024-09-12 16:05:21 +00:00
jenkins-bot 173012d395 Merge "ExtractFormatter: Rescue headings from being removed" 2024-07-16 15:05:46 +00:00
Bartosz Dziewoński 0fafa44a20 ExtractFormatter: Rescue headings from being removed
Bug: T363445
Change-Id: I662fe3dd06d6b010108c6f0ef891a8c6113b9a45
2024-07-16 15:02:03 +00:00
alistair3149 cc8cf471fc Use extension namespace for TextExtracts
Change-Id: I01177e9bef0f25b6245ee3e93f605dc771642273
2024-07-12 18:46:33 -04:00
jenkins-bot 4e9b6273f4 Merge "Add extracts to REST search as description" 2024-07-12 17:41:19 +00:00
alistair3149 fe3982c204 Add extracts to REST search as description
Add the config $wgExtractsExtendRestSearch to enable adding the extracts
to REST search as description. It is disabled by default.

Change-Id: I4335f161856b3a4035333de6eb4f547745480d91
2024-07-12 07:10:30 +00:00
Umherirrender 290be2e8de Replace deprecated ApiPageSet::getGoodTitles
PageIdentity does not have inNamespace() (it is from LinkTarget)
PageIdentity does not have getContentModel(), use the WikiPage instead
Inject a TitleFormatter to get the prefixed title text

Bug: T339384
Change-Id: I0029e718f20ca01ee3cd13ada8be04a16480d51d
2024-07-05 22:05:55 +02:00
Bartosz Dziewoński 44a3c538e1 ApiQueryExtracts: Replace custom parsing logic with ParserOutputAccess
Change-Id: I853617651867044cbe2624857ba08753cce332a5
2024-06-20 23:03:39 +02:00
Umherirrender 74bdf0a7dd build: Upgrade mediawiki/mediawiki-codesniffer to v43.0.0
Change-Id: I7a3887f4fac7c4e78e0828fca52d0c355357e5b1
2024-03-12 20:22:40 +01:00
Umherirrender 90040f278a Use namespaced classes
Changes to the use statements done automatically via script
Addition of missing use statement done manually

Change-Id: I45053ede3898b5c886a8182659466d321126ac1f
2024-01-04 22:22:49 +01:00
Reedy 47172479bb ExtractFormatter: Update for HtmlFormatter 4.0.0
Bug: T330528
Depends-On: Id785cfd2e00762ca6c1ea80200dd4a3d197640f2
Change-Id: I496d84fab3ee8feee5891ca984f37e90837d857c
2023-08-22 20:50:03 +00:00
gerritbot 71e31619f2 Replace some moved Title class uses, now MediaWiki\Title\Title
Bug: T321681
Change-Id: I09a9c9f6294c9988feb3c2eba46ffde73996b3d3
2023-08-19 18:08:08 +00:00
Umherirrender ce0bcb5c82 Use HookHandlers for core hooks
The use of "HookHandlers" attribute in extension.json makes it possible
to inject services into hook handler classes in a future patch.

Bug: T271032
Change-Id: I612c09264b830fe5588aafdad80a9eebaa66d71b
2023-08-14 19:49:09 +02:00
Umherirrender cd565f856a i18n: Split apihelp for parameter prop=extracts&exsectionformat=
Easier to translate
There is no visible change on Special:ApiHelp/query+extracts

Bug: T285545
Change-Id: Ide7650148ea3bbf9fa85fb052090f3b13f1b42c5
2023-08-05 02:30:49 +02:00
gerritbot bdd77c5c06 Update moved class FauxRequest
See T321882. Moved in I832b133aaf61ee

Bug: T321681
Change-Id: I6ce2acc3961db1b9663efd62ebdcf57c905caaf0
2023-05-19 10:25:20 +00:00
jenkins-bot 74baaa7ce1 Merge "Skip <h2> in TOC when extracting first section" 2023-03-20 09:23:31 +00:00
Thiemo Kreuz 60e1c5ad83 Skip <h2> in TOC when extracting first section
This piece of code is only relevant when:
- the intro section is requested (either in plaintext or html);
- the parse result for the full page is available in the parser cache;
- the full extract is not available in the TextExtracts WAN cache;
- the intro is also not available in the TextExtracts WAN cache.

In this case getFirstSection() is called with the parser output,
which is different from the convertText() output it is called
with in other code paths, and still contains <h*> tags. A quick
regex is used to extract the first section. This stops at any <h2>.
A TOC also contains a <h2> (which will be removed later via
$wgExtractsRemoveClasses). This one needs to be ignored in case
the TOC is placed before the first section using e.g. the __TOC__
keyword.

The patch changes the regex so it ignores a h2 with
id="mw-toc-heading", but keeps working in plaintext mode when <h*>
tags are not present (the code path when the intro section is
requested, and the full extract is available in the TextExtracts
WAN cache but the intro extract isn't).

Bug: T269967
Change-Id: I0a495d06cf1725744e556e81f17047fb53f53521
2023-03-20 07:40:07 +00:00
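
The regex change described in 60e1c5ad83 can be sketched roughly as follows. This is a Python illustration, not the extension's PHP code: the pattern and the `first_section` helper are hypothetical and only show the idea of stopping at the first <h2> that is not the TOC heading, while still returning the whole text when no heading tags are present.

```python
import re

# Hypothetical pattern (not the one used by TextExtracts): capture everything
# up to the first <h2> whose opening tag does not carry id="mw-toc-heading".
# If no such <h2> exists (e.g. plaintext input), capture the whole string.
FIRST_SECTION = re.compile(
    r'^(.*?)(?=<h2\b(?![^>]*\bid="mw-toc-heading")|\Z)',
    re.DOTALL,
)


def first_section(text):
    """Return the part of `text` that precedes the first non-TOC <h2>."""
    return FIRST_SECTION.match(text).group(1)


html = (
    'Intro A.<div id="toc"><h2 id="mw-toc-heading">Contents</h2></div>'
    'Intro B.<h2>History</h2>Body'
)
print(first_section(html))
print(first_section("Plain text without headings."))
```

In this sketch the TOC markup stays inside the returned intro; per the commit message, the TOC itself is removed later via $wgExtractsRemoveClasses.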
Umherirrender 35f096417f Use ParserOptions::newFromAnon instead of constructor
This prevents the user language from taking effect on the parse;
it reflects the anon part better than just "new User()".

Change-Id: Ic6a4a81074a16b85ac2f1c7952f27a03a0c76dec
2021-12-18 20:01:03 +01:00
Alexander Vorwerk abcb71cabb ApiQueryExtracts: inject WikiPageFactory
Bug: T297688
Change-Id: I8faa8a14efcd6cb6b247301aa4da0c1abac7c97b
2021-12-14 23:05:40 +01:00
Reedy 20c3f6d447 Replace use of deprecated MWTidy class
Bump required MW to >= 1.36.0

Change-Id: Ida40e6c1d84eec0e51e53f6aa98ac9f09fd52666
2021-10-20 19:33:39 +01:00
libraryupgrader e15f62f4c3 build: Updating dependencies
composer:
* mediawiki/mediawiki-codesniffer: 35.0.0 → 36.0.0
* php-parallel-lint/php-parallel-lint: 1.2.0 → 1.3.0

npm:
* grunt: 1.3.0 → 1.4.0
* lodash: 4.17.19 → 4.17.21
  * https://npmjs.com/advisories/1673 (CVE-2021-23337)

Change-Id: Ie5c8d9cfb856600a32c2bd50d518b4388bde1051
2021-05-14 05:25:02 +00:00
DannyS712 d7f93f5c17 ApiQueryExtracts: remove unneeded factory method
All services can be injected by bumping minimum
version of mediawiki (1.34 would be enough, but
since that is no longer supported require 1.35)

Change-Id: I8bb1573a02932ef5f2871606e94a41afe073fd00
2021-04-01 20:07:30 +00:00
jenkins-bot 40da2c16b9 Merge "Fix API adding ellipsis… when not needed" 2021-03-15 16:17:17 +00:00
jenkins-bot 0262d5409a Merge "Add test for ApiQueryExtracts::truncate()" 2021-03-11 17:57:42 +00:00
Daimona Eaytoy 8755419bfe Stop using deprecated Language methods
Change-Id: I8a4b7c470fdd7aca64667cf5ac93cd5619e7ca06
2021-02-27 15:23:35 +00:00
Thiemo Kreuz 29380b8d27 Fix API adding ellipsis… when not needed
When the text is short enough to be returned as it is, it's very
confusing to see it with an ellipsis added at the end. There is
no more text. It should not look like there is more text.

Change-Id: I7ef205fde6c358a1cbcbb41346a1c9e2a856d8fd
2021-01-08 14:40:06 +01:00
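
The behaviour described in 29380b8d27 can be illustrated with a minimal sketch. The `truncate` helper below is hypothetical and written in Python; it is not ApiQueryExtracts::truncate(), and it only shows the rule that an ellipsis is appended when text was actually cut off.

```python
def truncate(text, max_chars, ellipsis="..."):
    """Hypothetical helper: cut `text` to `max_chars` characters.

    The ellipsis is appended only when something was removed, so short
    text is returned unchanged and does not look truncated.
    """
    if len(text) <= max_chars:
        return text
    return text[:max_chars] + ellipsis


print(truncate("Short.", 20))
print(truncate("A longer piece of text", 8))
```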
Thiemo Kreuz 471fdd0f89 Add test for ApiQueryExtracts::truncate()
Change-Id: Ia39188fe3ff1b87d82b4c573f8d27629e75c0aa4
2021-01-08 09:03:51 +01:00
Thiemo Kreuz ee8d932de2 Fix minor deprecations and incomplete PHPDoc tags
Change-Id: I8c331d269bf5dcd177dd1ab9d5f6d1c83f53e40b
2021-01-08 08:36:47 +01:00
zoranzoki21 392183ccdc Fix all PHPCS excludes
Change-Id: I79b32c6438ec5b73909fe08c48e55eab8b411452
2020-11-05 17:48:53 +01:00
Max Semenik f225c911a7 Reduce the amount of annoying limit warnings
Don't warn if the limit is >1 but there's only 1 title to process
because there's nothing wrong with the current request.

Change-Id: Ia991e58420d31520cb83b24b1183d526fd79edb2
2020-06-26 19:16:37 +00:00
jenkins-bot 47f07178ee Merge "Tidy is no longer configurable in MW 1.35" 2020-06-15 12:31:10 +00:00
Reedy 51c8f66727 Fix PSR12.Properties.ConstantVisibility.NotFound
Bug: T253169
Change-Id: I6512139d76e1f6ae232e8e9484b79a795b520f1c
2020-05-30 01:35:27 +01:00
C. Scott Ananian c1397847c0 Tidy is no longer configurable in MW 1.35
Remove use of deprecated MWTidy::isEnabled() and internal
MWTidy::singleton() methods.  See I3584181070da7ed4888beaaf04e083114aca1eab
for context.

Bug: T198214
Change-Id: I511068cc7b2398773a837f66e08def206cbb5626
2020-05-02 01:31:21 -04:00
Timo Tijhof 9d3ee77a95 tests: Remove PHP 7.4 workaround
Follows-up 955e0bb. Some of the other test cases already did this,
so let's do it here as well.

Change-Id: Ib39b03a38ff0d444568980db39a4d9b1e54618b7
2020-03-13 22:54:52 +00:00
Aaron Schulz e9d466f398 Convert $wgMemc use to WANObjectCache
Bug: T160813
Change-Id: If298927d6b90e1b94e83485e723f13aa2bad0932
2020-03-13 21:07:36 +00:00
Thiemo Kreuz 955e0bb5bb Fix PHP 7.4 compatibility
The way this test is set up means the $this->params property is not
initialized. It is only initialized when execute() is, well, executed.
Since there is not really a guarantee this will always happen before
the failing method is called, I figured it's better to add this cheap
safety check in the production code.

Tagging with T233012 because I believe this extension is a gated one
for many other codebases.

Bug: T233012
Change-Id: Ie0060125cf4646d80f8c88eedd01551f66e3fb89
2020-03-13 20:59:58 +00:00
libraryupgrader 7025be47c1 build: Updating dependencies
composer:
* mediawiki/mediawiki-codesniffer: 28.0.0 → 29.0.0
  The following sniffs are failing and were disabled:
  * MediaWiki.Commenting.FunctionComment.MissingDocumentationPrivate

npm:
* eslint-config-wikimedia: 0.12.0 → 0.15.0
* grunt-eslint: 21.0.0 → 22.0.0

Additional changes:
* Also sorted "composer fix" command to run phpcbf last.
* Removing manual reportUnusedDisableDirectives for eslint.

Change-Id: I351f0a333fd5f06e47f0748aa25cb3fff63cc67f
2020-01-15 09:17:22 +00:00
libraryupgrader 5986441375 build: Updating mediawiki/mediawiki-phan-config to 0.9.0
Don't pass booleans to BagOStuff::makeKey()

Change-Id: I87e42cee60d0adefd94f4bdc7fbbfabc65b7c93e
2019-12-21 23:44:50 -08:00
DannyS712 4c94bec18f Use Special:MyLanguage in API help links
Bug: T231269
Change-Id: I242981f4f7ecbd31fe4052daee8652089f4c6694
2019-08-27 06:44:46 +00:00
Thiemo Kreuz 8de415c4fd Fix truncate code potentially removing whitespace from extract
By turning the (?:…) into (?=…) they become lookaheads and are not
part of the returned string in $tail any more. This is exactly what we
want here. All we want is to *know* if the dot, question or exclamation
mark is followed by a space. But we don't need the space captured.

Change-Id: I4be715c4c084165e5ab25da77609f12ffce4d385
2019-05-03 08:46:29 +02:00
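
The difference between (?:…) and (?=…) mentioned in 8de415c4fd can be shown with a small Python sketch. The patterns here are simplified stand-ins chosen for illustration, not the expressions used in the extension; the group semantics are the same in PCRE and in Python's re module.

```python
import re

text = "First sentence. Second sentence"

# Non-capturing group: the whitespace after the punctuation is consumed
# and therefore becomes part of the matched string.
consuming = re.search(r"[.?!](?:\s)", text)

# Lookahead: the whitespace is only checked for, not consumed, so the
# match ends at the punctuation mark itself.
lookahead = re.search(r"[.?!](?=\s)", text)

print(repr(consuming.group(0)))  # '. '
print(repr(lookahead.group(0)))  # '.'
```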
Thiemo Kreuz 81fd92685a Move Tidy functionality to TextTruncator
I argue that the code fixing unclosed HTML tags is – even if optional –
an integral part of the code that potentially breaks these HTML tags in
the first place. Notice how much code disappears in the ApiQueryExtracts
class.

Additionally, the new approach uses an interface instead of a static
function call that is impossible to mock and hard to test.

Change-Id: Ic1a65995f4dba11d060a8738d642905cbfc79271
2019-05-03 08:46:27 +02:00
jenkins-bot 8567e067f2 Merge "Extract unrelated static code from ExtractFormatter" 2019-05-02 21:03:58 +00:00
Thiemo Kreuz 8d3ff14a93 Consistently mention the @license in all files
Note how only two files mentioned the license before. For consistency
it should be either all or none. Both solutions would be possible. Even
*not* mentioning the license anywhere in these files would be fine from
a legal perspective, as long as the relevant file COPYING is still
there in the root folder of this extension.

The overly long "deed" text does not serve much of a purpose. It's not a
complete, legally relevant license text. It's hard to read as the fact
this is "GPL2+" is surprisingly hard to find. The @license tag solves
these problems, and is recognized by documentation generators.

Change-Id: I7844be0c5f4f3d7562156cd9f34fe466552a9c9d
2019-04-24 18:26:53 +02:00
Thiemo Kreuz a0d37fcb51 Extract unrelated static code from ExtractFormatter
This is a straightforward baseline patch that does nothing but moving
existing code around, without touching it. I'm not even trying to
remove the "static" keyword. The actual refactoring will be done in
the next patch. I hope with this the changes I do in the refactoring
become more visible and much easier to review.

Change-Id: Idba859ec0c24f3622ea8fb8d7a9b11843d1e3827
2019-03-21 12:38:13 +00:00
Thiemo Kreuz 6a082f1764 Inline nested callback functions
This gets rid of code that is reported as being unused, even if it is
used.

This also simplifies the regular expression a little bit. The .
automatically ends at the end of the line when the mode /s is *not* set,
which it isn't. The /m mode is not needed then because there is no ^ or
$ any more in the regular expression.

Note this code is sufficiently covered by a test (one I wrote just a
few days ago).

Change-Id: I8eb57e308bb2b281e0e72499b4d46f93a4dfa5f4
2019-03-19 18:50:03 +01:00
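
The point about the . metacharacter in 6a082f1764 can be checked with a short Python sketch. The sample text and pattern are made up for illustration; Python's re.DOTALL flag plays the role of the /s mode here.

```python
import re

text = "== Heading ==\nBody text"

# Without the "dot matches newline" mode, .* stops at the end of the line,
# so no ^ or $ anchors (and no multiline mode) are needed to limit it.
line_only = re.search(r"==.*", text)

# With the mode enabled, .* runs across the newline to the end of the string.
across_lines = re.search(r"==.*", text, re.DOTALL)

print(repr(line_only.group(0)))
print(repr(across_lines.group(0)))
```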
Max Semenik b9a21cc865 Better way to detect if Tidy is on
Change-Id: Ie9723ef50d9a472605da92faabced3d852ec9387
2019-03-17 18:19:34 -07:00
Max Semenik 1bdb8410ac Get rid of useless ApiQueryExtracts->parserOptions
Change-Id: I084116998ba758c90f14a370c24d098cbd6cdc28
2019-03-17 18:02:11 -07:00
Max Semenik 0215eae3aa Make ExtractFormatter not depend on configuration
Change-Id: I4e9a0947bf50d062ea28004bde30d2e8b18788a4
2019-03-17 18:02:09 -07:00
Max Semenik 1017e3ab72 Remove compat with old MW
Change-Id: Ic5c44414b49e434a8c46ba3dca01eebe9e0f1d3c
2019-03-17 13:47:38 -07:00
Fomafix 375f6d3574 Use PHP7 syntax features
* Use the ?? operator.
* Use "\u{00A0}" instead of "\xC2\xA0".

Also increase the minimum required MediaWiki version from 1.30 to 1.31
because 1.31 requires PHP7.

Change-Id: Ic5c279976f50b381cec65e74b7cc821a210c2173
2019-02-02 21:58:54 +01:00
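
The two spellings of the no-break space mentioned in 375f6d3574 denote the same UTF-8 bytes. The Python sketch below verifies that equivalence; it uses Python's own escape syntax ("\u00A0") rather than PHP's "\u{00A0}".

```python
# U+00A0 (NO-BREAK SPACE) written as a code point escape.
nbsp = "\u00A0"

# Its UTF-8 encoding is the two-byte sequence C2 A0, which is what the
# older "\xC2\xA0" byte-escape spelling wrote out by hand.
encoded = nbsp.encode("utf-8")

print(encoded)
```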