31 commits

Author SHA1 Message Date
Reedy 47172479bb ExtractFormatter: Update for HtmlFormatter 4.0.0
Bug: T330528
Depends-On: Id785cfd2e00762ca6c1ea80200dd4a3d197640f2
Change-Id: I496d84fab3ee8feee5891ca984f37e90837d857c
2023-08-22 20:50:03 +00:00
Reedy 51c8f66727 Fix PSR12.Properties.ConstantVisibility.NotFound
Bug: T253169
Change-Id: I6512139d76e1f6ae232e8e9484b79a795b520f1c
2020-05-30 01:35:27 +01:00
Thiemo Kreuz 81fd92685a Move Tidy functionality to TextTruncator
I argue that the code fixing unclosed HTML tags is – even if optional –
an integral part of the code that potentially breaks these HTML tags in
the first place. Notice how much code disappears in the ApiQueryExtracts
class.

Additionally, the new approach uses an interface instead of a static
function call that is impossible to mock and hard to test.

Change-Id: Ic1a65995f4dba11d060a8738d642905cbfc79271
2019-05-03 08:46:27 +02:00
jenkins-bot 8567e067f2 Merge "Extract unrelated static code from ExtractFormatter" 2019-05-02 21:03:58 +00:00
Thiemo Kreuz 8d3ff14a93 Consistently mention the @license in all files
Note how only two files mentioned the license before. For consistency
it should be either all or none. Both solutions would be possible. Even
*not* mentioning the license anywhere in these files would be fine from
a legal perspective, as long as the relevant file COPYING is still
there in the root folder of this extension.

The overly long "deed" text does not serve much of a purpose. It's not a
complete, legally relevant license text. It's hard to read as the fact
this is "GPL2+" is surprisingly hard to find. The @license tag solves
these problems, and is recognized by documentation generators.

Change-Id: I7844be0c5f4f3d7562156cd9f34fe466552a9c9d
2019-04-24 18:26:53 +02:00
Thiemo Kreuz a0d37fcb51 Extract unrelated static code from ExtractFormatter
This is a straightforward baseline patch that does nothing but move
existing code around, without touching it. I'm not even trying to
remove the "static" keyword. The actual refactoring will be done in
the next patch. I hope this makes the changes in the refactoring
more visible and much easier to review.

Change-Id: Idba859ec0c24f3622ea8fb8d7a9b11843d1e3827
2019-03-21 12:38:13 +00:00
Max Semenik 0215eae3aa Make ExtractFormatter not depend on configuration
Change-Id: I4e9a0947bf50d062ea28004bde30d2e8b18788a4
2019-03-17 18:02:09 -07:00
Fomafix 375f6d3574 Use PHP7 syntax features
* Use the ?? operator.
* Use "\u{00A0}" instead of "\xC2\xA0".

Also increase the minimum required MediaWiki version from 1.30 to 1.31
because 1.31 requires PHP7.

Change-Id: Ic5c279976f50b381cec65e74b7cc821a210c2173
2019-02-02 21:58:54 +01:00
Pppery f6fd9273c5 Re-enable MediaWiki.Commenting.FunctionComment.MissingDocumentationPublic sniff
Bug: T170580
Change-Id: I0a0055f1de57f15a45c21e2f51ed275a2b249440
2017-11-30 15:31:55 -05:00
jenkins-bot 95dc34e4c7 Merge "Re-enable MediaWiki.WhiteSpace.SpaceBeforeSingleLineComment.NewLineComment sniff" 2017-11-30 19:11:09 +00:00
Pppery d05f289032 Re-enable MediaWiki.WhiteSpace.SpaceBeforeSingleLineComment.NewLineComment sniff
Bug: T170580
Change-Id: Ib5bcab3414f44013cf57c0d006b212dea175473a
2017-11-29 23:07:30 -05:00
Pppery 009765a04c Re-enable MediaWiki.Commenting.FunctionComment.MissingParamComment sniff
Also renames $action to $name in APIQueryExtracts.php, because trying to
document the parameter revealed that "action" doesn't match the use of
the parameter.

Bug: T170580
Change-Id: I1b7f3f0e17b118ea9bcfd28c69321aa692aad4e3
2017-11-29 21:56:29 -05:00
libraryupgrader 7e548ce1b4 build: Updating mediawiki/mediawiki-codesniffer to 0.12.0
Change-Id: I3e1260e19de4a12c995b51a1a4416dbdf87829cf
2017-09-01 04:58:00 +00:00
Piotr Miazga 91bbe7b10a Hygiene: Remove deprecation and unused import
Changes:
 - ApiBase::setWarning() is deprecated, use addWarning() instead
 - ParserCache::singleton() is deprecated, use MediaWikiServices instead
 - Exception import is not used, drop it
 - added MediaWiki 1.29 as a requirement

Bug: T166714
Change-Id: Ib81e5acbb28e1f803c7a792b9f990f2aa6d57521
2017-08-02 16:32:13 +02:00
Max Semenik 21ef48483f getFirstSentences(): don't use crazy regexes
Bug: T145231
Change-Id: I820fb152e86b273ddeba1617658a13e3a3f0bae3
2017-01-20 10:13:46 -08:00
Max Semenik fb2c163345 Uncomment and fix a test
Change-Id: I57facf073dd688f57f35a18015a0aa14b7b7f4c4
2017-01-19 16:16:35 -08:00
Max Semenik abb0f4df96 getFirstChars(): don't use quantifiers with user-supplied count
Bug: T143178
Change-Id: Iba6d929156040f5388461aaf075644d8fbf647be
2017-01-10 17:42:14 -08:00
Max Semenik 264f65215b Minor fixes
* Annotations
* Deprecated functions
* Namespace tests

Change-Id: I521f6af6074a454cec5322ab4cd46db08350c2c3
2016-09-22 18:51:12 -07:00
Max Semenik 754c9e4f19 CodeSniffer fixes
Change-Id: I8bdcd2250bd3163fe40ce4685eb04bffe53afdca
2016-09-22 18:38:27 -07:00
jenkins-bot fbe7379738 Merge "The last sentence of the paragraph was lost." 2016-04-14 00:00:21 +00:00
Max Semenik 9bc33683a0 Switch to librarized HtmlFormatter
Bug: T125001
Change-Id: Iac73553ac4b03e75ef321c6a659ece1ac155260b
2016-04-12 21:23:01 -07:00
Sergey Leschina ae7fe951f1 The last sentence of the paragraph was lost.
Change-Id: I963ca71b73dc7396156e8b5fcf5d2952e4abbc05
2016-04-11 02:08:14 +03:00
Sergey Leschina 472d84c9de Fix separation of text into sentences.
Some space characters like &nbsp; or &thinsp; usually do not indicate the end of a sentence, so they shouldn't be used as separators.

Bug: T115817
Change-Id: Ieb56b0ef723dd299f848ea88b66613d92977bef0
2016-04-01 10:49:17 +03:00
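
The idea in the commit above (non-breaking and thin spaces should not count as sentence separators) can be sketched as follows. This is a Python illustration only, not the extension's PHP code; the names `ORDINARY_SPACE`, `BOUNDARY` and `split_first_sentence` are hypothetical, and the choice of which characters count as "ordinary" whitespace is an assumption.

```python
import re

# Whitespace that may follow a sentence terminator (assumed set).
# U+00A0 (no-break space) and U+2009 (thin space) are deliberately left out,
# so "p.<NBSP>42" is not treated as a sentence boundary.
ORDINARY_SPACE = " \t\r\n"
BOUNDARY = re.compile(r"[.!?](?:[%s]+|$)" % ORDINARY_SPACE)


def split_first_sentence(text):
    """Return the first sentence of text, or text itself if no boundary is found."""
    match = BOUNDARY.search(text)
    if match is None:
        return text
    # Drop the trailing ordinary whitespace that was part of the boundary.
    return text[:match.end()].rstrip(ORDINARY_SPACE)
```

Note that Python's `\s` matches U+00A0 and U+2009 in `str` patterns, which is why the sketch spells out the allowed characters instead of using `\s`.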
Sumit Asthana 13d6592978 TextExtracts do not crop after initials
Disables sentence termination at a full stop preceded by a capital
letter, which is likely to be an initial.

Bug: T115795
Change-Id: Ibf38e87823155c704ffb106642944cbd05e3f632
2015-12-03 07:11:36 +05:30
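
A minimal sketch of the rule described in the commit above, written in Python rather than the extension's PHP. It takes the description literally (a full stop directly preceded by a capital letter does not end a sentence); the actual pattern used by the extension may differ, and `SENTENCE_END` and `first_sentence_skipping_initials` are hypothetical names.

```python
import re

# A full stop ends a sentence only when the character before it is not an
# uppercase ASCII letter, and it is followed by whitespace or end of text.
SENTENCE_END = re.compile(r"(?<![A-Z])\.(?:\s+|$)")


def first_sentence_skipping_initials(text):
    """Return the first sentence, not breaking after initials such as 'J.'."""
    match = SENTENCE_END.search(text)
    if match is None:
        return text
    return text[:match.end()].strip()
```

A side effect of this literal reading is that a sentence ending in an all-caps abbreviation (for example "NASA.") is not treated as finished either.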
Sumit Asthana d83ac976e3 TextExtracts allow sentence end with numbers
Allows sentences to end with numbers before a full stop in query
extractsentences.

Also added some more unit tests.

Bug: T118621
Change-Id: I9cbf487601d4165b490696d38d5fcbcf6d8f4637
2015-11-18 20:11:20 -06:00
csteipp 97495d1ff3 Ensure sentences is an int
In the spirit of escaping as close to the output as possible, ensure
that the number of sentences is an integer before using it in a regex.
Just in case someone changes the api's param definition.

Change-Id: I406d6ed365ecd53bd8f56a09218a7e1403fe0fa9
2015-03-24 12:54:54 -07:00
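
The defensive step described in the commit above (coerce the sentence count to an integer right where it is interpolated into a regex) might look like this. It is a Python sketch under stated assumptions, not the extension's PHP code: `first_sentences` is a hypothetical name and the sentence pattern is deliberately naive.

```python
import re


def first_sentences(text, sentences):
    """Return the first `sentences` sentences of text (naive splitter)."""
    # Coerce immediately before use, so a non-integer value can never
    # reach the pattern even if the caller's validation changes later.
    count = int(sentences)
    if count < 1:
        return ""
    # Naive sentence: a run of non-terminators followed by ., ! or ?
    pattern = r"(?:[^.!?]*[.!?]\s*){1,%d}" % count
    match = re.match(pattern, text)
    return match.group(0).strip() if match else ""
```

Passing something like `"2}|.*"` raises `ValueError` in `int()` instead of silently altering the pattern.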
Chad Horohoe d9869ef8d0 Remove obvious function-level profiling
Change-Id: I0c272eb337566eff28d46d198c9aa065ffdbddb2
2015-02-11 08:49:13 -08:00
jenkins-bot 1c58fd6df9 Merge "Don't flatten spans" 2015-01-13 20:40:04 +00:00
Sam Smith 59633e2be9 Don't flatten spans
... so that per-span information for different languages, i.e. lang and
dir attributes aren't lost.

Bug: T59582
Change-Id: If1b04714fdc0f4d581ddb858d8d53f6f340dc10b
2015-01-13 16:31:01 +00:00
Ori Livneh 23dcce746a MWException -> Exception
Change-Id: If111014ef2d7aea5c72bdcf4600a9067e2e21e00
2015-01-09 19:06:21 -08:00
Max Semenik fbd8e93a8b Reorg: move hooks to a separate class, introduce namespaces
Change-Id: Ic784010e79b1168f0e112cf912f463036255eb64
2014-12-31 15:05:19 -08:00
Renamed from ExtractFormatter.php