Commit graph

44 commits

Author SHA1 Message Date
jenkins-bot d72b09724e Merge "Remove deprecated $wgUseTidy in favour of $wgTidyConfig" 2017-07-07 22:49:21 +00:00
Baha 97a25e2183 Return empty extract for articles in File namespace
Bug: T114418
Change-Id: I2dfccbcf27284ecfdd0669b004151824ece79b73
2017-07-07 15:32:27 -07:00
Piotr Miazga c13dae2788 Remove deprecated $wgUseTidy in favour of $wgTidyConfig
Bug: T168671
Change-Id: I27f5bee2448797c3a5a8cb886cee0e518b199ebe
2017-07-07 21:39:30 +00:00
jenkins-bot 698a8a8066 Merge "Send sectionpreview parameter on TextExtract parse" 2017-06-29 00:25:10 +00:00
jdlrobson 27baa2d0d9 Send sectionpreview parameter on TextExtract parse
This will invoke special handling for unbalanced templates

Bug: T168743
Change-Id: I3fe1bd5b56a049f57fad478f1358dd8496503b41
2017-06-28 08:55:00 -07:00
Kunal Mehta 43f3539a7c Set an expiry for memcache entries
Use the same expiry as the parser cache since this is a derivative of
the parser cache.

And avoid wfMemcKey while we're at it.

Change-Id: Ieba084aff4b8beb180da01d9cc4b8a2857569171
2017-06-11 20:12:55 +00:00
Brad Jorsch 1f1c7e639d Chunk page ids in internal API call to avoid too-many-pageids-for-query
One of many reasons that internal API calls are bad.

Bug: T41936
Change-Id: I3d2cf2b4f619f590e74a88fa4a78832b8be8495e
2017-05-26 17:17:21 -04:00
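The chunking idea in the commit above can be sketched roughly as follows. This is an illustrative Python sketch, not the extension's PHP code: the function name `chunk_page_ids` and the chunk size used in the example are assumptions, and the real per-query limit depends on the wiki's API configuration.

```python
from typing import Iterator, List, Sequence


def chunk_page_ids(page_ids: Sequence[int], chunk_size: int) -> Iterator[List[int]]:
    """Yield successive slices of page_ids, each at most chunk_size long.

    Hypothetical helper: each yielded list would be sent as one internal
    query so that no single query exceeds the page-ID limit.
    """
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    for start in range(0, len(page_ids), chunk_size):
        yield list(page_ids[start:start + chunk_size])


# Five IDs with an assumed limit of two per query give three queries.
batches = list(chunk_page_ids([1, 2, 3, 4, 5], 2))
```

Here `batches` is `[[1, 2], [3, 4], [5]]`; the caller would merge the results of the individual queries afterwards.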
Baha 182304dc6d API: Limit maximum number of characters when exchars is passed.
Set the limit to 1200 characters.

Bug: T156467
Change-Id: I4e53b26a3f57f5f5cf7acbd3702c8bc4541a5eb5
2017-05-24 18:04:20 -04:00
jenkins-bot a803755b3e Merge "Add phpcs and make pass" 2017-05-24 13:17:49 +00:00
Umherirrender 93be5e75f6 Add phpcs and make pass
Change-Id: I2f95b3dfa260d955a5a420d0bf3c914382c09746
2017-05-19 18:39:27 +02:00
Baha 6bfe60508a Increase default API limit from 1 to 20
Bug: T153707
Change-Id: I6ba3adb7c680e1a60461cd3903cbf8640721ea02
2017-05-19 09:39:13 -04:00
jenkins-bot 7c81c6ec9f Merge "Suppress parser output wrapper div" 2017-05-16 14:49:40 +00:00
Brad Jorsch 42e87ac3b6 Suppress parser output wrapper div
It confuses the code that munges the HTML to produce an extract.

By itself this won't fix the bug, but together with a core change to
avoid polluting the parser cache such as I5be25c6d it should work.

Bug: T165161
Change-Id: Ia1b654bf659958c04d7e370d4686cf17f615b591
2017-05-16 14:47:49 +00:00
Kunal Mehta aef292b82b API: Change memcache key to clear cache
And add a CACHE_VERSION constant to make this easy in the future.

Bug: T165161
Change-Id: I362f44cf3d680d073e7e6dc6eec95ec5eec15684
2017-05-12 11:19:51 -07:00
Max Semenik 21ef48483f getFirstSentences(): don't use crazy regexes
Bug: T145231
Change-Id: I820fb152e86b273ddeba1617658a13e3a3f0bae3
2017-01-20 10:13:46 -08:00
Max Semenik fb2c163345 Uncomment and fix a test
Change-Id: I57facf073dd688f57f35a18015a0aa14b7b7f4c4
2017-01-19 16:16:35 -08:00
Max Semenik abb0f4df96 getFirstChars(): don't use quantifiers with user-supplied count
Bug: T143178
Change-Id: Iba6d929156040f5388461aaf075644d8fbf647be
2017-01-10 17:42:14 -08:00
Brad Jorsch 739e02f2d2 Update for API error i18n
See Iae0e2ce3.

Change-Id: Ibe7cb02d551ac2f85ee01edbf2b40a966ed42b74
2016-12-08 10:08:25 -05:00
Max Semenik ec44826b7a Remove use of a removed function
Change-Id: Iac8bec0a0a2625e40c5c70a715ffb4784224f164
2016-09-22 18:58:55 -07:00
Max Semenik 264f65215b Minor fixes
* Annotations
* Deprecated functions
* Namespace tests

Change-Id: I521f6af6074a454cec5322ab4cd46db08350c2c3
2016-09-22 18:51:12 -07:00
Max Semenik 754c9e4f19 CodeSniffer fixes
Change-Id: I8bdcd2250bd3163fe40ce4685eb04bffe53afdca
2016-09-22 18:38:27 -07:00
Brad Jorsch 90a025b839 Remove pre-1.25 API compatibility code
Since this extension uses extension.json, it already requires 1.25+ so
no need to keep the old code around.

Change-Id: Id9e8fb026b26bb4db34fb22bd631205ce6f7072b
2016-09-20 15:23:35 -04:00
Fomafix 579fae4c38 API: Remove unused parameter exvariant
All API calls support the generic parameter variant.
With I8a31dfd3cf2a3e8f768907084d26a77f198ccbe3 in core this parameter
is documented and generates no warning anymore.

Bug: T117529
Change-Id: Ic7e6f1df99c67ad4132c22503d99345611af271a
2016-09-19 18:00:19 +00:00
Reedy ad435fb4e1 Remove 'UnitTestList' hook
No longer needed now that extension unittests are autodiscovered.

Bug: T142120
Bug: T142121
Change-Id: Iaff2e40a8bddfd5d45170b49641b8afa15987527
2016-08-23 14:54:47 +01:00
jenkins-bot fbe7379738 Merge "The last sentence of the paragraph was lost." 2016-04-14 00:00:21 +00:00
Max Semenik 9bc33683a0 Switch to librarized HtmlFormatter
Bug: T125001
Change-Id: Iac73553ac4b03e75ef321c6a659ece1ac155260b
2016-04-12 21:23:01 -07:00
Sergey Leschina ae7fe951f1 The last sentence of the paragraph was lost.
Change-Id: I963ca71b73dc7396156e8b5fcf5d2952e4abbc05
2016-04-11 02:08:14 +03:00
Sergey Leschina 472d84c9de Fix separation of text into sentences.
Some space characters, such as &nbsp; or &thinsp;, usually do not indicate the end of a sentence, so they shouldn't be used as separators.

Bug: T115817
Change-Id: Ieb56b0ef723dd299f848ea88b66613d92977bef0
2016-04-01 10:49:17 +03:00
Kunal Mehta 0664ddbf94 Add missing use statement
Removed another unused one, and cleaned up the doc block.

Bug: T121283
Change-Id: I1a8b9920152e6d52ffb59de385fcc29c92f33c92
2015-12-11 15:11:43 -08:00
mhutti1 80703452ed Converted TextExtracts to new extension registration system
Moved most of TextExtracts.php to the new extension.json
and added a method for a backward-compatible implementation
of the extension if it is still called through the PHP file. Moved
the unit test hook to Hooks.php and deleted the old i18n.php.

Bug: T87979
Change-Id: I3d26bd931ad2941268b94474f3e6327282da24ec
2015-12-10 22:59:49 +01:00
Sumit Asthana 13d6592978 TextExtracts do not crop after initials
Disables sentence termination at a full stop preceded by a capital
letter, which is likely to be an initial.

Bug: T115795
Change-Id: Ibf38e87823155c704ffb106642944cbd05e3f632
2015-12-03 07:11:36 +05:30
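The rule described in the commit above (a full stop right after a capital letter is treated as an initial, not a sentence end) might look like this in a small Python sketch. The function name and the regular expression are assumptions made for illustration; the extension's actual PHP pattern is not shown in this log.

```python
import re
from typing import List

# Split on whitespace that follows ., ! or ?, unless the two characters
# before the whitespace are a capital ASCII letter plus a full stop.
_SENTENCE_BREAK = re.compile(r"(?<=[.!?])(?<![A-Z]\.)\s+")


def split_sentences(text: str) -> List[str]:
    """Hypothetical splitter that does not break after initials."""
    return [part for part in _SENTENCE_BREAK.split(text.strip()) if part]


sentences = split_sentences("J. R. R. Tolkien was a writer. He wrote books.")
```

In this example `sentences` is `["J. R. R. Tolkien was a writer.", "He wrote books."]`. The heuristic has an obvious cost: a sentence that really ends in a capital letter, such as one ending in "USA.", is not split either.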
Sumit Asthana d83ac976e3 TextExtracts allow sentence end with numbers
Allows sentences to end with numbers before a full stop in query
extractsentences.

Also added some more unit tests.

Bug: T118621
Change-Id: I9cbf487601d4165b490696d38d5fcbcf6d8f4637
2015-11-18 20:11:20 -06:00
Kunal Mehta 36d1b4f3c4 Use page_touched in cache key instead of page_latest
Because the extracts depend upon template inclusion, to make sure
the extract is properly updated whenever the page's dependencies change,
use the page_touched timestamp instead of the latest revision id.

Since we're changing the cache key format, remove the 'mf' prefix from
back when it was still in MobileFrontend.

As a side-effect, this will also make action=purge invalidate the cache
since it updates page_touched.

Bug: T117322
Change-Id: Ib6f415c756c57caf6c83be495a4f229446e8b61e
2015-10-31 22:00:51 -07:00
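The effect of keying the cache on page_touched can be sketched in Python as below. The key layout, the function name `extract_cache_key` and the `cache_version` argument are assumptions for illustration only (a CACHE_VERSION constant is mentioned in a later commit in this log); the extension's real key format may differ.

```python
def extract_cache_key(page_id: int, page_touched: str, cache_version: int = 1) -> str:
    """Build a hypothetical cache key for a page's extract.

    page_touched changes whenever the page or one of its dependencies
    (for example an included template) is updated or the page is purged,
    so a stale extract is simply never looked up again.
    """
    return f"textextracts:{cache_version}:{page_id}:{page_touched}"


before_purge = extract_cache_key(42, "20151031220051")
after_purge = extract_cache_key(42, "20151101093000")
```

The two keys differ even though the page ID and latest revision are unchanged, which is why a template edit or action=purge invalidates the cached extract under this scheme.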
Matthew Flaschen 63b358fca2 SECURITY: Disallow extracts for non-wikitext for now.
Note that the sensitive information is still in the TextExtracts
memcached, so this requires security review (and either eviction
or a cache key change) before enabling other content models.

Bug: T107170
Change-Id: I57642e84db39d585c5b04453f86102b10fb69cdf
(cherry picked from commit f5c114c571)
2015-08-04 00:08:43 +00:00
jenkins-bot 0285c9e033 Merge "Ensure sentences is an int" 2015-07-15 20:35:34 +00:00
Ori Livneh 7c1ea48971 Update for rename of WikiPage::isParserCacheUsed() in I7de67937f0
Make the code compatible with both the old name (WikiPage::isParserCacheUsed)
and new name (WikiPage::shouldCheckParserCache).

Change-Id: If5d5da8eab132eb6d60f7141884ed2aeaa46e444
2015-06-22 20:44:23 -07:00
Brad Jorsch 95002e7a59 Further cleanup for core API change
PS25 and later changed things around a fair bit, meaning the previous update
needs some further updating. In some cases additional cleanup is also necessary
for future core API changes.

Bug: T96595
Change-Id: I1573e523cf3c945fca95d8d2db002f5abcdbb29d
2015-04-20 14:41:29 -04:00
csteipp 97495d1ff3 Ensure sentences is an int
In the spirit of escaping as close to the output as possible, ensure
that the number of sentences is an integer before using it in a regex.
This guards against someone changing the API's parameter definition.

Change-Id: I406d6ed365ecd53bd8f56a09218a7e1403fe0fa9
2015-03-24 12:54:54 -07:00
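A minimal Python sketch of the defensive cast described in the commit above, assuming a count that is interpolated into a regex quantifier. The function name and the pattern are illustrative assumptions, not the extension's code, and a later commit in this log moves away from quantifiers built from a user-supplied count altogether.

```python
import re


def first_sentences(text: str, count) -> str:
    """Return up to `count` leading sentences of text (hypothetical helper)."""
    # Cast right before use: anything that is not a number raises
    # ValueError instead of ending up inside the pattern.
    count = int(count)
    if count < 1:
        raise ValueError("count must be at least 1")
    pattern = re.compile(r"(?:[^.!?]*[.!?]+\s*){1,%d}" % count)
    match = pattern.match(text)
    return match.group(0).strip() if match else text


intro = first_sentences("One. Two. Three.", "2")
```

Here `intro` is `"One. Two."`, while a crafted value such as `"2}|(.*"` is rejected by `int()` before any pattern is built.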
Brad Jorsch c3eb02a9a6 Update ApiResult handling for mediawiki/core change I7b37295e
Change I7b37295e for mediawiki/core deprecates several methods, and more
importantly changes the format of the data returned from
ApiResult::getData(). This change should handle these differences in a
backwards-compatible manner.

Change-Id: I7b37295e8862b188d1f3b0cd07f66ac34629678e
2015-02-17 14:37:22 -05:00
Chad Horohoe d9869ef8d0 Remove obvious function-level profiling
Change-Id: I0c272eb337566eff28d46d198c9aa065ffdbddb2
2015-02-11 08:49:13 -08:00
jenkins-bot 1c58fd6df9 Merge "Don't flatten spans" 2015-01-13 20:40:04 +00:00
Sam Smith 59633e2be9 Don't flatten spans
... so that per-span information for different languages, i.e. lang and
dir attributes aren't lost.

Bug: T59582
Change-Id: If1b04714fdc0f4d581ddb858d8d53f6f340dc10b
2015-01-13 16:31:01 +00:00
Ori Livneh 23dcce746a MWException -> Exception
Change-Id: If111014ef2d7aea5c72bdcf4600a9067e2e21e00
2015-01-09 19:06:21 -08:00
Max Semenik fbd8e93a8b Reorg: move hooks to a separate class, introduce namespaces
Change-Id: Ic784010e79b1168f0e112cf912f463036255eb64
2014-12-31 15:05:19 -08:00