Disables sentence termination at a full stop preceded by a capital
letter, which is likely to be an initial.
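
For illustration only, a minimal Python sketch of the rule described
above. It is not the extension's actual code: the function name
first_sentence and the exact test for "looks like an initial" (a single
capital letter not preceded by another letter or digit) are assumptions.

```python
def first_sentence(text: str) -> str:
    """Return the first sentence of text (hypothetical helper).

    A full stop ends the sentence only if it is followed by whitespace
    or the end of the text, and is not preceded by a lone capital
    letter, which is assumed to be an initial.
    """
    for i, ch in enumerate(text):
        if ch != ".":
            continue
        followed_by_break = i + 1 == len(text) or text[i + 1].isspace()
        prev = text[i - 1] if i > 0 else ""
        before_prev = text[i - 2] if i > 1 else " "
        # A lone capital such as the "J" in "J. Smith"; the final "A"
        # of "NASA" does not count because a letter precedes it.
        looks_like_initial = prev.isupper() and not before_prev.isalnum()
        if followed_by_break and not looks_like_initial:
            return text[: i + 1]
    return text  # no terminating full stop found


print(first_sentence("J. R. R. Tolkien wrote The Hobbit. He was English."))
```

Under these assumptions the full stops after "J", "R" and "R" are
skipped, and the sentence ends after "Hobbit".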
Bug: T115795
Change-Id: Ibf38e87823155c704ffb106642944cbd05e3f632
Allows sentences to end with numbers before a full stop in query
extractsentences.
Also adds more unit tests.
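
As a rough sketch of the intended behaviour (not the actual
implementation; the helper name ends_sentence is hypothetical, and the
rule that a full stop inside a decimal such as "3.14" is not a
terminator is an assumption):

```python
def ends_sentence(text: str, i: int) -> bool:
    """Return True if the full stop at index i plausibly ends a sentence.

    A digit before the full stop is allowed; the full stop only needs
    to be followed by whitespace or the end of the text.
    """
    if text[i] != ".":
        return False
    at_end = i + 1 == len(text)
    return at_end or text[i + 1].isspace()


sample = "Pi is about 3.14. It is irrational."
decimal_point = sample.index(".")                 # the "." inside "3.14"
after_number = sample.index(".", decimal_point + 1)  # the "." after "3.14"
print(ends_sentence(sample, decimal_point), ends_sentence(sample, after_number))
```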
Bug: T118621
Change-Id: I9cbf487601d4165b490696d38d5fcbcf6d8f4637
... so that per-span information for different languages, i.e. the lang
and dir attributes, isn't lost.
Bug: T59582
Change-Id: If1b04714fdc0f4d581ddb858d8d53f6f340dc10b