This takes care of several minor annoyances:
* centralizes all the text processing functions which have been
floating all around the code, and adds proper tests
* filters out invisible elements (sometimes used for metadata)
* avoids merging separate words on HTML->text transformation
* adds caching since doing all this transformations could be
processing-intensive for big chunks of HTML. (This might or
might not be a good idea. I haven't done performance tests, so
this might be premature optimization, and increases memory use.
OTOH these functions are often called in situations where an
immediate UI response is expected (such as selecting a size
from the list) so even small delays would be perceivable.
Bug: 63126
Change-Id: I1ef1e3a33efdfea17612df00da6b629bf39e07aa
Mingle: https://wikimedia.mingle.thoughtworks.com/projects/multimedia/cards/388
Mingle: https://wikimedia.mingle.thoughtworks.com/projects/multimedia/cards/369