6 commits

Author SHA1 Message Date
Brion Vibber e4a25d9ccd Cleanup for r56413 - PDF text extraction support:
* use UtfNormal::cleanUp() for UTF-8 and control char cleanup instead of iconv() and a manual strip
* remove the htmlspecialchars() which looks like it shouldn't be here; this is for internal data storage not HTML output
2009-10-01 23:29:05 +00:00
ThomasV a62a02638d extract text layer from pdf 2009-09-16 13:50:09 +00:00
Brion Vibber 21e92330a0 Improve the metadata handling...
* Use a nice simple PHP array instead of constructing unnecessary XML. This removes the dependency on PHP 5.1.3 for a SimpleXML method.
* Tell pdfinfo to give us metadata encoded in UTF-8. If we start outputting title and creator info this will be nice!
* Tell pdfinfo to give us page size information for all pages (at least through page 99999 :) rather than just the first page
* Make use of that per-page size information so we can properly render pages of differing size. Without this, they get stretched or squooshed in interesting ways.
* Rename the pdf_no_xml message to pdf_no_metadata (in English)
2008-02-04 07:29:29 +00:00
Brion Vibber b442152c6e whitespace fixups 2008-02-04 06:40:50 +00:00
Thomas Bleher 0bb4efcb80 PdfHandler fixes:
* require PHP 5.1.3 (for SimpleXMLElement::addChild())
* Properly escape the filename before passing it to the shell
* arrays created by explode() start at 0
2008-01-02 12:58:23 +00:00
Raimond Spekking 971aec48af * Add function for page option ([[Image:nams.pdf|page=x]])
* Naming convention
2007-09-06 06:32:08 +00:00
Renamed from PdfImage.php