- Check that the command can be run, not that a file is executable.
This way pdfinfo and pdftotext work without specifying the path
- Only pipe the stdout content of the commands to the output files
- Exit as failure when the pdfinfo command is available, but its
execution failed
- Check and log the error output of retrieveMetaData.sh
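
The points above can be sketched roughly as follows. This is an
illustration only, not the contents of the real script: have_command and
run_to_file are hypothetical names, and treating an unavailable command
as "nothing to do" (status 0) is an assumption read from the third point.

```shell
# Hypothetical helper: succeeds if the name resolves to something runnable
# (PATH lookup, builtin or function), not merely an executable file path.
have_command() {
    command -v "$1" >/dev/null 2>&1
}

# Hypothetical helper: run a command and write only its stdout to the
# output file. stderr is left alone so the caller can capture and log it.
# Assumption: an unavailable command is skipped with status 0; an available
# command that fails passes its exit status back to the caller.
run_to_file() {
    outfile=$1
    shift
    if ! have_command "$1"; then
        return 0
    fi
    "$@" > "$outfile"
}
```

A caller could then write `run_to_file meta pdfinfo -meta file.pdf || exit 1`
(file names hypothetical) to fail only when pdfinfo exists but its run failed.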
Bug: T299521
Change-Id: Ia072469f4df6cce51793ab48823c7f4e4e13997b
explode() returns an array with one item for an empty string,
but the empty string is already checked before the explode() call
Change-Id: I441309978b25754bad04eeba69993913de4d48c3
Combine all 3 shellouts into one script, retrieveMetaData.sh.
The script is executed by /bin/sh by default; this can be changed for
Windows users by setting $wgPdfHandlerShell.
pdftotext is a bit special since our handling of it varies based on
its exit code, so save that in a file so we can check it
independently of the overall exit status.
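
A minimal sketch of the exit-code idea, under stated assumptions: the
function name run_and_record_status and the file names in the usage line
are hypothetical and do not come from the real script.

```shell
# Hypothetical sketch: run one step, keep its stdout in outfile, and write
# its exit status to a side file instead of failing the whole script.
run_and_record_status() {
    outfile=$1
    statusfile=$2
    shift 2
    status=0
    "$@" > "$outfile" || status=$?
    printf '%s\n' "$status" > "$statusfile"
    # Always succeed, so the overall exit status is not affected by this step.
    return 0
}

# Possible usage (file names are assumptions):
#   run_and_record_status text text_exit_code pdftotext file.pdf -
```

The caller can then read the status file on its own and decide how to
treat a failed text extraction, whatever the script's exit status was.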
Bug: T289228
Change-Id: I29750bcc282bd5f9b8e2f79aa340869738ea5f5b
XMP extraction does not work for me with libpoppler 0.86, because when
the output of the two commands is concatenated, there is no "Metadata:"
prefix introducing the XMP. It ends up splitting every line of the XML
on colon characters in attribute names, spamming lots of little
properties into the final result.
I can confirm that it's also broken in production.
So, just treat the output of pdfinfo -meta as plain XML.
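
To illustrate the failure mode described above (this is not the handler's
actual parser, which is PHP): a naive "Key: value" splitter, here a
hypothetical keys_of function, produces bogus keys as soon as it is fed
XML lines with namespaced names.

```shell
# Hypothetical sketch: print the text before the first colon of every
# input line that contains a colon, as a "Key: value" parser would.
keys_of() {
    while IFS= read -r line; do
        case $line in
            *:*) printf '%s\n' "${line%%:*}" ;;
        esac
    done
}

# Example input: one real pdfinfo-style line and one XMP-style line.
#   printf 'Title: Example\n<dc:title>Example</dc:title>\n' | keys_of
# prints "Title" and then the bogus key "<dc".
```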
Change-Id: Ia3df17daed0f27e95294b5d97872ec064c79965c
* Migrate to the new metadata system: override getSizeAndMetadata()
* Use getHandlerState() instead of a custom property on the File object.
* Opt in to metadata splitting. Avoid loading the text item unless it is
really needed.
* In getDimensionInfo(), use getHandlerState() instead of the
WANObjectCache process cache (pcTTL). This is just a
micro-optimisation, informed by profiling, which showed 90 calls to
this function during an image page view.
Depends-On: I876ea5c9d3a1881e278f689d2f8a3ae20240c703
Change-Id: I30d0b0009fcb11c14d14663bd1f2c2a3dfac55d6
* Use ::isSupported() instead of checking for a specific function manually
* Remove mention of the XMPGetInfo hook, which was removed in 4feb2ac7f2224d
Depends-On: Ic9044bf3260d1a474a6c74844949602441ffc865
Change-Id: I4333d427a2039aaffb897a1f41504b74d60c3c8b
PDF metadata querying was done with pdfinfo's "-meta" and "-l" options
at the same time, which was supported in poppler 0.26 but not in
poppler 0.48.
Upstream change: https://bugs.freedesktop.org/show_bug.cgi?id=96801
Local change is to run the two as separate commands, then send the
combined output into the existing processing. Should work with the older
poppler-utils on Jessie as well as the current one on Stretch.
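
Roughly, the approach looks like the sketch below. It is written as shell
for illustration only. The function name combined_pdfinfo, the order of
the two calls and the large -l value are assumptions; -meta and -l are
the pdfinfo options named above.

```shell
# Hypothetical sketch: call pdfinfo twice, once per option, and emit both
# outputs back to back so the consumer still sees a single stream.
# $1 is the pdfinfo command to run, $2 is the PDF file.
combined_pdfinfo() {
    pdfinfo_bin=$1
    pdf=$2
    "$pdfinfo_bin" -meta "$pdf"
    # Assumption: a very large last-page number to cover every page.
    "$pdfinfo_bin" -l 9999999 "$pdf"
}
```

Keeping the two calls separate avoids relying on any one poppler
version's handling of the combined options.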
Bug: T117839
Bug: T193200
Change-Id: Ib4ee9cf12ac04304c576087727eff5dc521ae751