Commit graph

3 commits

Author SHA1 Message Date
Derk-Jan Hartman f87fc5a6ad Improve logging for Pdf's retrieveMetadata.sh
- Check whether the command can be executed rather than whether the file
  is executable. This way pdfinfo and pdftotext work without specifying
  a path
- Only pipe the stdout of the commands to the output files
- Exit with failure when the pdfinfo command is available but its
  execution failed
- Check and log the error output of retrieveMetadata.sh
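
The checks listed above could look roughly like the following. This is a minimal sketch, not the actual contents of retrieveMetadata.sh; the helper name `run_if_available` and its argument layout are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not taken from retrieveMetadata.sh).
# $1 = output file, remaining arguments = the command line to run.
run_if_available() {
    out_file=$1
    shift
    # `command -v` asks the shell to resolve the name through PATH,
    # so no absolute path to the tool is needed.
    if ! command -v "$1" >/dev/null 2>&1; then
        echo "$1: command not found, skipping" >&2
        return 0
    fi
    # Only stdout is redirected into the output file; stderr is left
    # alone so the caller can capture and log it.
    # The function returns the exit status of the command itself.
    "$@" >"$out_file"
}

# Possible use (file names are placeholders):
#   run_if_available meta.txt pdfinfo input.pdf || exit 1
```

With this shape, a missing tool is skipped quietly, while a tool that is present but fails passes its non-zero status on to the caller.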

Bug: T299521
Change-Id: Ia072469f4df6cce51793ab48823c7f4e4e13997b
2024-03-16 09:37:34 +00:00
Derk-Jan Hartman b846970ae2 Use the PDF cropbox for rendering
By default the mediabox is used. This is the full potential area of a
page, as also used by PDF editors, and it can contain areas outside of
the page.
The cropbox is also the size that pdfinfo reports as the page size.
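
As an illustration of the switch described above, Ghostscript renders with the mediabox unless `-dUseCropBox` is passed. The sketch below only prints a command line and does not run it. The function name `gs_render_command`, the choice of the jpeg device, and the remaining flags are assumptions for illustration and are not taken from this change.

```shell
#!/bin/sh
# Hypothetical helper: print a Ghostscript command line that renders
# one page using the cropbox instead of the default mediabox.
# $1 = page number, $2 = PDF file name (assumed to contain no spaces).
gs_render_command() {
    page=$1
    pdf=$2
    printf '%s\n' "gs -q -dBATCH -dNOPAUSE -dSAFER -sDEVICE=jpeg -dUseCropBox -dFirstPage=$page -dLastPage=$page -sOutputFile=- $pdf"
}
```

Calling `gs_render_command 2 doc.pdf` prints a command that would render page 2 of doc.pdf to stdout as JPEG, clipped to the cropbox.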

Bug: T167420
Change-Id: I92267a9dbe81b6e0e471b8eae1e4c2ba4e5d84e9
2022-06-15 18:39:35 +00:00
Kunal Mehta b253dc04c4 Port retrieveMetaData to BoxedCommand
Combine all 3 shellouts into one script, retrieveMetaData.sh.

The script is executed by /bin/sh by default; this can be changed for
Windows users by setting $wgPdfHandlerShell.

pdftotext is a bit special since its behavior varies based on the
program's exit code, so save that in a file so we can check it
independently of the overall exit status.
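
The idea of saving an exit code to a file can be sketched as follows. This is an illustrative sketch only; the helper name `run_and_record` and the file layout are assumptions, not the contents of retrieveMetaData.sh.

```shell
#!/bin/sh
# Hypothetical helper: run a command, write its stdout to one file and
# its exit code to another, and return 0 so the overall script status
# does not depend on this command.
# $1 = stdout file, $2 = exit code file, remaining arguments = command.
run_and_record() {
    out_file=$1
    code_file=$2
    shift 2
    status=0
    "$@" >"$out_file" || status=$?
    # The caller can read this file later and decide how to treat the
    # result, independently of the script's own exit status.
    echo "$status" >"$code_file"
}

# Possible use (file names are placeholders):
#   run_and_record text.txt text_exit_code.txt pdftotext input.pdf -
```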

Bug: T289228
Change-Id: I29750bcc282bd5f9b8e2f79aa340869738ea5f5b
2021-09-20 10:28:27 -07:00