Commit graph

46 commits

Author SHA1 Message Date
Gilles Dubuc e3c75f9bb1 Remove X-Content-Dimensions header
Follow-up to b8699d160, 6d433ab841.

Bug: T150741
Bug: T167034
Change-Id: Idaed687ddd09ac50748d6f80cdbb63c11078fb22
2017-06-10 15:13:15 +00:00
Gilles Dubuc 6d433ab841 Update getContentHeaders signature
Bug: T150741
Change-Id: I3a62185243d5a0d561da63a5e5ccafed093d41bb
2017-05-24 08:13:14 +02:00
Umherirrender 8c304bd864 Add phpcs and make pass
Change-Id: I2921f450aeb4e896157d81c1ac2b5b5b13c31240
2017-05-20 22:58:19 +02:00
Gilles Dubuc b8699d160b Store original media dimensions as additional header
For storage repos that support headers (such as Swift), this will store the original
media dimensions as an extra custom header, X-Content-Dimensions.
The header is formatted to minimize its length when dealing with multipage
documents, by expressing the information as page ranges keyed by dimensions.

Example for a multipage document with some pages of different sizes:
X-Content-Dimensions: 1903x899:1-9,11/1903x873:10

Example for a single page document:
X-Content-Dimensions: 800x600:1

Bug: T150741
Change-Id: If4c58ad7048c8233ef2b0f64a252c16f84dcecde
Depends-On: Ic4c6a86557b3705cf75d074753e9ce2ee070a6df
2017-05-10 15:59:19 +00:00
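
The header format described in b8699d160b (page ranges keyed by dimensions) can be illustrated with a small sketch. This is not the extension's PHP code: the function name `format_content_dimensions` and its input shape (a list of `(width, height)` tuples, one per page) are assumptions made for illustration, and the ordering of groups by first appearance is inferred only from the two examples in the commit message.

```python
def format_content_dimensions(page_sizes):
    """Build an X-Content-Dimensions style value.

    page_sizes: list of (width, height) tuples; index 0 is page 1.
    (Hypothetical helper, written only to illustrate the format.)
    """
    # Group page numbers by their "WIDTHxHEIGHT" key, keeping first-seen order.
    pages_by_size = {}
    for page, (width, height) in enumerate(page_sizes, start=1):
        pages_by_size.setdefault(f"{width}x{height}", []).append(page)

    parts = []
    for size, pages in pages_by_size.items():
        # Collapse runs of consecutive page numbers into "start-end" ranges.
        ranges = []
        start = prev = pages[0]
        for page in pages[1:]:
            if page == prev + 1:
                prev = page
                continue
            ranges.append(str(start) if start == prev else f"{start}-{prev}")
            start = prev = page
        ranges.append(str(start) if start == prev else f"{start}-{prev}")
        parts.append(f"{size}:{','.join(ranges)}")

    # Dimension groups are separated by "/".
    return "/".join(parts)
```

Called with nine pages of 1903x899, one page of 1903x873, and one more page of 1903x899, the sketch reproduces the multipage example from the commit message; a single 800x600 page gives the single-page example.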
Aaron Schulz fc5ca3c58c Add page dimension caching and avoid metadata tree loading use in doTransform()
Bug: T147296
Change-Id: Ic27f0797317f3467305f953ca6b7ae729a566041
2016-10-04 21:52:10 -07:00
Brian Wolff e1d584ec6c SECURITY: Add -dSAFER to ghostscript as a hardening measure
-dSAFER disables certain scary features of ghostscript
(like arbitrary file access). It's primarily about PostScript
security, but enable it for PDFs to be safe.

Bug: T136402
Change-Id: I0ab37ddb5d134334e975bc07d3b9ba7bfc7a5659
2016-08-23 04:30:52 +00:00
Brian Wolff e0ad7bd13d Cast width, height, page to int as paranoia measure
Everything is properly escaped so it doesn't matter, but as an
extra bit of safety, cast width/height/page to int, in order to
ensure under no circumstances would something unexpected be fed
to ghostscript.

Change-Id: I961a3dae5801dd116e1cb6c93808d49268d1e81e
2016-06-13 04:56:31 -04:00
jenkins-bot 81373114ca Merge "Add type hint to getPageText()" 2015-10-28 05:38:56 +00:00
Aaron Schulz 65d64d3332 Add type hint to getPageText()
Change-Id: I9f80843a5891b3f926c915ee1d46c859d91add6a
2015-10-27 13:07:51 -07:00
jenkins-bot 0b0a7a55ca Merge "Add type hint to getPageDimensions()" 2015-10-27 19:43:11 +00:00
Aaron Schulz 04301cc798 Add type hint to getPageDimensions()
Change-Id: I2eecc73f3274f684ef8c426cb4ca5b3785284368
2015-10-27 12:39:11 -07:00
Aaron Schulz ca786d9b80 Add type hint to pageCount()
Change-Id: Ia47b01d74addf394950c997117a5ed6c743befb7
2015-10-27 12:06:06 -07:00
Chad Horohoe 5d286c85c0 Wrap PdfHandler metadata unserialization in PoolCounter
Change-Id: I0ce60a044c64a59bd44eef8cc5bd520d0b46230e
2015-04-24 13:17:55 -07:00
jenkins-bot ad55c98742 Merge "Add warning about PDF files on the file page." 2015-03-26 13:08:44 +00:00
Mark Holmquist f4f87cebcc Add warning about PDF files on the file page.
Depends on I3c4b7af7284b5e16e458dd72de789e74db489895 in core.

Bug: T89765
Change-Id: I674bf7f6c1b21ffc9870aa84382479af5f966561
2015-03-26 14:01:19 +01:00
umherirrender fdabc52509 Pass $context to MediaHandler::formatMetadataHelper
Pass the newly added param to the next function to get it passed to
FormatMetadata

Follow-Up: Ib1f5af01c13cd2a5a4570a4be411ae314a6fc541
Follow-Up: I92774e1a88f03d44967d1797c6c2b8a31c1b10fc
Change-Id: I6801939f8c3e985004f2d57ac6664e298a9996b6
2015-03-18 20:04:42 +01:00
Gilles Dubuc 330f70bb58 Add missing context parameter
Change-Id: Ib1f5af01c13cd2a5a4570a4be411ae314a6fc541
2015-03-17 18:38:43 +01:00
jenkins-bot a0bef60b64 Merge "Add jpeg quality option" 2014-08-31 18:55:31 +00:00
scnd 4348863369 Add jpeg quality option
Change-Id: Ia09bdfc5d9d64f61ab08248033c5a14ed3622dea
2014-08-26 15:28:39 +04:00
Brian Wolff 261af50ba6 Make validation for page more strict on pdf to take only numbers
This change causes wiki syntax like
 [[File:Foo.pdf|thumb|Page 7 of document]]
to be interpreted as a caption instead of as a request to select page 7
of the PDF. Previously it eventually ran intval( '7 of document' ),
so it flipped to page 7.

Only possible downside I could see is this would cause things like
left-to-right marks and unusual Unicode spaces to no longer be ignored.
I don't think that's a big deal.

Change-Id: Ib98510a0473458fdc9cdecdb7f75676488b4c5c8
2014-05-14 21:37:14 -03:00
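
The behaviour change in 261af50ba6 can be sketched by contrasting a lenient leading-digits parse with a strict digits-only check. This is a Python illustration, not the extension's PHP code: `lenient_page` only approximates PHP's intval() for this case, and both function names are hypothetical.

```python
import re


def lenient_page(value):
    """Approximate a leading-digits parse: '7 of document' yields 7."""
    match = re.match(r"\s*([0-9]+)", value)
    return int(match.group(1)) if match else 0


def strict_page(value):
    """Accept only strings made up entirely of ASCII digits.

    Returns None otherwise, so the caller can treat the text as a caption.
    """
    return int(value) if re.fullmatch(r"[0-9]+", value) else None
```

Under the strict check, a value preceded by a left-to-right mark (U+200E) is also rejected, which is the downside the commit message mentions.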
Aaron Schulz ed75f8a1d2 Use standard "GetLocalFileCopy" pool name
Change-Id: Ifb5e95e8989ba6eb076e16b5f57c88ab20d04425
2014-04-11 16:07:13 -07:00
Aaron Schulz ce77a57144 Allow using PoolCounter for large PDF local downloads
Change-Id: Ifee3a5764b32ef31e4ed9edb183272ebb89bd11b
2014-03-15 10:51:16 -07:00
csteipp 7e59dedf95 SECURITY: Escape all shell arguments
Ensure all shell arguments are escaped individually.
This relies on Ica8e37d1c1bea3b68c0165109aa7b9330fe9128a.

Bug: 60339
Change-Id: I80cdb459ebebe8fa480ab2ccad7faab29fcf78fe
2014-01-30 11:23:46 -08:00
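
The idea behind 7e59dedf95, escaping each shell argument individually rather than the command string as a whole, might look like this in Python. The helper name `build_command` is hypothetical, and the sketch uses Python's `shlex.quote` rather than the MediaWiki functions the extension actually relies on.

```python
import shlex


def build_command(executable, args):
    """Join a command line, quoting every argument on its own.

    Each element is passed through shlex.quote, so spaces or shell
    metacharacters inside one argument cannot leak into the next.
    """
    return " ".join(shlex.quote(str(arg)) for arg in [executable, *args])
```

A file name containing spaces or a `;` then stays a single argument: splitting the resulting string with `shlex.split` gives back the original argument list.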
tonythomas01 8dae0f163c Change 2>&1 in doTransform to use wfShellExecWithStderr instead
This would allow cgroup related errors to be captured.

Bug: 59986
Change-Id: Ica8e37d1c1bea3b68c0165109aa7b9330fe9128a
2014-01-16 04:44:16 +00:00
Brian Wolff a6d53125e2 Fix warning if pdfinfo fails but pdftext succeeds.
Discovered when fixing bug 41281

Change-Id: I8c956da326e5dc339893a010370d399e97e204fd
2013-05-26 11:09:49 +00:00
Reedy 636de72aae Couple of minor bits of code simplification
Change-Id: Ibb7f4da745997e1a9899ea306edb9721ee0a575a
2013-01-31 10:50:35 +00:00
Reedy c2eae6e584 Tidy up some documentation
Remove @ error suppression

Change-Id: I6da825cf203377b6f25548f9d5112bade4be30d7
2012-10-04 17:34:25 +01:00
Brian Wolff 7663952c34 Display metadata from PDF files on image description page.
This code was left over and forgotten about from my gsoc project
back in 2010 (made minor modifications)

(Note about message names: they start with "exif-" to be compatible with the
built-in metadata code, which uses that prefix to be compatible with
the older messages.)

Change-Id: I9e546d9e6ae9a60604c9dd1633cb2225c9d1109d
2012-10-04 17:10:12 +01:00
Siebrand Mazeland 7ac97b1391 Replace deprecated wfMsg with wfMessage.
Change-Id: Ieb5b1d632dbdaf5414d743ee05c0a033f37a7933
2012-09-02 01:07:57 +02:00
Reedy 91c18dc1a0 Method documentation etc
Change-Id: I8b9d111fcce6ce46fd85a128ad21ce523064ab31
2012-07-26 23:30:11 +01:00
Aaron Schulz da2c37465f * Update doTransform() for FileBackend changes
* Fixed case type in function call
2012-02-05 20:29:51 +00:00
Sam Reed 5e581298d1 More wfMkdirParents() __METHOD__ additions 2011-07-25 22:09:05 +00:00
Sam Reed 64e5549b32 Fixup some missing variable definitions 2011-01-23 10:33:37 +00:00
Sam Reed 2e90b4ecb1 Start removing/fixing calls to deprecated methods in WMF used extensions 2010-10-29 15:14:44 +00:00
Jack Phoenix 61e7480e4d PdfHandler: coding style tweaks & general cleanup 2010-10-12 13:48:00 +00:00
Siebrand Mazeland 25d4892611 Follow-up r68409 for extensions: shut up PHP Strict Standards notices about "Declaration of Blah::getThumbType() should be compatible with that of MediaHandler::getThumbType()" 2010-06-23 18:23:58 +00:00
Chad Horohoe 1eb60a2ce6 Revert r66934 (Removing wfLoadExtensionMessages() from everything). I disagree on principle...we branch extensions for this very reason. But people want trunk extensions compatible for several versions back, meh. 2010-05-27 15:56:53 +00:00
Chad Horohoe ecdda8f2ff Large commit. Removed 800+ references to no-op wfLoadExtensionMessages() 2010-05-26 22:25:32 +00:00
Conrad Irwin 646ff068e7 Fix strict standards declarations 2010-05-19 23:09:12 +00:00
ThomasV a62a02638d extract text layer from pdf 2009-09-16 13:50:09 +00:00
Brion Vibber e7284509f1 Some cleanup...
* remove no longer accurate code comment
* set default values for the processor commands, otherwise it's totally confusing that not only doesn't it work, but it looks like it's just not installed
* added a comment about adding to $wgFileExtensions -- not doing it automatically since some wikis might want to *use* PDFs from a shared server without locally uploading them.... mebbe... dunno... keeping this one open for now. ;)
2008-12-25 00:29:44 +00:00
Brion Vibber fadc8e07fa Don't load extension messages until they're needed for error messages...
Direct GS and Convert errors to stdout so we can pick them up and report them
2008-02-05 00:16:54 +00:00
Brion Vibber 21e92330a0 Improve the metadata handling...
* Use a nice simple PHP array instead of constructing unnecessary XML. This removes the dependency on PHP 5.1.3 for a SimpleXML method.
* Tell pdfinfo to give us metadata encoded in UTF-8. If we start outputting title and creator info this will be nice!
* Tell pdfinfo to give us page size information for all pages (at least through page 99999 :) rather than just the first page
* Make use of that per-page size information so we can properly render pages of differing size. Without this, they get stretched or squooshed in interesting ways.
* Rename the pdf_no_xml message to pdf_no_metadata (in English)
2008-02-04 07:29:29 +00:00
Brion Vibber b442152c6e whitespace fixups 2008-02-04 06:40:50 +00:00
Siebrand Mazeland bf4d2f95b9 * use wfLoadExtensionMessages
* delay message loading
* add version in extension credits
2008-01-11 17:40:39 +00:00
Raimond Spekking 971aec48af * Add function for page option ([[Image:nams.pdf|page=x]])
* Naming convention
2007-09-06 06:32:08 +00:00