X-Git-Url: http://git.rohieb.name/www-rohieb-name.git/blobdiff_plain/34ed3458bf9a725a9d2c4883bf2a99a69f16455c..HEAD:/blag/post/optimizing-xsane-s-scanned-pdfs.mdwn

diff --git a/blag/post/optimizing-xsane-s-scanned-pdfs.mdwn b/blag/post/optimizing-xsane-s-scanned-pdfs.mdwn
index e48c024..5169a60 100644
--- a/blag/post/optimizing-xsane-s-scanned-pdfs.mdwn
+++ b/blag/post/optimizing-xsane-s-scanned-pdfs.mdwn
@@ -23,7 +23,7 @@ First (non-optimal) solution
 --------------
 
 At first, I tried to optimize the PDF using [GhostScript][gs]. I
-[[use-ghostscript-to-convert-pdf-files|already wrote]] about how GhostScript’s
+[[already wrote|use-ghostscript-to-convert-pdf-files]] about how GhostScript’s
 `-dPDFSETTINGS` option can be used to minimize PDFs by rendering the pictures to
 a smaller resolution. In fact, there are [multiple rendering modes][gs-ps-pdf]
 (`screen` for 96&nbsp;dpi, `ebook` for 150&nbsp;dpi, `printer` for 300&nbsp;dpi,
@@ -334,9 +334,27 @@ in X and Y direction, which was the resolution at which the images were scanned:
 
     $ convert image*jpg -density 200x200 document.pdf
 
+*Update:* You can also use the [`-page` parameter][page] to set the page size
+directly. It takes a multitude of predefined paper formats (see link) and will
+do the pixel density calculation for you, as well as adding any necessary
+offset if the image ratio is not quite exact:
+
+    $ convert image*jpg -page A4 document.pdf
+
 With that approach, I could reduce the size of my PDF from 250&nbsp;MB with
 losslessly compressed images to 38&nbsp;MB with DCT compression.
 
+*Another update (2023):* Marcus notified me that it is possible to use
+ImageMagick's `-compress jpeg` option. This way, we can leave out the
+intermediate step and convert PNM to PDF directly:
+
+    $ convert image*.pnm -compress jpeg -quality 85 output.pdf
+
+You can also play around with the `-quality` parameter to set the JPEG
+compression level: 100 yields almost pristine but huge images, 1 yields very
+small, very blocky images, and 85 should still be readable for most
+documents at that resolution.
+
 Too long, didn’t read
 -----------------
 
@@ -368,5 +386,6 @@ document.
 [scan-to-pdfa]: http://blog.konradvoelkel.de/2013/03/scan-to-pdfa/ "Konrad Voelkel: Linux, OCR and PDF: Scan to PDF/A"
 [pdf-stream-objects]: http://blog.didierstevens.com/2008/05/19/pdf-stream-objects/ "Didier Stevens: PDF Stream Objects"
 [pdf-tools]: http://blog.didierstevens.com/programs/pdf-tools/ "Didier Stevens: PDF Tools"
+[page]: http://www.imagemagick.org/script/command-line-options.php#page "ImageMagick: Command-line Options"
 
 [[!tag PDF note_to_self howto ImageMagic convert file_formats reference longpost]]