www-rohieb-name.git / commitdiff (from parent 1: dd6de2c)
amend blag post "Optimizing PDFs": typos etc.
author Roland Hieber <rohieb@rohieb.name> Fri, 22 Nov 2013 05:26:05 +0000 (06:26 +0100)
committer Roland Hieber <rohieb@rohieb.name> Fri, 22 Nov 2013 05:26:05 +0000 (06:26 +0100)
blag/post/optimizing-xsane-s-scanned-pdfs.mdwn
diff --git a/blag/post/optimizing-xsane-s-scanned-pdfs.mdwn b/blag/post/optimizing-xsane-s-scanned-pdfs.mdwn
index 529e873..c9db67b 100644
--- a/blag/post/optimizing-xsane-s-scanned-pdfs.mdwn
+++ b/blag/post/optimizing-xsane-s-scanned-pdfs.mdwn
@@ -66,6 +66,7 @@ Strings
 literal \) and some\n newlines.\n)`.
 * interpreted as hexadecimal data when enclosed in angled brackets:
 `<53 61 6D 70 6C 65>` equals `(Sample)`.
+
 Names
 : starting with a forward slash, like `/Type`. You can think of them like
 identifiers in programming languages.
@@ -145,7 +146,7 @@ EOF]]
 This is just the magic string declaring the document as PDF-1.4, and the root
 object with object number 1, which references objects number 2 for Outlines and
-number 3 for pages. We're not interested in outlines, let's look at the pages.
+number 3 for Pages. We're not interested in outlines, let's look at the pages.
 [[!format pdf <<EOF
 3 0 obj
@@ -195,8 +196,7 @@ BI
 /BPC 8
 /F /FlateDecode
 ID
-x\9c$¼[\8b$;¾åù!\ 6f\9eú¥\87¡a\1e\ 6æátq.4§
-% [ ...byte stream shortened... ]
+x$¼[$;¾åù!fú¥¡aæátq.4§ [ ...byte stream shortened... ]
 EI
 Q
 endstream
@@ -226,7 +226,7 @@ Q % Restore drawing context
 EOF]]
 So now we know why the PDF was so huge: the line `/F /FlateDecode` tells us that
-the image ata is stored losslessly with [Deflate][] compression (this is
+the image data is stored losslessly with [Deflate][] compression (this is
 basically what PNG uses). However, scanned images, as well as photographed
 pictures, have the tendency to become very big when stored losslessly, due to te
 fact that image sensors always add noise from the universe and lossless
@@ -264,7 +264,7 @@ undocumented in the [man page][man-convert]). In that case it tries to create
 multi-page documents, if possible. With PDF as output format, this results in
 one input file per page.
-[man-converted]: http://manpages.debian.net/cgi-bin/man.cgi?query=convert "man convert(1)"
+[man-convert]: http://manpages.debian.net/cgi-bin/man.cgi?query=convert "man convert(1)"
 The embedded image objects looked somewhat like the following:
@@ -301,7 +301,7 @@ Next, I converted the PNMs to JPG, then to PDF.
 $ convert image*jpg document.pdf
 (The first command creates the output files `image-1.jpg`, `image-2.jpg`, etc.,
-since JPG does nut support multiple pages in one file.)
+since JPG does not support multiple pages in one file.)
 When looking at the PDF, we see that we now have DCT-compressed images inside
 the PDF: