path: root/manual/qpdf-manual.xml
Diffstat (limited to 'manual/qpdf-manual.xml')
-rw-r--r-- manual/qpdf-manual.xml 2020
1 files changed, 1952 insertions, 68 deletions
diff --git a/manual/qpdf-manual.xml b/manual/qpdf-manual.xml
index 848c340b..d43d96a6 100644
--- a/manual/qpdf-manual.xml
+++ b/manual/qpdf-manual.xml
@@ -5,8 +5,8 @@
<!ENTITY mdash "&#x2014;">
<!ENTITY ndash "&#x2013;">
<!ENTITY nbsp "&#xA0;">
-<!ENTITY swversion "8.1.0">
-<!ENTITY lastreleased "June 22, 2018">
+<!ENTITY swversion "8.4.0">
+<!ENTITY lastreleased "February 1, 2019">
]>
<book>
<bookinfo>
@@ -16,7 +16,7 @@
<firstname>Jay</firstname><surname>Berkenbilt</surname>
</author>
<copyright>
- <year>2005&ndash;2018</year>
+ <year>2005&ndash;2019</year>
<holder>Jay Berkenbilt</holder>
</copyright>
</bookinfo>
@@ -196,15 +196,6 @@
ghostscript.
</para>
<para>
- If Adobe Reader is installed as <command>acroread</command>, some
- additional test cases will be enabled. These test cases simply
- verify that Adobe Reader can open the files that qpdf creates.
- They require version 8.0 or newer to pass. However, in order to
- avoid having qpdf depend on non-free (as in liberty) software, the
- test suite will still pass without Adobe reader, and the test
- suite still exercises the full functionality of the software.
- </para>
- <para>
Pre-built documentation is distributed with qpdf, so you should
generally not need to rebuild the documentation. In order to
build the documentation from its docbook sources, you need the
@@ -251,6 +242,56 @@ make
top-level <filename>Makefile</filename>.
</para>
</sect1>
+ <sect1 id="ref.packaging">
+ <title>Notes for Packagers</title>
+ <para>
+ If you are packaging qpdf for an operating system distribution,
+ here are some things you may want to keep in mind:
+ <itemizedlist>
+ <listitem>
+ <para>
+ Passing <option>--enable-show-failed-test-output</option> to
+ <command>./configure</command> will cause any failed test
+ output to be written to the console. This can be very useful
+ for seeing test failures generated by autobuilders where you
+ can't access qtest.log after the fact.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ If qpdf's build environment detects the presence of autoconf
+ and related tools, it will check to ensure that automatically
+ generated files are up-to-date with recorded checksums and fail
+ if it detects a discrepancy. This feature is intended to
+ prevent you from accidentally forgetting to regenerate
+ automatic files after modifying their sources. If your
+ packaging environment automatically refreshes automatic files,
+ it can cause this check to fail. Suppress qpdf's checks by
+ passing <option>--disable-check-autofiles</option> to
+ <command>./configure</command>. This is safe since qpdf's
+ <command>autogen.sh</command> just runs autotools in the normal
+ way.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ QPDF's <command>make install</command> does not install
+ completion files by default, but as a packager, it's good if
+ you install them wherever your distribution expects such files
+ to go. You can find completion files to install in the
+ <filename>completions</filename> directory.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Packagers are encouraged to install the source files from the
+ <filename>examples</filename> directory along with qpdf
+ development packages.
+ </para>
+ </listitem>
+ </itemizedlist>
+ </para>
+ </sect1>
</chapter>
<chapter id="ref.using">
<title>Running QPDF</title>
@@ -300,6 +341,19 @@ make
inspection commands do not. These are specifically noted.
</para>
</sect1>
+ <sect1 id="ref.shell-completion">
+ <title>Shell Completion</title>
+ <para>
+ Starting in qpdf version 8.3.0, qpdf provides its own completion
+ support for zsh and bash. You can enable bash completion with
+ <command>eval $(qpdf --completion-bash)</command> and zsh
+ completion with <command>eval $(qpdf --completion-zsh)</command>.
+ If <command>qpdf</command> is not in your path, you should invoke
+ it above with an absolute path. If you invoke it with a relative
+ path, it will warn you, and the completion won't work if you're in
+ a different directory.
+ </para>
+ </sect1>
<sect1 id="ref.basic-options">
<title>Basic Options</title>
<para>
@@ -307,6 +361,48 @@ make
commonly needed transformations.
<variablelist>
<varlistentry>
+ <term><option>--help</option></term>
+ <listitem>
+ <para>
+ Display command-line invocation help.
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term><option>--version</option></term>
+ <listitem>
+ <para>
+ Display the current version of qpdf.
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term><option>--copyright</option></term>
+ <listitem>
+ <para>
+ Show detailed copyright information.
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term><option>--completion-bash</option></term>
+ <listitem>
+ <para>
+ Output a completion command you can eval to enable shell
+ completion from bash.
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term><option>--completion-zsh</option></term>
+ <listitem>
+ <para>
+ Output a completion command you can eval to enable shell
+ completion from zsh.
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
<term><option>--password=password</option></term>
<listitem>
<para>
@@ -328,6 +424,24 @@ make
</listitem>
</varlistentry>
<varlistentry>
+ <term><option>--progress</option></term>
+ <listitem>
+ <para>
+ Indicate progress while writing files.
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term><option>--no-warn</option></term>
+ <listitem>
+ <para>
+ Suppress writing of warnings to stderr. If warnings were
+ detected and suppressed, <command>qpdf</command> will still
+ exit with exit code 3.
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
<term><option>--linearize</option></term>
<listitem>
<para>
@@ -412,6 +526,83 @@ make
</listitem>
</varlistentry>
<varlistentry>
+ <term><option>--suppress-password-recovery</option></term>
+ <listitem>
+ <para>
+ Ordinarily, qpdf attempts to automatically compensate for
+ passwords specified in the wrong character encoding. This
+ option suppresses that behavior. Under normal conditions,
+ there are no reasons to use this option. See <xref
+ linkend="ref.unicode-passwords"/> for a discussion.
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term><option>--password-mode=<replaceable>mode</replaceable></option></term>
+ <listitem>
+ <para>
+ This option can be used to fine-tune how qpdf interprets
+ Unicode (non-ASCII) password strings passed on the command
+ line. With the exception of the <option>hex-bytes</option>
+ mode, these only apply to passwords provided when encrypting
+ files. The <option>hex-bytes</option> mode also applies to
+ passwords specified for reading files. For additional
+ discussion of the supported password modes and when you might
+ want to use them, see <xref linkend="ref.unicode-passwords"/>.
+ The following modes are supported:
+ <itemizedlist>
+ <listitem>
+ <para>
+ <option>auto</option>: Automatically determine whether the
+ specified password is a properly encoded Unicode (UTF-8)
+ string, and transcode it as required by the PDF spec based
+ on the type of encryption being applied. On Windows starting
+ with version 8.4.0, and on almost all other modern
+ platforms, incoming passwords will be properly encoded in
+ UTF-8, so this is almost always what you want.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ <option>unicode</option>: Tells qpdf that the incoming
+ password is UTF-8, overriding whatever its automatic
+ detection determines. The only difference between this mode
+ and <option>auto</option> is that qpdf will fail with an
+ error message if the password is not valid UTF-8 instead of
+ falling back to <option>bytes</option> mode with a warning.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ <option>bytes</option>: Interpret the password as a literal
+ byte string. For non-Windows platforms, this is what
+ versions of qpdf prior to 8.4.0 did. For Windows platforms,
+ there is no way to specify strings of binary data on the
+ command line directly, but you can use the
+ <option>@filename</option> option to do it, in which case
+ this option forces qpdf to respect the string of bytes as
+ provided. This option will allow you to encrypt PDF files
+ with passwords that will not be usable by other readers.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ <option>hex-bytes</option>: Interpret the password as a
+ hex-encoded string. This provides a way to pass binary data
+ as a password on all platforms including Windows. As with
+ <option>bytes</option>, this option may allow creation of
+ files that can't be opened by other readers. This mode
+ affects qpdf's interpretation of passwords specified for
+ decrypting files as well as for encrypting them. It makes
+ it possible to specify strings that are encoded in some
+ manner other than the system's default encoding.
+ </para>
+ </listitem>
+ </itemizedlist>
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
<term><option>--rotate=[+|-]angle[:page-range]</option></term>
<listitem>
<para>
@@ -436,6 +627,35 @@ make
</listitem>
</varlistentry>
<varlistentry>
+ <term><option>--keep-files-open=<replaceable>[yn]</replaceable></option></term>
+ <listitem>
+ <para>
+ This option controls whether qpdf keeps individual files open
+ while merging. Prior to version 8.1.0, qpdf always kept all
+ files open, but this meant that the number of files that could
+ be merged was limited by the operating system's open file
+ limit. Version 8.1.0 opened files as they were referenced and
+ closed them after each read, but this caused a major
+ performance impact. Version 8.2.0 optimized the performance
+ but did so in a way that, for local file systems, there was a
+ small but unavoidable performance hit, but for networked file
+ systems, the performance impact could be very high. Starting
+ with version 8.2.1, the default behavior is that files are
+ kept open if no more than 200 files are specified, but that
+ the behavior can be explicitly overridden with the
+ <option>--keep-files-open</option> flag. If you are merging
+ more than 200 files but fewer than the operating system's
+ open file limit, you may want to use
+ <option>--keep-files-open=y</option>, especially if working
+ over a networked file system. If you are using a local file
+ system where the overhead is low, you might sometimes merge
+ more files than the OS limit allows from a script, and you are
+ not worried about a few seconds of additional processing time,
+ you may want to specify <option>--keep-files-open=n</option>.
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
<term><option>--pages options --</option></term>
<listitem>
<para>
@@ -446,6 +666,16 @@ make
</listitem>
</varlistentry>
<varlistentry>
+ <term><option>--collate</option></term>
+ <listitem>
+ <para>
+ When specified, collate rather than concatenate pages from
+ files specified with <option>--pages</option>. See <xref
+ linkend="ref.page-selection"/> for additional details.
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
<term><option>--split-pages=[n]</option></term>
<listitem>
<para>
@@ -518,6 +748,26 @@ make
</para>
</listitem>
</varlistentry>
+ <varlistentry>
+ <term><option>--overlay options --</option></term>
+ <listitem>
+ <para>
+ Overlay pages from another file onto the output pages. See
+ <xref linkend="ref.overlay-underlay"/> for details on
+ overlay/underlay.
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term><option>--underlay options --</option></term>
+ <listitem>
+ <para>
+ Underlay pages from another file onto the output pages. See
+ <xref linkend="ref.overlay-underlay"/> for details on
+ overlay/underlay.
+ </para>
+ </listitem>
+ </varlistentry>
</variablelist>
</para>
<para>
@@ -537,22 +787,17 @@ make
producers.
</para>
<para>
- In all cases where qpdf allows specification of a password, care
- must be taken if the password contains characters that fall
- outside of the 7-bit US-ASCII character range to ensure that the
- exact correct byte sequence is provided. It is possible that a
- future version of qpdf may handle this more gracefully. For
- example, if a password was encrypted using a password that was
- encoded in ISO-8859-1 and your terminal is configured to use
- UTF-8, the password you supply may not work properly. There are
- various approaches to handling this. For example, if you are
- using Linux and have the iconv executable installed, you could
- pass <option>--password=`echo <replaceable>password</replaceable>
- | iconv -t iso-8859-1`</option> to qpdf where
- <replaceable>password</replaceable> is a password specified in
- your terminal's locale. A detailed discussion of this is out of
- scope for this manual, but just be aware of this issue if you have
- trouble with a password that contains 8-bit characters.
+ Prior to 8.4.0, in the case of passwords that contain characters
+ that fall outside of 7-bit US-ASCII, qpdf left the burden of
+ supplying properly encoded encryption and decryption passwords to
+ the user. Starting in qpdf 8.4.0, qpdf does this automatically in
+ most cases. For an in-depth discussion, please see <xref
+ linkend="ref.unicode-passwords"/>. Previous versions of this
+ manual described workarounds using the <command>iconv</command>
+ command. Such workarounds are no longer required or recommended
+ with qpdf 8.4.0. However, for backward compatibility, qpdf
+ attempts to detect those workarounds and do the right thing in
+ most cases.
</para>
</sect1>
<sect1 id="ref.encryption-options">
@@ -624,7 +869,12 @@ make
<listitem>
<para>
Determines whether or not to allow accessibility to visually
- impaired.
+ impaired. The qpdf library disregards this field when AES is
+ used or when 256-bit encryption is used. You should really
+ never disable accessibility, but qpdf lets you do it in case
+ you need to configure a file this way for testing purposes.
+ The PDF spec says that conforming readers should disregard
+ this permission and always allow accessibility.
</para>
</listitem>
</varlistentry>
@@ -637,6 +887,45 @@ make
</listitem>
</varlistentry>
<varlistentry>
+ <term><option>--assemble=[yn]</option></term>
+ <listitem>
+ <para>
+ Determines whether document assembly (rotation and reordering
+ of pages) is allowed.
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term><option>--annotate=[yn]</option></term>
+ <listitem>
+ <para>
+ Determines whether modifying annotations is allowed. This
+ includes adding comments and filling in form fields. Also
+ allows editing of form fields if
+ <option>--modify-other=y</option> is given.
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term><option>--form=[yn]</option></term>
+ <listitem>
+ <para>
+ Determines whether filling form fields is allowed.
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term><option>--modify-other=[yn]</option></term>
+ <listitem>
+ <para>
+ Allow all document editing operations except those controlled
+ separately by the <option>--assemble</option>,
+ <option>--annotate</option>, and <option>--form</option>
+ options.
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
<term><option>--print=<replaceable>print-opt</replaceable></option></term>
<listitem>
<para>
@@ -667,10 +956,10 @@ make
<term><option>--modify=<replaceable>modify-opt</replaceable></option></term>
<listitem>
<para>
- Controls modify access.
+ Controls modify access. This way of controlling modify access
+ has less granularity than new options added in qpdf 8.4.
<option><replaceable>modify-opt</replaceable></option> may be
- one of the following, each of which implies all the options
- that follow it:
+ one of the following:
<itemizedlist>
<listitem>
<para>
@@ -679,12 +968,14 @@ make
</listitem>
<listitem>
<para>
- <option>annotate</option>: allow comment authoring and form operations
+ <option>annotate</option>: allow comment authoring, form
+ operations, and document assembly
</para>
</listitem>
<listitem>
<para>
<option>form</option>: allow form field fill-in and signing
+ and document assembly
</para>
</listitem>
<listitem>
@@ -698,6 +989,12 @@ make
</para>
</listitem>
</itemizedlist>
+ Using the <option>--modify</option> option does not allow you
+ to create certain combinations of permissions such as allowing
+ form filling but not allowing document assembly. Starting with
+ qpdf 8.4, you can either just use the other options to control
+ fields individually, or you can use something like
+ <option>--modify=form --assemble=n</option> to fine-tune.
</para>
</listitem>
</varlistentry>
@@ -794,6 +1091,11 @@ make
selection flags.
</para>
<para>
+ Starting with qpdf 8.4, the special input file name
+ &ldquo;<filename>.</filename>&rdquo; can be used as a shortcut for
+ the primary input filename.
+ </para>
+ <para>
For each file that pages should be taken from, specify the file, a
password needed to open the file (if any), and a page range. The
password needs to be given only once per file. If any of the
@@ -816,15 +1118,6 @@ make
<command>qpdf --empty out.pdf --pages *.pdf --</command>.
</para>
<para>
- It is not presently possible to specify the same page from the
- same file directly more than once, but you can make this work by
- specifying two different paths to the same file (such as by
- putting <filename>./</filename> somewhere in the path). This can
- also be used if you want to repeat a page from one of the input
- files in the output file. This may be made more convenient in a
- future version of qpdf if there is enough demand for this feature.
- </para>
- <para>
The page range is a set of numbers separated by commas, ranges of
numbers separated dashes, or combinations of those. The character
&ldquo;z&rdquo; represents the last page. A number preceded by an
@@ -864,25 +1157,56 @@ make
</itemizedlist>
</para>
<para>
- Note that qpdf doesn't presently do anything special about other
- constructs in a PDF file that may know about pages, so semantics
- of splitting and merging vary across features. For example, the
- document's outlines (bookmarks) point to actual page objects, so
- if you select some pages and not others, bookmarks that point to
- pages that are in the output file will work, and remaining
- bookmarks will not work. On the other hand, page labels (page
- numbers specified in the file) are just sequential, so page labels
- will be messed up in the output file. A future version of
- <command>qpdf</command> may do a better job at handling these
- issues. (Note that the qpdf library already contains all of the
- APIs required in order to implement this in your own application
- if you need it.) In the mean time, you can always use
- <option>--empty</option> as the primary input file to avoid
- copying all of that from the first file. For example, to take
- pages 1 through 5 from a <filename>infile.pdf</filename> while
- preserving all metadata associated with that file, you could use
+ Starting in qpdf version 8.3, you can specify the
+ <option>--collate</option> option. Note that this option is
+ specified outside of <option>--pages&nbsp;...&nbsp;--</option>.
+ When <option>--collate</option> is specified, it changes the
+ meaning of <option>--pages</option> so that the specified files,
+ as modified by page ranges, are collated rather than concatenated.
+ For example, if you have the files <filename>odd.pdf</filename> and
+ <filename>even.pdf</filename> containing odd and even pages of a
+ document respectively, you could run <command>qpdf --collate
+ odd.pdf --pages odd.pdf even.pdf -- all.pdf</command> to collate
+ the pages. This would pick page 1 from odd, page 1 from even, page
+ 2 from odd, page 2 from even, etc. until all pages have been
+ included. Any number of files and page ranges can be specified. If
+ any file has fewer pages, that file is just skipped when its pages
+ have all been included. For example, if you ran <command>qpdf
+ --collate --empty --pages a.pdf 1-5 b.pdf 6-4 c.pdf r1 --
+ out.pdf</command>, you would get the following pages in this
+ order:
+ <itemizedlist>
+ <listitem><para>a.pdf page 1</para></listitem>
+ <listitem><para>b.pdf page 6</para></listitem>
+ <listitem><para>c.pdf last page</para></listitem>
+ <listitem><para>a.pdf page 2</para></listitem>
+ <listitem><para>b.pdf page 5</para></listitem>
+ <listitem><para>a.pdf page 3</para></listitem>
+ <listitem><para>b.pdf page 4</para></listitem>
+ <listitem><para>a.pdf page 4</para></listitem>
+ <listitem><para>a.pdf page 5</para></listitem>
+ </itemizedlist>
+ </para>
+ <para>
+ Starting in qpdf version 8.3, when you split and merge files, any
+ page labels (page numbers) are preserved in the final file. It is
+ expected that future versions will preserve more features when
+ splitting and merging. In the meantime, semantics of splitting
+ and merging vary across features. For example, the document's
+ outlines (bookmarks) point to actual page objects, so if you
+ select some pages and not others, bookmarks that point to pages
+ that are in the output file will work, and remaining bookmarks
+ will not work. A future version of <command>qpdf</command> may do
+ a better job at handling these issues. (Note that the qpdf library
+ already contains all of the APIs required in order to implement
+ this in your own application if you need it.) In the meantime,
+ you can always use <option>--empty</option> as the primary input
+ file to avoid copying all of that from the first file. For
+ example, to take pages 1 through 5 from
+ <filename>infile.pdf</filename> while preserving all metadata
+ associated with that file, you could use
- <programlisting><command>qpdf</command> <option>infile.pdf --pages infile.pdf 1-5 -- outfile.pdf</option>
+ <programlisting><command>qpdf</command> <option>infile.pdf --pages . 1-5 -- outfile.pdf</option>
</programlisting>
If you wanted pages 1 through 5 from
<filename>infile.pdf</filename> but you wanted the rest of the
@@ -894,13 +1218,13 @@ make
<filename>file1.pdf</filename> and pages 11&ndash;15 from
<filename>file2.pdf</filename> in reverse, you would run
- <programlisting><command>qpdf</command> <option>file1.pdf --pages file1.pdf 1-5 file2.pdf 15-11 -- outfile.pdf</option>
+ <programlisting><command>qpdf</command> <option>file1.pdf --pages . 1-5 file2.pdf 15-11 -- outfile.pdf</option>
</programlisting>
If, for some reason, you wanted to take the first page of an
encrypted file called <filename>encrypted.pdf</filename> with
password <literal>pass</literal> and repeat it twice in an output
- file, and if you wanted to drop metadata (like page numbers and
- outlines) but preserve encryption, you would use
+ file, and if you wanted to drop document-level metadata but
+ preserve encryption, you would use
<programlisting><command>qpdf</command> <option>--empty --copy-encryption=encrypted.pdf --encryption-file-password=pass
--pages encrypted.pdf --password=pass 1 ./encrypted.pdf --password=pass 1 --
@@ -914,6 +1238,100 @@ outfile.pdf</option>
are all corner cases that most users should hopefully never have
to be bothered with.
</para>
+ <para>
+ Prior to version 8.4, it was not possible to specify the same page
+ from the same file directly more than once, and the workaround of
+ specifying the same file in more than one way was required.
+ Version 8.4 removes this limitation.
+ </para>
+ </sect1>
+ <sect1 id="ref.overlay-underlay">
+ <title>Overlay and Underlay Options</title>
+ <para>
+ Starting with qpdf 8.4, it is possible to overlay or underlay
+ pages from other files onto the output generated by qpdf. Specify
+ overlay or underlay as follows:
+
+ <programlisting>{ <option>--overlay</option> | <option>--underlay</option> } <replaceable>file</replaceable> [ <option>options</option> ] <option>--</option>
+</programlisting>
+ Overlay and underlay options are processed late, so they can be
+ combined with other operations like merging and will apply to the final
+ output. The <option>--overlay</option> and
+ <option>--underlay</option> options work the same way, except
+ underlay pages are drawn underneath the page to which they are
+ applied, possibly obscured by the original page, and overlay pages
+ are drawn on top of the page to which they are applied, possibly
+ obscuring the page. You can combine overlay and underlay.
+ </para>
+ <para>
+ The default behavior of overlay and underlay is that pages are
+ taken from the overlay/underlay file in sequence and applied to
+ corresponding pages in the output until there are no more output
+ pages. If the overlay or underlay file runs out of pages,
+ remaining output pages are left alone. This behavior can be
+ modified by options, which are provided between the
+ <option>--overlay</option> or <option>--underlay</option> flag and
+ the <option>--</option> option. The following options are
+ supported:
+ <itemizedlist>
+ <listitem>
+ <para>
+ <option>--password=password</option>: supply a password if the
+ overlay/underlay file is encrypted.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ <option>--to=page-range</option>: a range of pages, in the
+ format described in <xref linkend="ref.page-selection"/>, that
+ indicates which pages in the output should have the
+ overlay/underlay applied. If not specified, overlay/underlay
+ are applied to all pages.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ <option>--from=[page-range]</option>: a range of pages that
+ specifies which pages in the overlay/underlay file will be used
+ for overlay or underlay. If not specified, all pages will be
+ used. This can be explicitly specified to be empty if
+ <option>--repeat</option> is used.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ <option>--repeat=page-range</option>: an optional range of
+ pages that specifies which pages in the overlay/underlay file
+ will be repeated after the &ldquo;from&rdquo; pages are used
+ up. If you want to repeat a range of pages starting at the
+ beginning, you can explicitly use <option>--from=</option>.
+ </para>
+ </listitem>
+ </itemizedlist>
+ </para>
+ <para>
+ Here are some examples.
+ <itemizedlist>
+ <listitem>
+ <para>
+ <command>--overlay o.pdf --to=1-5 --from=1-3
+ --repeat=4 --</command>: overlay the first three pages from file
+ <filename>o.pdf</filename> onto the first three pages of the
+ output, then overlay page 4 from <filename>o.pdf</filename>
+ onto pages 4 and 5 of the output. Leave remaining output pages
+ untouched.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ <command>--underlay footer.pdf --from= --repeat=1,2 --</command>:
+ Underlay page 1 of <filename>footer.pdf</filename> on all odd
+ output pages, and underlay page 2 of
+ <filename>footer.pdf</filename> on all even output pages.
+ </para>
+ </listitem>
+ </itemizedlist>
+ </para>
</sect1>
<sect1 id="ref.advanced-parsing">
<title>Advanced Parsing Options</title>
@@ -1145,8 +1563,8 @@ outfile.pdf</option>
case, please report it as a bug.
</para>
<para>
- See also <option>--preserve-unreferenced-resources</option>,
- which does something completely different.
+ See also <option>--preserve-unreferenced</option>, which does
+ something completely different.
</para>
</listitem>
</varlistentry>
@@ -1200,6 +1618,210 @@ outfile.pdf</option>
</listitem>
</varlistentry>
<varlistentry>
+ <term><option>--flatten-annotations=<replaceable>option</replaceable></option></term>
+ <listitem>
+ <para>
+ This option collapses annotations into the pages' contents
+ with special handling for form fields. Ordinarily, an
+ annotation is rendered separately and on top of the page.
+ Combining annotations into the page's contents effectively
+ freezes the placement of the annotations, making them look
+ right after various page transformations. The library
+ functionality backing this option was added for the benefit of
+ programs that want to create <emphasis>n-up</emphasis> page
+ layouts and other similar things that don't work well with
+ annotations. The <replaceable>option</replaceable> parameter
+ may be any of the following:
+ <itemizedlist>
+ <listitem>
+ <para>
+ <option>all</option>: include all annotations that are not
+ marked invisible or hidden
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ <option>print</option>: only include annotations that
+ indicate that they should appear when the page is printed
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ <option>screen</option>: omit annotations that indicate
+ they should not appear on the screen
+ </para>
+ </listitem>
+ </itemizedlist>
+ </para>
+ <para>
+ Note that form fields are special because the annotations that
+ are used to render filled-in form fields may become out of
+ date with the fields' values if the form is filled in by a
+ program that doesn't know how to update the appearances. If
+ qpdf detects this case, its default behavior is not to flatten
+ those annotations because doing so would cause the value of
+ the form field to be lost. This gives you a chance to go back
+ and resave the form with a program that knows how to generate
+ appearances. QPDF itself can generate appearances with some
+ limitations. See the <option>--generate-appearances</option>
+ option below.
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term><option>--generate-appearances</option></term>
+ <listitem>
+ <para>
+ If a file contains interactive form fields and indicates that
+ the appearances are out of date with the values of the form,
+ this flag will regenerate appearances, subject to a few
+ limitations. Note that there is not usually a reason to do
+ this, but it can be necessary before using the
+ <option>--flatten-annotations</option> option. Most of the
+ limitations are not a problem with well-behaved PDF files.
+ The limitations are as follows:
+ <itemizedlist>
+ <listitem>
+ <para>
+ Radio button and checkbox appearances use the pre-set
+ values in the PDF file. QPDF just makes sure that the
+ correct appearance is displayed based on the value of the
+ field. This is fine for PDF files that create their forms
+ properly. Some PDF writers save appearances for fields when
+ they change, which could cause some controls to have
+ inconsistent appearances.
+ </para>
+ </listitem>
+ </itemizedlist>
+ <itemizedlist>
+ <listitem>
+ <para>
+ For text fields and list boxes, any characters that fall
+ outside of US-ASCII or, if detected, &ldquo;Windows
+ ANSI&rdquo; or &ldquo;Mac Roman&rdquo; encoding, will be
+ replaced by the <literal>?</literal> character.
+ </para>
+ </listitem>
+ </itemizedlist>
+ <itemizedlist>
+ <listitem>
+ <para>
+ Quadding is ignored. Quadding is used to specify whether
+ the contents of a field should be left, center, or right
+ aligned with the field.
+ </para>
+ </listitem>
+ </itemizedlist>
+ <itemizedlist>
+ <listitem>
+ <para>
+ Rich text, multi-line, and other more elaborate formatting
+ directives are ignored.
+ </para>
+ </listitem>
+ </itemizedlist>
+ <itemizedlist>
+ <listitem>
+ <para>
+ There is no support for multi-select fields or signature
+ fields.
+ </para>
+ </listitem>
+ </itemizedlist>
+ If qpdf doesn't do a good enough job with your form, use an
+ external application to save your filled-in form before
+ processing it with qpdf.
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term><option>--optimize-images</option></term>
+ <listitem>
+ <para>
+ This flag causes qpdf to recompress all images that are not
+ compressed with DCT (JPEG) using DCT compression as long as
+ doing so decreases the size in bytes of the image data and the
+ image does not fall below minimum specified dimensions. Useful
+ information is provided when used in combination with
+ <option>--verbose</option>. See also the
+ <option>--oi-min-width</option>,
+ <option>--oi-min-height</option>, and
+ <option>--oi-min-area</option> options. By default, starting
+ in qpdf 8.4, inline images are converted to regular images
+ and optimized as well. Use
+ <option>--keep-inline-images</option> to prevent inline images
+ from being included.
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term><option>--oi-min-width=<replaceable>width</replaceable></option></term>
+ <listitem>
+ <para>
+ Avoid optimizing images whose width is below the specified
+ amount. If omitted, the default is 128 pixels. Use 0 for no
+ minimum.
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term><option>--oi-min-height=<replaceable>height</replaceable></option></term>
+ <listitem>
+ <para>
+ Avoid optimizing images whose height is below the specified
+ amount. If omitted, the default is 128 pixels. Use 0 for no
+ minimum.
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term><option>--oi-min-area=<replaceable>area-in-pixels</replaceable></option></term>
+ <listitem>
+ <para>
+ Avoid optimizing images whose pixel count
+ (width&nbsp;×&nbsp;height) is below the specified amount. If
+ omitted, the default is 16,384 pixels. Use 0 for no minimum.
+ </para>
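+          <para>
+           The way the three size thresholds combine can be sketched
+           in a few lines of Python. This is an illustration only,
+           not qpdf's implementation: the function name
+           <literal>passes_size_thresholds</literal> is hypothetical,
+           and the sketch covers only the dimension checks described
+           here, not the other conditions that
+           <option>--optimize-images</option> applies.
+          </para>
+<programlisting><![CDATA[
```python
def passes_size_thresholds(width, height,
                           min_width=128, min_height=128, min_area=16384):
    """Return True if an image is large enough to be considered.

    Hypothetical helper: the defaults mirror the documented defaults of
    --oi-min-width, --oi-min-height, and --oi-min-area. A minimum of 0
    can never be undercut, so it disables that particular check.
    """
    if width < min_width:
        return False
    if height < min_height:
        return False
    if width * height < min_area:
        return False
    return True


# A 300x100 image is wide enough but not tall enough under the defaults.
print(passes_size_thresholds(300, 100))
# The same image is accepted once the height minimum is set to 0.
print(passes_size_thresholds(300, 100, min_height=0))
```
+]]></programlisting>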
+ </listitem>
+ </varlistentry>
+
+
+
+ <varlistentry>
+ <term><option>--externalize-inline-images</option></term>
+ <listitem>
+ <para>
+ Convert inline images to regular images. By default, images
+ whose data is at least 1,024 bytes are converted when this
+ option is selected. Use <option>--ii-min-bytes</option> to
+ change the size threshold. This option is implicitly selected
+ when <option>--optimize-images</option> is selected. Use
+ <option>--keep-inline-images</option> to exclude inline images
+ from image optimization.
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term><option>--ii-min-bytes=<replaceable>bytes</replaceable></option></term>
+ <listitem>
+ <para>
+ Avoid converting inline images whose size is below the
+ specified minimum size to regular images. If omitted, the
+ default is 1,024 bytes. Use 0 for no minimum.
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term><option>--keep-inline-images</option></term>
+ <listitem>
+ <para>
+ Prevent inline images from being included in image
+            optimization. This option has no effect when
+ <option>--optimize-images</option> is not specified.
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
<term><option>--qdf</option></term>
<listitem>
<para>
@@ -1468,7 +2090,7 @@ outfile.pdf</option>
</listitem>
</varlistentry>
<varlistentry>
- <term><option>--show-object=obj[,gen]</option></term>
+ <term><option>--show-object=trailer|obj[,gen]</option></term>
<listitem>
<para>
Show the contents of the given object. This is especially
@@ -1534,6 +2156,44 @@ outfile.pdf</option>
</listitem>
</varlistentry>
<varlistentry>
+ <term><option>--json</option></term>
+ <listitem>
+ <para>
+ Generate a JSON representation of the file. This is described
+            in depth in <xref linkend="ref.json"/>.
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term><option>--json-help</option></term>
+ <listitem>
+ <para>
+ Describe the format of the JSON output.
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term><option>--json-key=key</option></term>
+ <listitem>
+ <para>
+ This option is repeatable. If specified, only top-level keys
+ specified will be included in the JSON output. If not
+ specified, all keys will be shown.
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term><option>--json-object=trailer|obj[,gen]</option></term>
+ <listitem>
+ <para>
+ This option is repeatable. If specified, only specified
+ objects will be shown in the
+ &ldquo;<literal>objects</literal>&rdquo; key of the JSON
+ output. If absent, all objects will be shown.
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
<term><option>--check</option></term>
<listitem>
<para>
@@ -1576,6 +2236,121 @@ outfile.pdf</option>
content stream, in which case it will produce unusable results.
</para>
</sect1>
+ <sect1 id="ref.unicode-passwords">
+ <title>Unicode Passwords</title>
+ <para>
+ At the library API level, all methods that perform encryption and
+ decryption interpret passwords as strings of bytes. It is up to
+ the caller to ensure that they are appropriately encoded. Starting
+ with qpdf version 8.4.0, qpdf will attempt to make this easier for
+    you when you interact with qpdf via its command-line interface. The
+ PDF specification requires passwords used to encrypt files with
+ 40-bit or 128-bit encryption to be encoded with PDF Doc encoding.
+ This encoding is a single-byte encoding that supports ISO-Latin-1
+ and a handful of other commonly used characters. It has a large
+ overlap with Windows ANSI but is not exactly the same. There is
+ generally not a way to provide PDF Doc encoded strings on the
+ command line. As such, qpdf versions prior to 8.4.0 would often
+ create PDF files that couldn't be opened with other software when
+ given a password with non-ASCII characters to encrypt a file with
+ 40-bit or 128-bit encryption. Starting with qpdf 8.4.0, qpdf
+ recognizes the encoding of the parameter and transcodes it as
+ needed. The rest of this section provides the details about
+ exactly how qpdf behaves. Most users will not need to know this
+ information, but it might be useful if you have been working
+ around qpdf's old behavior or if you are using qpdf to generate
+ encrypted files for testing other PDF software.
+ </para>
+ <para>
+ A note about Windows: when qpdf builds, it attempts to determine
+ what it has to do to use <function>wmain</function> instead of
+ <function>main</function> on Windows. The
+ <function>wmain</function> function is an alternative entry point
+ that receives all arguments as UTF-16-encoded strings. When qpdf
+ starts up this way, it converts all the strings to UTF-8 encoding
+ and then invokes the regular main. This means that, as far as qpdf
+ is concerned, it receives its command-line arguments with UTF-8
+ encoding, just as it would in any modern Linux or UNIX
+ environment.
+ </para>
+ <para>
+ If a file is being encrypted with 40-bit or 128-bit encryption and
+ the supplied password is not a valid UTF-8 string, qpdf will fall
+ back to the behavior of interpreting the password as a string of
+ bytes. If you have old scripts that encrypt files by passing the
+ output of <command>iconv</command> to qpdf, you no longer need to
+ do that, but if you do, qpdf should still work. The only exception
+ would be for the extremely unlikely case of a password that is
+ encoded with a single-byte encoding but also happens to be valid
+ UTF-8. Such a password would contain strings of even numbers of
+ characters that alternate between accented letters and symbols. In
+ the extremely unlikely event that you are intentionally using such
+ passwords and qpdf is thwarting you by interpreting them as UTF-8,
+ you can use <option>--password-mode=bytes</option> to suppress
+ qpdf's automatic behavior.
+ </para>
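+   <para>
+    The ambiguous case described above can be seen with a few bytes.
+    The following Python sketch is an illustration and not qpdf's own
+    code: the helper name <literal>is_valid_utf8</literal> is
+    hypothetical, and ISO-Latin-1 stands in for whichever single-byte
+    encoding an older script might have used.
+   </para>
+<programlisting><![CDATA[
```python
def is_valid_utf8(raw: bytes) -> bool:
    """Return True if raw decodes cleanly as UTF-8 (hypothetical helper)."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# An accented letter in a single-byte encoding is one high-bit byte
# (0xE9 here). On its own that is not valid UTF-8, so such a password
# can only be interpreted as a string of bytes.
single_byte = "caf\u00e9".encode("latin-1")
print(is_valid_utf8(single_byte))

# The byte pair 0xC3 0xA9 is one accented letter in UTF-8, but read as
# ISO-Latin-1 it is an accented letter followed by a symbol. This is the
# rare case in which a single-byte password is also valid UTF-8.
ambiguous = b"\xc3\xa9"
print(is_valid_utf8(ambiguous), len(ambiguous.decode("latin-1")))
```
+]]></programlisting>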
+ <para>
+ The <option>--password-mode</option> option, as described earlier
+ in this chapter, can be used to change qpdf's interpretation of
+ supplied passwords. There are very few reasons to use this option.
+ One would be the unlikely case described in the previous paragraph
+ in which the supplied password happens to be valid UTF-8 but isn't
+ supposed to be UTF-8. Your best bet would be just to provide the
+ password as a valid UTF-8 string, but you could also use
+ <option>--password-mode=bytes</option>. Another reason to use
+ <option>--password-mode=bytes</option> would be to intentionally
+ generate PDF files encrypted with passwords that are not properly
+ encoded. The qpdf test suite does this to generate invalid files
+ for the purpose of testing its password recovery capability. If
+    you were trying to create intentionally incorrect files for
+    similar purposes, the <option>bytes</option> password mode can
+    enable you to do this.
+ </para>
+ <para>
+ When qpdf attempts to decrypt a file with a password that contains
+ non-ASCII characters, it will generate a list of alternative
+ passwords by attempting to interpret the password as each of a
+ handful of different coding systems and then transcode them to the
+ required format. This helps to compensate for the supplied
+ password being given in the wrong coding system, such as would
+ happen if you used the <command>iconv</command> workaround that
+ was previously needed. It also generates passwords by doing the
+    reverse operation: translating from correct to incorrect encoding
+ of the password. This would enable qpdf to decrypt files using
+ passwords that were improperly encoded by whatever software
+ encrypted the files, including older versions of qpdf invoked
+ without properly encoded passwords. The combination of these two
+ recovery methods should make qpdf transparently open most
+ encrypted files with the password supplied correctly but in the
+ wrong coding system. There are no real downsides to this behavior,
+ but if you don't want qpdf to do this, you can use the
+ <option>--suppress-password-recovery</option> option. One reason
+ to do that is to ensure that you know the exact password that was
+ used to encrypt the file.
+ </para>
+ <para>
+ With these changes, qpdf now generates compliant passwords in most
+ cases. There are still some exceptions. In particular, the PDF
+ specification directs compliant writers to normalize Unicode
+ passwords and to perform certain transformations on passwords with
+ bidirectional text. Implementing this functionality requires using
+ a real Unicode library like ICU. If a client application that uses
+ qpdf wants to do this, the qpdf library will accept the resulting
+ passwords, but qpdf will not perform these transformations itself.
+ It is possible that this will be addressed in a future version of
+ qpdf. The <classname>QPDFWriter</classname> methods that enable
+ encryption on the output file accept passwords as strings of
+ bytes.
+ </para>
+ <para>
+ Please note that the <option>--password-is-hex-key</option> option
+ is unrelated to all this. This flag bypasses the normal process of
+ going from password to encryption string entirely, allowing the
+ raw encryption key to be specified directly. This is useful for
+ forensic purposes or for brute-force recovery of files with
+ unknown passwords.
+ </para>
+ </sect1>
</chapter>
<chapter id="ref.qdf">
<title>QDF Mode</title>
@@ -1730,6 +2505,8 @@ outfile.pdf</option>
</chapter>
<chapter id="ref.using-library">
<title>Using the QPDF Library</title>
+ <sect1 id="ref.using.from-cxx">
+ <title>Using QPDF from C++</title>
<para>
The source tree for the qpdf package has an
<filename>examples</filename> directory that contains a few
@@ -1761,6 +2538,291 @@ outfile.pdf</option>
time. Multiple threads may simultaneously work with different
instances of these and all other QPDF objects.
</para>
+ </sect1>
+ <sect1 id="ref.using.other-languages">
+ <title>Using QPDF from other languages</title>
+ <para>
+ The qpdf library is implemented in C++, which makes it hard to use
+ directly in other languages. There are a few things that can help.
+ </para>
+ <variablelist>
+ <varlistentry>
+ <term>&ldquo;C&rdquo;</term>
+ <listitem>
+ <para>
+ The qpdf library includes a &ldquo;C&rdquo; language interface
+ that provides a subset of the overall capabilities. The header
+ file <filename>qpdf/qpdf-c.h</filename> includes information
+ about its use. As long as you use a C++ linker, you can link C
+ programs with qpdf and use the C API. For languages that can
+ directly load methods from a shared library, the C API can also
+ be useful. People have reported success using the C API from
+ other languages on Windows by directly calling functions in the
+ DLL.
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term>Python</term>
+ <listitem>
+ <para>
+ A Python module called <ulink
+ url="https://pypi.org/project/pikepdf/">pikepdf</ulink>
+ provides a clean and highly functional set of Python bindings
+ to the qpdf library. Using pikepdf, you can work with PDF files
+ in a natural way and combine qpdf's capabilities with other
+ functionality provided by Python's rich standard library and
+ available modules.
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term>Other Languages</term>
+ <listitem>
+ <para>
+ Starting with version 8.3.0, the <command>qpdf</command>
+ command-line tool can produce a JSON representation of the PDF
+ file's non-content data. This can facilitate interacting
+ programmatically with PDF files through qpdf's command line
+ interface. For more information, please see <xref
+ linkend="ref.json"/>.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </sect1>
+ </chapter>
+ <chapter id="ref.json">
+ <title>QPDF JSON</title>
+ <sect1 id="ref.json-overview">
+ <title>Overview</title>
+ <para>
+ Beginning with qpdf version 8.3.0, the <command>qpdf</command>
+ command-line program can produce a JSON representation of the
+ non-content data in a PDF file. It includes a dump in JSON format
+ of all objects in the PDF file excluding the content of streams.
+ This JSON representation makes it very easy to look in detail at
+ the structure of a given PDF file, and it also provides a great way
+   to work with PDF files programmatically from the command line in
+ languages that can't call or link with the qpdf library directly.
+ Note that stream data can be extracted from PDF files using other
+ qpdf command-line options.
+ </para>
+ </sect1>
+ <sect1 id="ref.json-guarantees">
+ <title>JSON Guarantees</title>
+ <para>
+ The qpdf JSON representation includes a JSON serialization of the
+ raw objects in the PDF file as well as some computed information in
+ a more easily extracted format. QPDF provides some guarantees about
+ its JSON format. These guarantees are designed to simplify the
+ experience of a developer working with the JSON format.
+ <variablelist>
+ <varlistentry>
+ <term>Compatibility</term>
+ <listitem>
+ <para>
+ The top-level JSON object output is a dictionary. The JSON
+ output contains various nested dictionaries and arrays. With
+ the exception of dictionaries that are populated by the fields
+ of objects from the file, all instances of a dictionary are
+ guaranteed to have exactly the same keys. Future versions of
+ qpdf are free to add additional keys but not to remove keys or
+ change the type of object that a key points to. The qpdf
+ program validates this guarantee, and in the unlikely event
+ that a bug in qpdf should cause it to generate data that
+ doesn't conform to this rule, it will ask you to file a bug
+ report.
+ </para>
+ <para>
+ The top-level JSON structure contains a
+ &ldquo;<literal>version</literal>&rdquo; key whose value is
+         a simple integer. The value of the <literal>version</literal> key
+ will be incremented if a non-compatible change is made. A
+ non-compatible change would be any change that involves removal
+ of a key, a change to the format of data pointed to by a key,
+ or a semantic change that requires a different interpretation
+ of a previously existing key. A strong effort will be made to
+ avoid breaking compatibility.
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term>Documentation</term>
+ <listitem>
+ <para>
+ The <command>qpdf</command> command can be invoked with the
+ <option>--json-help</option> option. This will output a JSON
+ structure that has the same structure as the JSON output that
+ qpdf generates, except that each field in the help output is a
+ description of the corresponding field in the JSON output. The
+ specific guarantees are as follows:
+ <itemizedlist>
+ <listitem>
+ <para>
+ A dictionary in the help output means that the corresponding
+ location in the actual JSON output is also a dictionary with
+ exactly the same keys; that is, no keys present in help are
+ absent in the real output, and no keys will be present in
+ the real output that are not in help.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ A string in the help output is a description of the item
+ that appears in the corresponding location of the actual
+ output. The corresponding output can have any format.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ An array in the help output always contains a single
+ element. It indicates that the corresponding location in the
+ actual output is also an array, and that each element of the
+ array has whatever format is implied by the single element
+ of the help output's array.
+ </para>
+ </listitem>
+ </itemizedlist>
+         For example, the help output includes a
+ &ldquo;<literal>pagelabels</literal>&rdquo; key whose value is
+ an array of one element. That element is a dictionary with keys
+ &ldquo;<literal>index</literal>&rdquo; and
+ &ldquo;<literal>label</literal>&rdquo;. In addition to
+ describing the meaning of those keys, this tells you that the
+ actual JSON output will contain a <literal>pagelabels</literal>
+ array, each of whose elements is a dictionary that contains an
+ <literal>index</literal> key, a <literal>label</literal> key,
+ and no other keys.
+ </para>
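+      <para>
+       The three rules above can be expressed as a small recursive
+       check. The Python sketch below is one possible reading of
+       those rules, not the validation code that qpdf itself uses.
+       The function name <literal>conforms</literal> and the sample
+       data are hypothetical, and the sketch does not handle
+       dictionaries that are populated by the fields of objects from
+       the file, which are exempt from the fixed-key guarantee.
+      </para>
+<programlisting><![CDATA[
```python
def conforms(help_node, actual):
    """Check actual JSON data against the matching piece of help output."""
    if isinstance(help_node, dict):
        # Rule 1: a dictionary must have exactly the same keys.
        return (
            isinstance(actual, dict)
            and set(actual) == set(help_node)
            and all(conforms(help_node[k], actual[k]) for k in help_node)
        )
    if isinstance(help_node, list):
        # Rule 3: the single help element describes every array element.
        return isinstance(actual, list) and all(
            conforms(help_node[0], item) for item in actual
        )
    # Rule 2: a string is only a description; any value is acceptable.
    return True


# Made-up fragments shaped like the pagelabels example above.
help_fragment = {"pagelabels": [{"index": "description", "label": "description"}]}
good = {"pagelabels": [{"index": 0, "label": "some label data"}]}
bad = {"pagelabels": [{"index": 0}]}  # the "label" key is missing

print(conforms(help_fragment, good), conforms(help_fragment, bad))
```
+]]></programlisting>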
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term>Directness and Simplicity</term>
+ <listitem>
+ <para>
+ The JSON output contains the value of every object in the file,
+ but it also contains some processed data. This is analogous to
+ how qpdf's library interface works. The processed data is
+ similar to the helper functions in that it allows you to look
+ at certain aspects of the PDF file without having to understand
+ all the nuances of the PDF specification, while the raw objects
+ allow you to mine the PDF for anything that the higher-level
+ interfaces are lacking.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
+ </sect1>
+ <sect1 id="json.limitations">
+ <title>Limitations of JSON Representation</title>
+ <para>
+ There are a few limitations to be aware of with the JSON structure:
+ <itemizedlist>
+ <listitem>
+ <para>
+ Strings, names, and indirect object references in the original
+ PDF file are all converted to strings in the JSON
+ representation. In the case of a &ldquo;normal&rdquo; PDF file,
+ you can tell the difference because a name starts with a slash
+ (<literal>/</literal>), and an indirect object reference looks
+ like <literal>n n R</literal>, but if there were to be a string
+ that looked like a name or indirect object reference, there
+ would be no way to tell this from the JSON output. Note that
+ there are certain cases where you know for sure what something
+ is, such as knowing that dictionary keys in objects are always
+ names and that certain things in the higher-level computed data
+ are known to contain indirect object references.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ The JSON format doesn't support binary data very well. Mostly
+ the details are not important, but they are presented here for
+ information. When qpdf outputs a string in the JSON
+ representation, it converts the string to UTF-8, assuming usual
+ PDF string semantics. Specifically, if the original string is
+ UTF-16, it is converted to UTF-8. Otherwise, it is assumed to
+ have PDF doc encoding, and is converted to UTF-8 with that
+ assumption. This causes strange things to happen to binary
+ strings. For example, if you had the binary string
+ <literal>&lt;038051&gt;</literal>, this would be output to the
+ JSON as <literal>\u0003•Q</literal> because
+ <literal>03</literal> is not a printable character and
+ <literal>80</literal> is the bullet character in PDF doc
+ encoding and is mapped to the Unicode value
+ <literal>2022</literal>. Since <literal>51</literal> is
+ <literal>Q</literal>, it is output as is. If you wanted to
+       convert back from here to a binary string, you would have to
+ recognize Unicode values whose code points are higher than
+ <literal>0xFF</literal> and map those back to their
+ corresponding PDF doc encoding characters. There is no way to
+ tell the difference between a Unicode string that was originally
+       encoded as UTF-16 and one that was converted from PDF doc
+ encoding. In other words, it's best if you don't try to use the
+ JSON format to extract binary strings from the PDF file, but if
+ you really had to, it could be done. Note that qpdf's
+ <option>--show-object</option> option does not have this
+ limitation and will reveal the string as encoded in the original
+ file.
+ </para>
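+     <para>
+      The round trip for the example string can be sketched in Python
+      as follows. This is an illustration under stated assumptions
+      rather than qpdf's conversion code: the table contains only the
+      single mapping mentioned above (<literal>80</literal> to
+      <literal>2022</literal>), every other byte is passed through as
+      if its code point equaled the byte value, which is a
+      simplification, and both function names are hypothetical.
+     </para>
+<programlisting><![CDATA[
```python
import json

# Partial table holding only the mapping named in the text. A real
# converter would need the complete PDF doc encoding table.
PDFDOC_TO_UNICODE = {0x80: 0x2022}  # bullet
UNICODE_TO_PDFDOC = {v: k for k, v in PDFDOC_TO_UNICODE.items()}


def pdfdoc_bytes_to_text(raw: bytes) -> str:
    """Map each byte to a Unicode character (simplified, see above)."""
    return "".join(chr(PDFDOC_TO_UNICODE.get(b, b)) for b in raw)


def text_to_pdfdoc_bytes(text: str) -> bytes:
    """Reverse the mapping; raises ValueError for unmapped characters
    whose code point is above 0xFF."""
    return bytes(UNICODE_TO_PDFDOC.get(ord(ch), ord(ch)) for ch in text)


original = bytes.fromhex("038051")
as_text = pdfdoc_bytes_to_text(original)

# Serialize the way a JSON writer would, keeping non-ASCII characters.
print(json.dumps(as_text, ensure_ascii=False))
# Mapping the high code point back recovers the original bytes.
print(text_to_pdfdoc_bytes(as_text) == original)
```
+]]></programlisting>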
+ </listitem>
+ </itemizedlist>
+ </para>
+ </sect1>
+ <sect1 id="json.considerations">
+ <title>JSON: Special Considerations</title>
+ <para>
+ For the most part, the built-in JSON help tells you everything you
+ need to know about the JSON format, but there are a few
+ non-obvious things to be aware of:
+ <itemizedlist>
+ <listitem>
+ <para>
+ While qpdf guarantees that keys present in the help will be
+ present in the output, those fields may be null or empty if the
+ information is not known or absent in the file. Also, if you
+       specify <option>--json-key</option>, the keys that are not
+ listed will be excluded entirely except for those that
+ <option>--json-help</option> says are always present.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ In a few places, there are keys with names containing
+ <literal>pageposfrom1</literal>. The values of these keys are
+ null or an integer. If an integer, they point to a page index
+ within the file numbering from 1. Note that JSON indexes from
+ 0, and you would also use 0-based indexing using the API.
+ However, 1-based indexing is easier in this case because the
+ command-line syntax for specifying page ranges is 1-based. If
+ you were going to write a program that looked through the JSON
+ for information about specific pages and then use the
+ command-line to extract those pages, 1-based indexing is
+       easier. Besides, it's more convenient to subtract 1 in a
+       program written in a real programming language than it is to
+       add 1 in shell code.
+ </para>
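+     <para>
+      As a sketch of that workflow, the Python fragment below turns a
+      1-based page position into a command line and, separately, into
+      a 0-based index. The function name
+      <literal>extract_command</literal>, the file names, and the
+      sample dictionary are hypothetical; only the fact that the value
+      is null or a 1-based integer comes from the description above.
+      The <literal>.</literal> shortcut for the primary input file
+      assumes qpdf 8.4.0 or newer.
+     </para>
+<programlisting><![CDATA[
```python
def extract_command(infile, outfile, pageposfrom1):
    """Build a qpdf command that extracts one page.

    The value is already 1-based, so it can be used directly as a page
    range on the command line without adjustment.
    """
    return ["qpdf", infile, "--pages", ".", str(pageposfrom1), "--", outfile]


# Hypothetical fragment of parsed JSON output; the value may be None
# when the JSON contains null.
entry = {"pageposfrom1": 3}

pos = entry["pageposfrom1"]
if pos is not None:
    print(" ".join(extract_command("in.pdf", "out.pdf", pos)))
    # Subtract 1 only when a 0-based index is needed, for example to
    # index into a list of pages held by the program.
    zero_based = pos - 1
    print(zero_based)
```
+]]></programlisting>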
+ </listitem>
+ <listitem>
+ <para>
+ The image information included in the <literal>page</literal>
+ section of the JSON output includes the key
+ &ldquo;<literal>filterable</literal>&rdquo;. Note that the
+ value of this field may depend on the
+ <option>--decode-level</option> that you invoke qpdf with. The
+ JSON output includes a top-level key
+ &ldquo;<literal>parameters</literal>&rdquo; that indicates the
+ decode level used for computing whether a stream was
+       filterable. For example, JPEG images will be shown as not
+ filterable by default, but they will be shown as filterable if
+ you run <command>qpdf --json --decode-level=all</command>.
+ </para>
+ </listitem>
+ </itemizedlist>
+ </para>
+ </sect1>
</chapter>
<chapter id="ref.design">
<title>Design and Library Notes</title>
@@ -3240,6 +4302,830 @@ print "\n";
</para>
<variablelist>
<varlistentry>
+ <term>8.4.0: February 1, 2019</term>
+ <listitem>
+ <itemizedlist>
+ <listitem>
+ <para>
+ Command-line Enhancements
+ </para>
+ <itemizedlist>
+ <listitem>
+ <para>
+ <emphasis>Non-compatible CLI change:</emphasis> The qpdf
+ command-line tool interprets passwords given at the
+ command-line differently from previous releases when the
+ passwords contain non-ASCII characters. In some cases, the
+ behavior differs from previous releases. For a discussion of
+ the current behavior, please see <xref
+ linkend="ref.unicode-passwords"/>. The incompatibilities are
+ as follows:
+ <itemizedlist>
+ <listitem>
+ <para>
+ On Windows, qpdf now receives all command-line options as
+ Unicode strings if it can figure out the appropriate
+ compile/link options. This is enabled at least for MSVC
+ and mingw builds. That means that if non-ASCII strings
+ are passed to the qpdf CLI in Windows, qpdf will now
+ correctly receive them. In the past, they would have
+ either been encoded as Windows code page 1252 (also known
+               as &ldquo;Windows ANSI&rdquo;) or as something
+ unintelligible. In almost all cases, qpdf is able to
+ properly interpret Unicode arguments now, whereas in the
+ past, it would almost never interpret them properly. The
+ result is that non-ASCII passwords given to the qpdf CLI
+ on Windows now have a much greater chance of creating PDF
+ files that can be opened by a variety of readers. In the
+ past, usually files encrypted from the Windows CLI using
+ non-ASCII passwords would not be readable by most
+ viewers. Note that the current version of qpdf is able to
+ decrypt files that it previously created using the
+ previously supplied password.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ The PDF specification requires passwords to be encoded as
+ UTF-8 for 256-bit encryption and with PDF Doc encoding
+ for 40-bit or 128-bit encryption. Older versions of qpdf
+ left it up to the user to provide passwords with the
+ correct encoding. The qpdf CLI now detects when a
+ password is given with UTF-8 encoding and automatically
+ transcodes it to what the PDF spec requires. While this
+ is almost always the correct behavior, it is possible to
+ override the behavior if there is some reason to do so.
+ This is discussed in more depth in <xref
+ linkend="ref.unicode-passwords"/>.
+ </para>
+ </listitem>
+ </itemizedlist>
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ New options <option>--externalize-inline-images</option>,
+ <option>--ii-min-bytes</option>, and
+ <option>--keep-inline-images</option> control qpdf's
+ handling of inline images and possible conversion of them to
+ regular images. By default,
+ <option>--optimize-images</option> now also applies to
+ inline images. These options are discussed in <xref
+ linkend="ref.advanced-transformation"/>.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Add options <option>--overlay</option> and
+ <option>--underlay</option> for overlaying or underlaying
+ pages of other files onto output pages. See <xref
+ linkend="ref.overlay-underlay"/> for details.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ When opening an encrypted file with a password, if the
+ specified password doesn't work and the password contains
+ any non-ASCII characters, qpdf will try a number of
+ alternative passwords to try to compensate for possible
+ character encoding errors. This behavior can be suppressed
+ with the <option>--suppress-password-recovery</option>
+ option. See <xref linkend="ref.unicode-passwords"/> for a
+ full discussion.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Add the <option>--password-mode</option> option to fine-tune
+ how qpdf interprets password arguments, especially when they
+ contain non-ASCII characters. See <xref
+ linkend="ref.unicode-passwords"/> for more information.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ In the <option>--pages</option> option, it is now possible
+ to copy the same page more than once from the same file
+ without using the previous workaround of specifying two
+ different paths to the same file.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ In the <option>--pages</option> option, allow use of
+ &ldquo;.&rdquo; as a shortcut for the primary input file.
+ That way, you can do <command>qpdf in.pdf --pages . 1-2 --
+ out.pdf</command> instead of having to repeat
+ <filename>in.pdf</filename> in the command.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ When encrypting with 128-bit and 256-bit encryption, new
+ encryption options <option>--assemble</option>,
+ <option>--annotate</option>, <option>--form</option>, and
+             <option>--modify-other</option> allow finer-grained
+             control over permissions. Before, the
+ <option>--modify</option> option only configured certain
+ predefined groups of permissions.
+ </para>
+ </listitem>
+ </itemizedlist>
+ </listitem>
+ <listitem>
+ <para>
+ Bug Fixes and Enhancements
+ </para>
+ <itemizedlist>
+ <listitem>
+ <para>
+ <emphasis>Potential data-loss bug:</emphasis> Versions of
+ qpdf between 8.1.0 and 8.3.0 had a bug that could cause page
+ splitting and merging operations to drop some font or image
+ resources if the PDF file's internal structure shared these
+ resource lists across pages and if some but not all of the
+ pages in the output did not reference all the fonts and
+ images. Using the
+ <option>--preserve-unreferenced-resources</option> option
+ would work around the incorrect behavior. This bug was the
+ result of a typo in the code and a deficiency in the test
+ suite. The case that triggered the error was known, just not
+ handled properly. This case is now exercised in qpdf's test
+ suite and properly handled.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ When optimizing images, detect and refuse to optimize
+ images that can't be converted to JPEG because of bit depth
+ or color space.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Linearization and page manipulation APIs now detect and
+ recover from files that have duplicate Page objects in the
+ pages tree.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+             When the older option
+             <option>--stream-data=compress</option> was used with
+             object streams, object streams and xref streams were not
+             compressed. This has been fixed.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ When the tokenizer returns inline image tokens, delimiters
+ following <literal>ID</literal> and <literal>EI</literal>
+ operators are no longer excluded. This makes it possible to
+ reliably extract the actual image data.
+ </para>
+ </listitem>
+ </itemizedlist>
+ </listitem>
+ <listitem>
+ <para>
+ Library Enhancements
+ </para>
+ <itemizedlist>
+ <listitem>
+ <para>
+ Add method
+ <function>QPDFPageObjectHelper::externalizeInlineImages</function>
+ to convert inline images to regular images.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Add method
+ <function>QUtil::possible_repaired_encodings()</function> to
+ generate a list of strings that represent other ways the
+ given string could have been encoded. This is the method the
+ QPDF CLI uses to generate the strings it tries when
+ recovering incorrectly encoded Unicode passwords.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Add new versions of
+ <function>QPDFWriter::setR{3,4,5,6}EncryptionParameters</function>
+ that allow more granular setting of permissions bits. See
+ <filename>QPDFWriter.hh</filename> for details.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Add new versions of the transcoders from UTF-8 to
+ single-byte coding systems in <classname>QUtil</classname>
+ that report success or failure rather than just substituting
+ a specified unknown character.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Add method <function>QUtil::analyze_encoding()</function> to
+             determine whether a string has high-bit characters and
+             whether it appears to be UTF-16 or valid UTF-8.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Add new method
+ <function>QPDFPageObjectHelper::shallowCopyPage()</function>
+             to create a new page that is a &ldquo;shallow copy&rdquo; of a
+ page. The resulting object is an indirect object ready to be
+ passed to
+ <function>QPDFPageDocumentHelper::addPage()</function> for
+ either the original <classname>QPDF</classname> object or a
+ different one. This is what the <command>qpdf</command>
+ command-line tool uses to copy the same page multiple times
+ from the same file during splitting and merging operations.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Add method <function>QPDF::getUniqueId()</function>, which
+ returns a unique identifier for the given QPDF object. The
+ identifier will be unique across the life of the
+ application. The returned value can be safely used as a map
+ key.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Add method <function>QPDF::setImmediateCopyFrom</function>.
+ This further enhances qpdf's ability to allow a
+ <classname>QPDF</classname> object from which objects are
+ being copied to go out of scope before the destination
+ object is written. If you call this method on a
+             <classname>QPDF</classname> instance, objects copied
+ <emphasis>from</emphasis> this instance will be copied
+ immediately instead of lazily. This option uses more memory
+ but allows the source object to go out of scope before the
+ destination object is written in all cases. See comments in
+ <filename>QPDF.hh</filename> for details.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Add method
+ <function>QPDFPageObjectHelper::getAttribute</function> for
+ retrieving an attribute from the page dictionary taking
+ inheritance into consideration, and optionally making a copy
+ if your intention is to modify the attribute.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Fix long-standing limitation of
+ <function>QPDFPageObjectHelper::getPageImages</function> so
+ that it now properly reports images from inherited resource
+ dictionaries, eliminating the need to call
+ <function>QPDFPageDocumentHelper::pushInheritedAttributesToPage</function>
+ in this case.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Add method
+ <function>QPDFObjectHandle::getUniqueResourceName</function>
+ for finding an unused name in a resource dictionary.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Add method
+ <function>QPDFPageObjectHelper::getFormXObjectForPage</function>
+ for generating a form XObject equivalent to a page. The
+ resulting object can be used in the same file or copied to
+ another file with <function>copyForeignObject</function>.
+ This can be useful for implementing underlay, overlay, n-up,
+ thumbnails, or any other functionality requiring replication
+ of pages in other contexts.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Add method
+ <function>QPDFPageObjectHelper::placeFormXObject</function>
+ for generating content stream text that places a given form
+ XObject on a page, centered and fit within a specified
+ rectangle. This method takes care of computing the proper
+ transformation matrix and may optionally compensate for
+ rotation or scaling of the destination page.
+ </para>
+ </listitem>
+ </itemizedlist>
+ </listitem>
+ <listitem>
+ <para>
+ Build Improvements
+ </para>
+ <itemizedlist>
+ <listitem>
+ <para>
+ Add new configure option
+ <option>--enable-avoid-windows-handle</option>, which causes
+ the preprocessor symbol
+ <literal>AVOID_WINDOWS_HANDLE</literal> to be defined. When
+ defined, qpdf will avoid referencing the Windows
+ <classname>HANDLE</classname> type, which is disallowed with
+ certain versions of the Windows SDK.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ For Windows builds, attempt to determine what options, if
+ any, have to be passed to the compiler and linker to enable
+ use of <function>wmain</function>. This causes the
+ preprocessor symbol <literal>WINDOWS_WMAIN</literal> to be
+ defined. If you do your own builds with other compilers, you
+ can define this symbol to cause <function>wmain</function>
+ to be used. This is needed to allow the Windows
+ <command>qpdf</command> command to receive Unicode
+ command-line options.
+ </para>
+ </listitem>
+ </itemizedlist>
+ </listitem>
+ </itemizedlist>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term>8.3.0: January 7, 2019</term>
+ <listitem>
+ <itemizedlist>
+ <listitem>
+ <para>
+ Command-line Enhancements
+ </para>
+ <itemizedlist>
+ <listitem>
+ <para>
+ Shell completion: you can now use <command>eval $(qpdf
+ --completion-bash)</command> and <command>eval $(qpdf
+ --completion-zsh)</command> to enable shell completion for
+ bash and zsh, respectively.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Page numbers (also known as page labels) are now preserved
+ when merging and splitting files with the
+ <option>--pages</option> and <option>--split-pages</option>
+ options.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Bookmarks are partially preserved when splitting pages with
+ the <option>--split-pages</option> option. Specifically, the
+ outlines dictionary and some supporting metadata are copied
+ into the split files. The result is that all bookmarks from
+ the original file appear, those that point to pages that are
+ preserved work, and those that point to pages that are not
+ preserved don't do anything. This is an interim step toward
+ proper support for bookmarks in splitting and merging
+ operations.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Page collation: add new option <option>--collate</option>.
+ When specified, the semantics of <option>--pages</option>
+ change from concatenation to collation. See <xref
+ linkend="ref.page-selection"/> for examples and discussion.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Generation of information in JSON format, primarily to
+ facilitate use of qpdf from languages other than C++. Add
+ new options <option>--json</option>,
+ <option>--json-key</option>, and
+ <option>--json-object</option> to generate a JSON
+ representation of the PDF file. Run <command>qpdf
+ --json-help</command> to get a description of the JSON
+ format. For more information, see <xref linkend="ref.json"/>.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ The <option>--generate-appearances</option> flag will cause
+ qpdf to generate appearances for form fields if the PDF file
+ indicates that form field appearances are out of date. This
+ can happen when PDF forms are filled in by a program that
+ doesn't know how to regenerate the appearances of the
+ filled-in fields.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ The <option>--flatten-annotations</option> flag can be used
+ to <emphasis>flatten</emphasis> annotations, including form
+ fields. Ordinarily, annotations are drawn separately from
+ the page. Flattening annotations is the process of combining
+ their appearances into the page's contents. You might want
+ to do this if you are going to rotate or combine pages using
+ a tool that doesn't understand annotations. You may
+ also want to use <option>--generate-appearances</option>
+ when using this flag since annotations for outdated form
+ fields are not flattened as that would cause loss of
+ information.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ The <option>--optimize-images</option> flag tells qpdf to
+ recompress every image using DCT (JPEG) compression as
+ long as the image is not already compressed with lossy
+ compression and recompressing the image reduces its size.
+ The additional options <option>--oi-min-width</option>,
+ <option>--oi-min-height</option>, and
+ <option>--oi-min-area</option> prevent recompression of
+ images whose width, height, or pixel area
+ (width&nbsp;&#xd7;&nbsp;height) is below a specified
+ threshold.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ The <option>--show-object</option> option can now be given
+ as <option>--show-object=trailer</option> to show the
+ trailer dictionary.
+ </para>
+ </listitem>
+ </itemizedlist>
+ </listitem>
+ <listitem>
+ <para>
+ Bug Fixes and Enhancements
+ </para>
+ <itemizedlist>
+ <listitem>
+ <para>
+ QPDF now automatically detects and recovers from dangling
+ references. Previously, if a PDF file contained an indirect
+ reference to a non-existent object (which is valid), adding
+ a new object to the file could cause the new object to take
+ the object ID of the dangling reference, thereby causing the
+ dangling reference to point to the new object. This case is
+ now prevented.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Fixes to form field setting code: strings are always written
+ in UTF-16 format, and checkboxes and radio buttons are
+ handled properly with respect to synchronization of values
+ and appearance states.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ The <function>QPDF::checkLinearization()</function> method no
+ longer causes the program to crash when it detects problems
+ with linearization data. Instead, it issues a normal warning
+ or error.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Ordinarily qpdf treats an argument of the form
+ <option>@file</option> to mean that command-line options
+ should be read from <filename>file</filename>. Now, if
+ <filename>file</filename> does not exist but
+ <filename>@file</filename> does, qpdf will treat
+ <filename>@file</filename> as a regular option. This makes
+ it possible to work more easily with PDF files whose names
+ happen to start with the <literal>@</literal> character.
+ </para>
+ </listitem>
+ </itemizedlist>
+ </listitem>
+ <listitem>
+ <para>
+ Library Enhancements
+ </para>
+ <itemizedlist>
+ <listitem>
+ <para>
+ Remove the restriction in most cases that the source QPDF
+ object used in a
+ <function>QPDF::copyForeignObject</function> call has to
+ stick around until the destination QPDF is written. The
+ exceptional case is when the source stream gets its data
+ using a <classname>QPDFObjectHandle::StreamDataProvider</classname>. For a more
+ in-depth discussion, see comments around
+ <function>copyForeignObject</function> in
+ <filename>QPDF.hh</filename>.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Add new method
+ <function>QPDFWriter::getFinalVersion()</function>, which
+ returns the PDF version that will ultimately be written to
+ the final file. See comments in
+ <filename>QPDFWriter.hh</filename> for some restrictions on
+ its use.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Add several methods for transcoding strings to some of the
+ character sets used in PDF files:
+ <function>QUtil::utf8_to_ascii</function>,
+ <function>QUtil::utf8_to_win_ansi</function>,
+ <function>QUtil::utf8_to_mac_roman</function>, and
+ <function>QUtil::utf8_to_utf16</function>. For the
+ single-byte encodings that support only a limited character
+ set, these methods replace unsupported characters with a
+ specified substitute.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Add new methods to
+ <classname>QPDFAnnotationObjectHelper</classname> and
+ <classname>QPDFFormFieldObjectHelper</classname> for
+ querying flags and interpretation of different field types.
+ Define constants in <filename>qpdf/Constants.h</filename> to
+ help with interpretation of flag values.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Add new methods
+ <function>QPDFAcroFormDocumentHelper::generateAppearancesIfNeeded</function>
+ and
+ <function>QPDFFormFieldObjectHelper::generateAppearance</function>
+ for generating appearance streams. See discussion in
+ <filename>QPDFFormFieldObjectHelper.hh</filename> for
+ limitations.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Add two new helper functions for dealing with resource
+ dictionaries:
+ <function>QPDFObjectHandle::getResourceNames()</function>
+ returns a list of all second-level keys, which correspond to
+ the names of resources, and
+ <function>QPDFObjectHandle::mergeResources()</function>
+ merges two resource dictionaries as long as they have
+ non-conflicting keys. These methods are useful for certain
+ types of objects that resolve resources from multiple places,
+ such as form fields.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Add methods
+ <function>QPDFPageDocumentHelper::flattenAnnotations()</function>
+ and
+ <function>QPDFAnnotationObjectHelper::getPageContentForAppearance()</function>
+ for handling low-level details of annotation flattening.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Add new helper classes:
+ <classname>QPDFOutlineDocumentHelper</classname>,
+ <classname>QPDFOutlineObjectHelper</classname>,
+ <classname>QPDFPageLabelDocumentHelper</classname>,
+ <classname>QPDFNameTreeObjectHelper</classname>, and
+ <classname>QPDFNumberTreeObjectHelper</classname>.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Add method <function>QPDFObjectHandle::getJSON()</function>
+ that returns a JSON representation of the object. Call
+ <function>serialize()</function> on the result to convert it
+ to a string.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Add a simple JSON serializer. This is not a complete or
+ general-purpose JSON library. It allows assembly and
+ serialization of JSON structures with some restrictions,
+ which are described in the header file. This is the
+ serializer used by qpdf's new JSON representation.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Add new <classname>QPDFObjectHandle::Matrix</classname>
+ class along with a few convenience methods for dealing with
+ six-element numerical arrays as matrices.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Add new method
+ <function>QPDFObjectHandle::wrapInArray</function>, which returns
+ the object itself if it is an array, or an array containing
+ the object otherwise. This is a common construct in PDF.
+ This method prevents you from having to explicitly test
+ whether something is a single element or an array.
+ </para>
+ </listitem>
+ </itemizedlist>
+ </listitem>
+ <listitem>
+ <para>
+ Build Improvements
+ </para>
+ <itemizedlist>
+ <listitem>
+ <para>
+ It is no longer necessary to run
+ <command>autogen.sh</command> to build from a pristine
+ checkout. Automatically generated files are now committed so
+ that it is possible to build on platforms without autoconf
+ directly from a clean checkout of the repository. The
+ <command>configure</command> script detects whether the
+ files are out of date, provided it also determines that the
+ tools needed to regenerate them are present.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Pull requests and the master branch are now built
+ automatically in <ulink
+ url="https://dev.azure.com/qpdf/qpdf/_build">Azure
+ Pipelines</ulink>, which is free for open source projects.
+ The build includes Linux, macOS, Windows 32-bit and 64-bit
+ with mingw and MSVC, and an AppImage build. Official qpdf
+ releases are now built with Azure Pipelines.
+ </para>
+ </listitem>
+ </itemizedlist>
+ </listitem>
+ <listitem>
+ <para>
+ Notes for Packagers
+ </para>
+ <itemizedlist>
+ <listitem>
+ <para>
+ A new section has been added to the documentation with notes
+ for packagers. Please see <xref linkend="ref.packaging"/>.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ The qpdf <command>configure</command> script detects
+ out-of-date automatically generated files. If your packaging
+ system automatically refreshes libtool or autoconf files, it
+ could cause this check to fail. To avoid
+ this problem, pass
+ <option>--disable-check-autofiles</option> to
+ <command>configure</command>.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ If you would like to have qpdf completion enabled
+ automatically, you can install completion files in the
+ distribution's default location. You can find sample
+ completion files to install in the
+ <filename>completions</filename> directory.
+ </para>
+ </listitem>
+ </itemizedlist>
+ </listitem>
+ </itemizedlist>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term>8.2.1: August 18, 2018</term>
+ <listitem>
+ <itemizedlist>
+ <listitem>
+ <para>
+ Command-line Enhancements
+ </para>
+ <itemizedlist>
+ <listitem>
+ <para>
+ Add
+ <option>--keep-files-open=<replaceable>[yn]</replaceable></option>
+ to override default determination of whether to keep files
+ open when merging. Please see the discussion of
+ <option>--keep-files-open</option> in <xref
+ linkend="ref.basic-options"/> for additional details.
+ </para>
+ </listitem>
+ </itemizedlist>
+ </listitem>
+ </itemizedlist>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term>8.2.0: August 16, 2018</term>
+ <listitem>
+ <itemizedlist>
+ <listitem>
+ <para>
+ Command-line Enhancements
+ </para>
+ <itemizedlist>
+ <listitem>
+ <para>
+ Add <option>--no-warn</option> option to suppress issuing
+ warning messages. If there are any conditions that would
+ have caused warnings to be issued, the exit status is still
+ 3.
+ </para>
+ </listitem>
+ </itemizedlist>
+ </listitem>
+ <listitem>
+ <para>
+ Bug Fixes and Optimizations
+ </para>
+ <itemizedlist>
+ <listitem>
+ <para>
+ Performance fix: optimize page merging operation to avoid
+ unnecessary open/close calls on files being merged. This
+ solves a dramatic slow-down that was observed when merging
+ certain types of files.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Optimize how memory is used for the TIFF predictor,
+ drastically improving performance and memory usage for files
+ containing high-resolution images compressed with Flate
+ using the TIFF predictor.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Bug fix: end of line characters were not properly handled
+ inside strings in some cases.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Bug fix: using <option>--progress</option> on very small
+ files could cause an infinite loop.
+ </para>
+ </listitem>
+ </itemizedlist>
+ </listitem>
+ <listitem>
+ <para>
+ API Enhancements
+ </para>
+ <itemizedlist>
+ <listitem>
+ <para>
+ Add new class <classname>QPDFSystemError</classname>, derived
+ from <classname>std::runtime_error</classname>, which is now
+ thrown by <function>QUtil::throw_system_error</function>.
+ This enables the triggering <classname>errno</classname>
+ value to be retrieved.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Add <function>ClosedFileInputSource::stayOpen</function>
+ method, enabling a
+ <classname>ClosedFileInputSource</classname> to stay open
+ during manually indicated periods of high activity, thus
+ reducing the overhead of frequent open/close operations.
+ </para>
+ </listitem>
+ </itemizedlist>
+ </listitem>
+ <listitem>
+ <para>
+ Build Changes
+ </para>
+ <itemizedlist>
+ <listitem>
+ <para>
+ For the mingw builds, change the name of the DLL import
+ library from <filename>libqpdf.a</filename> to
+ <filename>libqpdf.dll.a</filename> to more accurately
+ reflect that it is an import library rather than a static
+ library. This potentially clears the way for supporting a
+ static library in the future, though presently, the qpdf
+ Windows build only builds the DLL and executables.
+ </para>
+ </listitem>
+ </itemizedlist>
+ </listitem>
+ </itemizedlist>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
<term>8.1.0: June 23, 2018</term>
<listitem>
<itemizedlist>
@@ -3872,8 +5758,6 @@ print "\n";
</itemizedlist>
</listitem>
</varlistentry>
- </variablelist>
- <variablelist>
<varlistentry>
<term>6.0.0: November 10, 2015</term>
<listitem>