aboutsummaryrefslogtreecommitdiffstats
path: root/manual/index.rst
diff options
context:
space:
mode:
authorJay Berkenbilt <ejb@ql.org>2021-12-11 22:53:08 +0100
committerJay Berkenbilt <ejb@ql.org>2021-12-12 22:29:13 +0100
commitabb6a16ed16b6137b829bc88a6f2b8b3b6c8cf35 (patch)
tree609bc887e7873129074c3b483ceec85cb6293cad /manual/index.rst
parent22d53f928f3056394b1b82baa1141b47bfe5fb5f (diff)
downloadqpdf-abb6a16ed16b6137b829bc88a6f2b8b3b6c8cf35.tar.zst
Insert output of pandoc as is
pandoc -f docbook -t rst qpdf-manual.xml >| /tmp/a.rst Insert /tmp/a.rst into existing index.rst
Diffstat (limited to 'manual/index.rst')
-rw-r--r--manual/index.rst6324
1 files changed, 6324 insertions, 0 deletions
diff --git a/manual/index.rst b/manual/index.rst
index 95e00f20..f9b72fee 100644
--- a/manual/index.rst
+++ b/manual/index.rst
@@ -6,6 +6,6330 @@ QPDF version |release|
:maxdepth: 2
:caption: Contents:
+.. _acknowledgments:
+
+General Information
+===================
+
+QPDF is a program that does structural, content-preserving
+transformations on PDF files. QPDF's website is located at
+https://qpdf.sourceforge.io/. QPDF's source code is hosted on github at
+https://github.com/qpdf/qpdf.
+
+QPDF is licensed under `the Apache License, Version
+2.0 <http://www.apache.org/licenses/LICENSE-2.0>`__ (the "License").
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+
+Versions of qpdf prior to version 7 were released under the terms of
+`the Artistic License, version
+2.0 <https://opensource.org/licenses/Artistic-2.0>`__. At your option,
+you may continue to consider qpdf to be licensed under those terms. The
+Apache License 2.0 permits everything that the Artistic License 2.0
+permits but is slightly less restrictive. Allowing the Artistic License
+to continue being used is primary to help people who may have to get
+specific approval to use qpdf in their products.
+
+QPDF is intentionally released with a permissive license. However, if
+there is some reason that the licensing terms don't work for your
+requirements, please feel free to contact the copyright holder to make
+other arrangements.
+
+QPDF was originally created in 2001 and modified periodically between
+2001 and 2005 during my employment at `Apex
+CoVantage <http://www.apexcovantage.com>`__. Upon my departure from
+Apex, the company graciously allowed me to take ownership of the
+software and continue maintaining as an open source project, a decision
+for which I am very grateful. I have made considerable enhancements to
+it since that time. I feel fortunate to have worked for people who would
+make such a decision. This work would not have been possible without
+their support.
+
+.. _ref.overview:
+
+What is QPDF?
+=============
+
+QPDF is a program that does structural, content-preserving
+transformations on PDF files. It could have been called something like
+*pdf-to-pdf*. It also provides many useful capabilities to developers of
+PDF-producing software or for people who just want to look at the
+innards of a PDF file to learn more about how they work.
+
+With QPDF, it is possible to copy objects from one PDF file into another
+and to manipulate the list of pages in a PDF file. This makes it
+possible to merge and split PDF files. The QPDF library also makes it
+possible for you to create PDF files from scratch. In this mode, you are
+responsible for supplying all the contents of the file, while the QPDF
+library takes care off all the syntactical representation of the
+objects, creation of cross references tables and, if you use them,
+object streams, encryption, linearization, and other syntactic details.
+You are still responsible for generating PDF content on your own.
+
+QPDF has been designed with very few external dependencies, and it is
+intentionally very lightweight. QPDF is *not* a PDF content creation
+library, a PDF viewer, or a program capable of converting PDF into other
+formats. In particular, QPDF knows nothing about the semantics of PDF
+content streams. If you are looking for something that can do that, you
+should look elsewhere. However, once you have a valid PDF file, QPDF can
+be used to transform that file in ways perhaps your original PDF
+creation can't handle. For example, many programs generate simple PDF
+files but can't password-protect them, web-optimize them, or perform
+other transformations of that type.
+
+.. _ref.installing:
+
+Building and Installing QPDF
+============================
+
+This chapter describes how to build and install qpdf. Please see also
+the @1@filename@1@README.md@2@filename@2@ and
+@1@filename@1@INSTALL@2@filename@2@ files in the source distribution.
+
+.. _ref.prerequisites:
+
+System Requirements
+-------------------
+
+The qpdf package has few external dependencies. In order to build qpdf,
+the following packages are required:
+
+- A C++ compiler that supports C++-14.
+
+- zlib: http://www.zlib.net/
+
+- jpeg: http://www.ijg.org/files/ or https://libjpeg-turbo.org/
+
+- *Recommended but not required:* gnutls: https://www.gnutls.org/ to be
+ able to use the gnutls crypto provider, and/or openssl:
+ https://openssl.org/ to be able to use the openssl crypto provider.
+
+- gnu make 3.81 or newer: http://www.gnu.org/software/make
+
+- perl version 5.8 or newer: http://www.perl.org/; required for running
+ the test suite. Starting with qpdf version 9.1.1, perl is no longer
+ required at runtime.
+
+- GNU diffutils (any version): http://www.gnu.org/software/diffutils/
+ is required to run the test suite. Note that this is the version of
+ diff present on virtually all GNU/Linux systems. This is required
+ because the test suite uses @1@command@1@diff -u@2@command@2@.
+
+Part of qpdf's test suite does comparisons of the contents PDF files by
+converting them images and comparing the images. The image comparison
+tests are disabled by default. Those tests are not required for
+determining correctness of a qpdf build if you have not modified the
+code since the test suite also contains expected output files that are
+compared literally. The image comparison tests provide an extra check to
+make sure that any content transformations don't break the rendering of
+pages. Transformations that affect the content streams themselves are
+off by default and are only provided to help developers look into the
+contents of PDF files. If you are making deep changes to the library
+that cause changes in the contents of the files that qpdf generates,
+then you should enable the image comparison tests. Enable them by
+running @1@command@1@configure@2@command@2@ with the
+@1@option@1@--enable-test-compare-images@2@option@2@ flag. If you enable
+this, the following additional requirements are required by the test
+suite. Note that in no case are these items required to use qpdf.
+
+- libtiff: http://www.remotesensing.org/libtiff/
+
+- GhostScript version 8.60 or newer: http://www.ghostscript.com
+
+If you do not enable this, then you do not need to have tiff and
+ghostscript.
+
+Pre-built documentation is distributed with qpdf, so you should
+generally not need to rebuild the documentation. In order to build the
+documentation from its docbook sources, you need the docbook XML style
+sheets (http://downloads.sourceforge.net/docbook/). To build the PDF
+version of the documentation, you need Apache fop
+(http://xml.apache.org/fop/) version 0.94 or higher.
+
+.. _ref.building:
+
+Build Instructions
+------------------
+
+Building qpdf on UNIX is generally just a matter of running
+
+::
+
+ ./configure
+ make
+
+You can also run @1@command@1@make check@2@command@2@ to run the test
+suite and @1@command@1@make install@2@command@2@ to install. Please run
+@1@command@1@./configure --help@2@command@2@ for options on what can be
+configured. You can also set the value of ``DESTDIR`` during
+installation to install to a temporary location, as is common with many
+open source packages. Please see also the
+@1@filename@1@README.md@2@filename@2@ and
+@1@filename@1@INSTALL@2@filename@2@ files in the source distribution.
+
+Building on Windows is a little bit more complicated. For details,
+please see @1@filename@1@README-windows.md@2@filename@2@ in the source
+distribution. You can also download a binary distribution for Windows.
+There is a port of qpdf to Visual C++ version 6 in the
+@1@filename@1@contrib@2@filename@2@ area generously contributed by Jian
+Ma. This is also discussed in more detail in
+@1@filename@1@README-windows.md@2@filename@2@.
+
+While ``wchar_t`` is part of the C++ standard, qpdf uses it in only one
+place in the public API, and it's just in a helper function. It is
+possible to build qpdf on a system that doesn't have ``wchar_t``, and
+it's also possible to compile a program that uses qpdf on a system
+without ``wchar_t`` as long as you don't call that one method. This is a
+very unusual situation. For a detailed discussion, please see the
+top-level README.md file in qpdf's source distribution.
+
+There are some other things you can do with the build. Although qpdf
+uses @1@application@1@autoconf@2@application@2@, it does not use
+@1@application@1@automake@2@application@2@ but instead uses a
+hand-crafted non-recursive Makefile that requires gnu make. If you're
+really interested, please read the comments in the top-level
+@1@filename@1@Makefile@2@filename@2@.
+
+.. _ref.crypto:
+
+Crypto Providers
+----------------
+
+Starting with qpdf 9.1.0, the qpdf library can be built with multiple
+implementations of providers of cryptographic functions, which we refer
+to as "crypto providers." At the time of writing, a crypto
+implementation must provide MD5 and SHA2 (256, 384, and 512-bit) hashes
+and RC4 and AES256 with and without CBC encryption. In the future, if
+digital signature is added to qpdf, there may be additional requirements
+beyond this.
+
+Starting with qpdf version 9.1.0, the available implementations are
+``native`` and ``gnutls``. In qpdf 10.0.0, ``openssl`` was added.
+Additional implementations may be added if needed. It is also possible
+for a developer to provide their own implementation without modifying
+the qpdf library.
+
+.. _ref.crypto.build:
+
+Build Support For Crypto Providers
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+When building with qpdf's build system, crypto providers can be enabled
+at build time using various @1@command@1@./configure@2@command@2@
+options. The default behavior is for
+@1@command@1@./configure@2@command@2@ to discover which crypto providers
+can be supported based on available external libraries, to build all
+available crypto providers, and to use an external provider as the
+default over the native one. This behavior can be changed with the
+following flags to @1@command@1@./configure@2@command@2@:
+
+- @1@option@1@--enable-crypto-@1@replaceable@1@x@2@replaceable@2@@2@option@2@
+ (where @1@replaceable@1@x@2@replaceable@2@ is a supported crypto
+ provider): enable the @1@replaceable@1@x@2@replaceable@2@ crypto
+ provider, requiring any external dependencies it needs
+
+- @1@option@1@--disable-crypto-@1@replaceable@1@x@2@replaceable@2@@2@option@2@:
+ disable the @1@replaceable@1@x@2@replaceable@2@ provider, and do not
+ link against its dependencies even if they are available
+
+- @1@option@1@--with-default-crypto=@1@replaceable@1@x@2@replaceable@2@@2@option@2@:
+ make @1@replaceable@1@x@2@replaceable@2@ the default provider even if
+ a higher priority one is available
+
+- @1@option@1@--disable-implicit-crypto@2@option@2@: only build crypto
+ providers that are explicitly requested with an
+ @1@option@1@--enable-crypto-@1@replaceable@1@x@2@replaceable@2@@2@option@2@
+ option
+
+For example, if you want to guarantee that the gnutls crypto provider is
+used and that the native provider is not built, you could run
+@1@command@1@./configure --enable-crypto-gnutls
+--disable-implicit-crypto@2@command@2@.
+
+If you build qpdf using your own build system, in order for qpdf to work
+at all, you need to enable at least one crypto provider. The file
+@1@filename@1@libqpdf/qpdf/qpdf-config.h.in@2@filename@2@ provides
+macros ``DEFAULT_CRYPTO``, whose value must be a string naming the
+default crypto provider, and various symbols starting with
+``USE_CRYPTO_``, at least one of which has to be enabled. Additionally,
+you must compile the source files that implement a crypto provider. To
+get a list of those files, look at
+@1@filename@1@libqpdf/build.mk@2@filename@2@. If you want to omit a
+particular crypto provider, as long as its ``USE_CRYPTO_`` symbol is
+undefined, you can completely ignore the source files that belong to a
+particular crypto provider. Additionally, crypto providers may have
+their own external dependencies that can be omitted if the crypto
+provider is not used. For example, if you are building qpdf yourself and
+are using an environment that does not support gnutls or openssl, you
+can ensure that ``USE_CRYPTO_NATIVE`` is defined, ``USE_CRYPTO_GNUTLS``
+is not defined, and ``DEFAULT_CRYPTO`` is defined to ``"native"``. Then
+you must include the source files used in the native implementation,
+some of which were added or renamed from earlier versions, to your
+build, and you can ignore
+@1@filename@1@QPDFCrypto_gnutls.cc@2@filename@2@. Always consult
+@1@filename@1@libqpdf/build.mk@2@filename@2@ to get the list of source
+files you need to build.
+
+.. _ref.crypto.runtime:
+
+Runtime Crypto Provider Selection
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+You can use the @1@option@1@--show-crypto@2@option@2@ option to
+@1@command@1@qpdf@2@command@2@ to get a list of available crypto
+providers. The default provider is always listed first, and the rest are
+listed in lexical order. Each crypto provider is listed on a line by
+itself with no other text, enabling the output of this command to be
+used easily in scripts.
+
+You can override which crypto provider is used by setting the
+``QPDF_CRYPTO_PROVIDER`` environment variable. There are few reasons to
+ever do this, but you might want to do it if you were explicitly trying
+to compare behavior of two different crypto providers while testing
+performance or reproducing a bug. It could also be useful for people who
+are implementing their own crypto providers.
+
+.. _ref.crypto.develop:
+
+Crypto Provider Information for Developers
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+If you are writing code that uses libqpdf and you want to force a
+certain crypto provider to be used, you can call the method
+``QPDFCryptoProvider::setDefaultProvider``. The argument is the name of
+a built-in or developer-supplied provider. To add your own crypto
+provider, you have to create a class derived from ``QPDFCryptoImpl`` and
+register it with ``QPDFCryptoProvider``. For additional information, see
+comments in @1@filename@1@include/qpdf/QPDFCryptoImpl.hh@2@filename@2@.
+
+.. _ref.crypto.design:
+
+Crypto Provider Design Notes
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+This section describes a few bits of rationale for why the crypto
+provider interface was set up the way it was. You don't need to know any
+of this information, but it's provided for the record and in case it's
+interesting.
+
+As a general rule, I want to avoid as much as possible including large
+blocks of code that are conditionally compiled such that, in most
+builds, some code is never built. This is dangerous because it makes it
+very easy for invalid code to creep in unnoticed. As such, I want it to
+be possible to build qpdf with all available crypto providers, and this
+is the way I build qpdf for local development. At the same time, if a
+particular packager feels that it is a security liability for qpdf to
+use crypto functionality from other than a library that gets
+considerable scrutiny for this specific purpose (such as gnutls,
+openssl, or nettle), then I want to give that packager the ability to
+completely disable qpdf's native implementation. Or if someone wants to
+avoid adding a dependency on one of the external crypto providers, I
+don't want the availability of the provider to impose additional
+external dependencies within that environment. Both of these are
+situations that I know to be true for some users of qpdf.
+
+I want registration and selection of crypto providers to be thread-safe,
+and I want it to work deterministically for a developer to provide their
+own crypto provider and be able to set it up as the default. This was
+the primary motivation behind requiring C++-11 as doing so enabled me to
+exploit the guaranteed thread safety of local block static
+initialization. The ``QPDFCryptoProvider`` class uses a singleton
+pattern with thread-safe initialization to create the singleton instance
+of ``QPDFCryptoProvider`` and exposes only static methods in its public
+interface. In this way, if a developer wants to call any
+``QPDFCryptoProvider`` methods, the library guarantees the
+``QPDFCryptoProvider`` is fully initialized and all built-in crypto
+providers are registered. Making ``QPDFCryptoProvider`` actually know
+about all the built-in providers may seem a bit sad at first, but this
+choice makes it extremely clear exactly what the initialization behavior
+is. There's no question about provider implementations automatically
+registering themselves in a nondeterministic order. It also means that
+implementations do not need to know anything about the provider
+interface, which makes them easier to test in isolation. Another
+advantage of this approach is that a developer who wants to develop
+their own crypto provider can do so in complete isolation from the qpdf
+library and, with just two calls, can make qpdf use their provider in
+their application. If they decided to contribute their code, plugging it
+into the qpdf library would require a very small change to qpdf's source
+code.
+
+The decision to make the crypto provider selectable at runtime was one I
+struggled with a little, but I decided to do it for various reasons.
+Allowing an end user to switch crypto providers easily could be very
+useful for reproducing a potential bug. If a user reports a bug that
+some cryptographic thing is broken, I can easily ask that person to try
+with the ``QPDF_CRYPTO_PROVIDER`` variable set to different values. The
+same could apply in the event of a performance problem. This also makes
+it easier for qpdf's own test suite to exercise code with different
+providers without having to make every program that links with qpdf
+aware of the possibility of multiple providers. In qpdf's continuous
+integration environment, the entire test suite is run for each supported
+crypto provider. This is made simple by being able to select the
+provider using an environment variable.
+
+Finally, making crypto providers selectable in this way establish a
+pattern that I may follow again in the future for stream filter
+providers. One could imagine a future enhancement where someone could
+provide their own implementations for basic filters like
+``/FlateDecode`` or for other filters that qpdf doesn't support.
+Implementing the registration functions and internal storage of
+registered providers was also easier using C++-11's functional
+interfaces, which was another reason to require C++-11 at this time.
+
+.. _ref.packaging:
+
+Notes for Packagers
+-------------------
+
+If you are packaging qpdf for an operating system distribution, here are
+some things you may want to keep in mind:
+
+- Starting in qpdf version 9.1.1, qpdf no longer has a runtime
+ dependency on perl. This is because fix-qdf was rewritten in C++.
+ However, qpdf still has a build-time dependency on perl.
+
+- Make sure you are getting the intended behavior with regard to crypto
+ providers. Read `Build Support For Crypto
+ Providers <#ref.crypto.build>`__ for details.
+
+- Passing @1@option@1@--enable-show-failed-test-output@2@option@2@ to
+ @1@command@1@./configure@2@command@2@ will cause any failed test
+ output to be written to the console. This can be very useful for
+ seeing test failures generated by autobuilders where you can't access
+ qtest.log after the fact.
+
+- If qpdf's build environment detects the presence of autoconf and
+ related tools, it will check to ensure that automatically generated
+ files are up-to-date with recorded checksums and fail if it detects a
+ discrepancy. This feature is intended to prevent you from
+ accidentally forgetting to regenerate automatic files after modifying
+ their sources. If your packaging environment automatically refreshes
+ automatic files, it can cause this check to fail. Suppress qpdf's
+ checks by passing @1@option@1@--disable-check-autofiles@2@option@2@
+ to @1@command@1@/.configure@2@command@2@. This is safe since qpdf's
+ @1@command@1@autogen.sh@2@command@2@ just runs autotools in the
+ normal way.
+
+- QPDF's @1@command@1@make install@2@command@2@ does not install
+ completion files by default, but as a packager, it's good if you
+ install them wherever your distribution expects such files to go. You
+ can find completion files to install in the
+ @1@filename@1@completions@2@filename@2@ directory.
+
+- Packagers are encouraged to install the source files from the
+ @1@filename@1@examples@2@filename@2@ directory along with qpdf
+ development packages.
+
+.. _ref.using:
+
+Running QPDF
+============
+
+This chapter describes how to run the qpdf program from the command
+line.
+
+.. _ref.invocation:
+
+Basic Invocation
+----------------
+
+When running qpdf, the basic invocation is as follows:
+
+::
+
+ @1@command@1@qpdf@2@command@2@@1@option@1@ [ @1@replaceable@1@options@2@replaceable@2@ ] { @1@replaceable@1@infilename@2@replaceable@2@ | @1@option@1@--empty@2@option@2@ } [ @1@replaceable@1@page_selection_options@2@replaceable@2@ ] @1@replaceable@1@outfilename@2@replaceable@2@@2@option@2@
+
+This converts PDF file @1@option@1@infilename@2@option@2@ to PDF file
+@1@option@1@outfilename@2@option@2@. The output file is functionally
+identical to the input file but may have been structurally reorganized.
+Also, orphaned objects will be removed from the file. Many
+transformations are available as controlled by the options below. In
+place of @1@option@1@infilename@2@option@2@, the parameter
+@1@option@1@--empty@2@option@2@ may be specified. This causes qpdf to
+use a dummy input file that contains zero pages. The only normal use
+case for using @1@option@1@--empty@2@option@2@ would be if you were
+going to add pages from another source, as discussed in `Page Selection
+Options <#ref.page-selection>`__.
+
+If @1@option@1@@filename@2@option@2@ appears as a word anywhere in the
+command-line, it will be read line by line, and each line will be
+treated as a command-line argument. Leading and trailing whitespace is
+intentionally not removed from lines, which makes it possible to handle
+arguments that start or end with spaces. The @1@option@1@@-@2@option@2@
+option allows arguments to be read from standard input. This allows qpdf
+to be invoked with an arbitrary number of arbitrarily long arguments. It
+is also very useful for avoiding having to pass passwords on the command
+line. Note that the @1@option@1@@filename@2@option@2@ can't appear in
+the middle of an argument, so constructs such as
+@1@option@1@--arg=@option@2@option@2@ will not work. You would have to
+include the argument and its options together in the arguments file.
+
+@1@option@1@outfilename@2@option@2@ does not have to be seekable, even
+when generating linearized files. Specifying "@1@option@1@-@2@option@2@"
+as @1@option@1@outfilename@2@option@2@ means to write to standard
+output. If you want to overwrite the input file with the output, use the
+option @1@option@1@--replace-input@2@option@2@ and omit the output file
+name. You can't specify the same file as both the input and the output.
+If you do this, qpdf will tell you about the
+@1@option@1@--replace-input@2@option@2@ option.
+
+Most options require an output file, but some testing or inspection
+commands do not. These are specifically noted.
+
+.. _ref.exit-status:
+
+Exit Status
+~~~~~~~~~~~
+
+The exit status of @1@command@1@qpdf@2@command@2@ may be interpreted as
+follows:
+
+- ``0``: no errors or warnings were found. The file may still have
+ problems qpdf can't detect. If
+ @1@option@1@--warning-exit-0@2@option@2@ was specified, exit status 0
+ is used even if there are warnings.
+
+- ``2``: errors were found. qpdf was not able to fully process the
+ file.
+
+- ``3``: qpdf encountered problems that it was able to recover from. In
+ some cases, the resulting file may still be damaged. Note that qpdf
+ still exits with status ``3`` if it finds warnings even when
+ @1@option@1@--no-warn@2@option@2@ is specified. With
+ @1@option@1@--warning-exit-0@2@option@2@, warnings without errors
+ exit with status 0 instead of 3.
+
+Note that @1@command@1@qpdf@2@command@2@ never exists with status ``1``.
+If you get an exit status of ``1``, it was something else, like the
+shell not being able to find or execute @1@command@1@qpdf@2@command@2@.
+
+.. _ref.shell-completion:
+
+Shell Completion
+----------------
+
+Starting in qpdf version 8.3.0, qpdf provides its own completion support
+for zsh and bash. You can enable bash completion with @1@command@1@eval
+$(qpdf --completion-bash)@2@command@2@ and zsh completion with
+@1@command@1@eval $(qpdf --completion-zsh)@2@command@2@. If
+@1@command@1@qpdf@2@command@2@ is not in your path, you should invoke it
+above with an absolute path. If you invoke it with a relative path, it
+will warn you, and the completion won't work if you're in a different
+directory.
+
+qpdf will use ``argv[0]`` to figure out where its executable is. This
+may produce unwanted results in some cases, especially if you are trying
+to use completion with copy of qpdf that is built from source. You can
+specify a full path to the qpdf you want to use for completion in the
+``QPDF_EXECUTABLE`` environment variable.
+
+.. _ref.basic-options:
+
+Basic Options
+-------------
+
+The following options are the most common ones and perform commonly
+needed transformations.
+
+@1@option@1@--help@2@option@2@
+ Display command-line invocation help.
+
+@1@option@1@--version@2@option@2@
+ Display the current version of qpdf.
+
+@1@option@1@--copyright@2@option@2@
+ Show detailed copyright information.
+
+@1@option@1@--show-crypto@2@option@2@
+ Show a list of available crypto providers, each on a line by itself.
+ The default provider is always listed first. See `Crypto
+ Providers <#ref.crypto>`__ for more information about crypto
+ providers.
+
+@1@option@1@--completion-bash@2@option@2@
+ Output a completion command you can eval to enable shell completion
+ from bash.
+
+@1@option@1@--completion-zsh@2@option@2@
+ Output a completion command you can eval to enable shell completion
+ from zsh.
+
+@1@option@1@--password=@1@replaceable@1@password@2@replaceable@2@@2@option@2@
+ Specifies a password for accessing encrypted files. To read the
+ password from a file or standard input, you can use
+ @1@option@1@--password-file@2@option@2@, added in qpdf 10.2. Note
+ that you can also use @1@option@1@@filename@2@option@2@ or
+ @1@option@1@@-@2@option@2@ as described above to put the password in
+ a file or pass it via standard input, but you would do so by
+ specifying the entire
+ @1@option@1@--password=@1@replaceable@1@password@2@replaceable@2@@2@option@2@
+ option in the file. Syntax such as
+ @1@option@1@--password=@filename@2@option@2@ won't work since
+ @1@option@1@@filename@2@option@2@ is not recognized in the middle of
+ an argument.
+
+@1@option@1@--password-file=@1@replaceable@1@filename@2@replaceable@2@@2@option@2@
+ Reads the first line from the specified file and uses it as the
+ password for accessing encrypted files.
+ @1@option@1@@1@replaceable@1@filename@2@replaceable@2@@2@option@2@
+ may be ``-`` to read the password from standard input. Note that, in
+ this case, the password is echoed and there is no prompt, so use with
+ caution.
+
+@1@option@1@--is-encrypted@2@option@2@
+ Silently exit with status 0 if the file is encrypted or status 2 if
+ the file is not encrypted. This is useful for shell scripts. Other
+ options are ignored if this is given. This option is mutually
+ exclusive with @1@option@1@--requires-password@2@option@2@. Both this
+ option and @1@option@1@--requires-password@2@option@2@ exit with
+ status 2 for non-encrypted files.
+
+@1@option@1@--requires-password@2@option@2@
+ Silently exit with status 0 if a password (other than as supplied) is
+ required. Exit with status 2 if the file is not encrypted. Exit with
+ status 3 if the file is encrypted but requires no password or the
+ correct password has been supplied. This is useful for shell scripts.
+ Note that any supplied password is used when opening the file. When
+ used with a @1@option@1@--password@2@option@2@ option, this option
+ can be used to check the correctness of the password. In that case,
+ an exit status of 3 means the file works with the supplied password.
+ This option is mutually exclusive with
+ @1@option@1@--is-encrypted@2@option@2@. Both this option and
+ @1@option@1@--is-encrypted@2@option@2@ exit with status 2 for
+ non-encrypted files.
+
+@1@option@1@--verbose@2@option@2@
+ Increase verbosity of output. For now, this just prints some
+ indication of any file that it creates.
+
+@1@option@1@--progress@2@option@2@
+ Indicate progress while writing files.
+
+@1@option@1@--no-warn@2@option@2@
+ Suppress writing of warnings to stderr. If warnings were detected and
+ suppressed, @1@command@1@qpdf@2@command@2@ will still exit with exit
+ code 3. See also @1@option@1@--warning-exit-0@2@option@2@.
+
+@1@option@1@--warning-exit-0@2@option@2@
+ If warnings are found but no errors, exit with exit code 0 instead 3.
+ When combined with @1@option@1@--no-warn@2@option@2@, the effect is
+ for @1@command@1@qpdf@2@command@2@ to completely ignore warnings.
+
+@1@option@1@--linearize@2@option@2@
+ Causes generation of a linearized (web-optimized) output file.
+
+@1@option@1@--replace-input@2@option@2@
+ If specified, the output file name should be omitted. This option
+ tells qpdf to replace the input file with the output. It does this by
+ writing to
+ @1@filename@1@@1@replaceable@1@infilename@2@replaceable@2@.~qpdf-temp#@2@filename@2@
+ and, when done, overwriting the input file with the temporary file.
+ If there were any warnings, the original input is saved as
+ @1@filename@1@@1@replaceable@1@infilename@2@replaceable@2@.~qpdf-orig@2@filename@2@.
+
+@1@option@1@--copy-encryption=file@2@option@2@
+ Encrypt the file using the same encryption parameters, including user
+ and owner password, as the specified file. Use
+ @1@option@1@--encryption-file-password@2@option@2@ to specify a
+ password if one is needed to open this file. Note that copying the
+ encryption parameters from a file also copies the first half of
+ ``/ID`` from the file since this is part of the encryption
+ parameters.
+
+@1@option@1@--encryption-file-password=password@2@option@2@
+ If the file specified with @1@option@1@--copy-encryption@2@option@2@
+ requires a password, specify the password using this option. Note
+ that only one of the user or owner password is required. Both
+ passwords will be preserved since QPDF does not distinguish between
+ the two passwords. It is possible to preserve encryption parameters,
+ including the owner password, from a file even if you don't know the
+ file's owner password.
+
+@1@option@1@--allow-weak-crypto@2@option@2@
+ Starting with version 10.4, qpdf issues warnings when requested to
+ create files using RC4 encryption. This option suppresses those
+ warnings. In future versions of qpdf, qpdf will refuse to create
+ files with weak cryptography when this flag is not given. See `Weak
+ Cryptography <#ref.weak-crypto>`__ for additional details.
+
+@1@option@1@--encrypt options --@2@option@2@
+ Causes generation an encrypted output file. Please see `Encryption
+ Options <#ref.encryption-options>`__ for details on how to specify
+ encryption parameters.
+
+@1@option@1@--decrypt@2@option@2@
+ Removes any encryption on the file. A password must be supplied if
+ the file is password protected.
+
+@1@option@1@--password-is-hex-key@2@option@2@
+ Overrides the usual computation/retrieval of the PDF file's
+ encryption key from user/owner password with an explicit
+ specification of the encryption key. When this option is specified,
+ the argument to the @1@option@1@--password@2@option@2@ option is
+ interpreted as a hexadecimal-encoded key value. This only applies to
+ the password used to open the main input file. It does not apply to
+ other files opened by @1@option@1@--pages@2@option@2@ or other
+ options or to files being written.
+
+ Most users will never have a need for this option, and no standard
+ viewers support this mode of operation, but it can be useful for
+ forensic or investigatory purposes. For example, if a PDF file is
+ encrypted with an unknown password, a brute-force attack using the
+ key directly is sometimes more efficient than one using the password.
+ Also, if a file is heavily damaged, it may be possible to derive the
+ encryption key and recover parts of the file using it directly. To
+ expose the encryption key used by an encrypted file that you can open
+ normally, use the @1@option@1@--show-encryption-key@2@option@2@
+ option.
+
+@1@option@1@--suppress-password-recovery@2@option@2@
+ Ordinarily, qpdf attempts to automatically compensate for passwords
+ specified in the wrong character encoding. This option suppresses
+ that behavior. Under normal conditions, there are no reasons to use
+ this option. See `Unicode Passwords <#ref.unicode-passwords>`__ for a
+ discussion
+
+@1@option@1@--password-mode=@1@replaceable@1@mode@2@replaceable@2@@2@option@2@
+ This option can be used to fine-tune how qpdf interprets Unicode
+ (non-ASCII) password strings passed on the command line. With the
+ exception of the @1@option@1@hex-bytes@2@option@2@ mode, these only
+ apply to passwords provided when encrypting files. The
+ @1@option@1@hex-bytes@2@option@2@ mode also applies to passwords
+ specified for reading files. For additional discussion of the
+ supported password modes and when you might want to use them, see
+ `Unicode Passwords <#ref.unicode-passwords>`__. The following modes
+ are supported:
+
+ - @1@option@1@auto@2@option@2@: Automatically determine whether the
+ specified password is a properly encoded Unicode (UTF-8) string,
+ and transcode it as required by the PDF spec based on the type
+ encryption being applied. On Windows starting with version 8.4.0,
+ and on almost all other modern platforms, incoming passwords will
+ be properly encoded in UTF-8, so this is almost always what you
+ want.
+
+ - @1@option@1@unicode@2@option@2@: Tells qpdf that the incoming
+ password is UTF-8, overriding whatever its automatic detection
+ determines. The only difference between this mode and
+ @1@option@1@auto@2@option@2@ is that qpdf will fail with an error
+ message if the password is not valid UTF-8 instead of falling back
+ to @1@option@1@bytes@2@option@2@ mode with a warning.
+
+ - @1@option@1@bytes@2@option@2@: Interpret the password as a literal
+ byte string. For non-Windows platforms, this is what versions of
+ qpdf prior to 8.4.0 did. For Windows platforms, there is no way to
+ specify strings of binary data on the command line directly, but
+ you can use the @1@option@1@@filename@2@option@2@ option to do it,
+ in which case this option forces qpdf to respect the string of
+ bytes as provided. This option will allow you to encrypt PDF files
+ with passwords that will not be usable by other readers.
+
+ - @1@option@1@hex-bytes@2@option@2@: Interpret the password as a
+ hex-encoded string. This provides a way to pass binary data as a
+ password on all platforms including Windows. As with
+ @1@option@1@bytes@2@option@2@, this option may allow creation of
+ files that can't be opened by other readers. This mode affects
+ qpdf's interpretation of passwords specified for decrypting files
+ as well as for encrypting them. It makes it possible to specify
+ strings that are encoded in some manner other than the system's
+ default encoding.
+
+@1@option@1@--rotate=[+|-]angle[:page-range]@2@option@2@
+ Apply rotation to specified pages. The
+ @1@option@1@page-range@2@option@2@ portion of the option value has
+ the same format as page ranges in `Page Selection
+ Options <#ref.page-selection>`__. If the page range is omitted, the
+ rotation is applied to all pages. The @1@option@1@angle@2@option@2@
+ portion of the parameter may be either 0, 90, 180, or 270. If
+ preceded by @1@option@1@+@2@option@2@ or @1@option@1@-@2@option@2@,
+ the angle is added to or subtracted from the specified pages'
+ original rotations. This is almost always what you want. Otherwise
+ the pages' rotations are set to the exact value, which may cause the
+ appearances of the pages to be inconsistent, especially for scans.
+ For example, the command @1@command@1@qpdf in.pdf out.pdf
+ --rotate=+90:2,4,6 --rotate=180:7-8@2@command@2@ would rotate pages
+ 2, 4, and 6 90 degrees clockwise from their original rotation and
+ force the rotation of pages 7 through 8 to 180 degrees regardless of
+ their original rotation, and the command @1@command@1@qpdf in.pdf
+ out.pdf --rotate=+180@2@command@2@ would rotate all pages by 180
+ degrees.
+
+@1@option@1@--keep-files-open=@1@replaceable@1@[yn]@2@replaceable@2@@2@option@2@
+ This option controls whether qpdf keeps individual files open while
+ merging. Prior to version 8.1.0, qpdf always kept all files open, but
+ this meant that the number of files that could be merged was limited
+ by the operating system's open file limit. Version 8.1.0 opened files
+ as they were referenced and closed them after each read, but this
+ caused a major performance impact. Version 8.2.0 optimized the
+ performance but did so in a way that, for local file systems, there
+ was a small but unavoidable performance hit, but for networked file
+ systems, the performance impact could be very high. Starting with
+ version 8.2.1, the default behavior is that files are kept open if no
+ more than 200 files are specified, but that the behavior can be
+ explicitly overridden with the
+ @1@option@1@--keep-files-open@2@option@2@ flag. If you are merging
+ more than 200 files but less than the operating system's max open
+ files limit, you may want to use
+ @1@option@1@--keep-files-open=y@2@option@2@, especially if working
+ over a networked file system. If you are using a local file system
+ where the overhead is low and you might sometimes merge more than the
+ OS limit's number of files from a script and are not worried about a
+ few seconds additional processing time, you may want to specify
+ @1@option@1@--keep-files-open=n@2@option@2@. The threshold for
+ switching may be changed from the default 200 with the
+ @1@option@1@--keep-files-open-threshold@2@option@2@ option.
+
+@1@option@1@--keep-files-open-threshold=@1@replaceable@1@count@2@replaceable@2@@2@option@2@
+ If specified, overrides the default value of 200 used as the
+ threshold for qpdf deciding whether or not to keep files open. See
+ @1@option@1@--keep-files-open@2@option@2@ for details.
+
+@1@option@1@--pages options --@2@option@2@
+ Select specific pages from one or more input files. See `Page
+ Selection Options <#ref.page-selection>`__ for details on how to do
+ page selection (splitting and merging).
+
+@1@option@1@--collate=@1@replaceable@1@n@2@replaceable@2@@2@option@2@
+ When specified, collate rather than concatenate pages from files
+ specified with @1@option@1@--pages@2@option@2@. With a numeric
+ argument, collate in groups of @1@replaceable@1@n@2@replaceable@2@.
+ The default is 1. See `Page Selection
+ Options <#ref.page-selection>`__ for additional details.
+
+@1@option@1@--flatten-rotation@2@option@2@
+ For each page that is rotated using the ``/Rotate`` key in the page's
+ dictionary, remove the ``/Rotate`` key and implement the identical
+ rotation semantics by modifying the page's contents. This option can
+ be useful to prepare files for buggy PDF applications that don't
+ properly handle rotated pages.
+
+@1@option@1@--split-pages=[n]@2@option@2@
+ Write each group of @1@option@1@n@2@option@2@ pages to a separate
+ output file. If @1@option@1@n@2@option@2@ is not specified, create
+ single pages. Output file names are generated as follows:
+
+ - If the string ``%d`` appears in the output file name, it is
+ replaced with a range of zero-padded page numbers starting from 1.
+
+ - Otherwise, if the output file name ends in
+ @1@filename@1@.pdf@2@filename@2@ (case insensitive), a zero-padded
+ page range, preceded by a dash, is inserted before the file
+ extension.
+
+ - Otherwise, the file name is appended with a zero-padded page range
+ preceded by a dash.
+
+ Page ranges are a single number in the case of single-page groups or
+ two numbers separated by a dash otherwise. For example, if
+ @1@filename@1@infile.pdf@2@filename@2@ has 12 pages
+
+ - @1@command@1@qpdf --split-pages infile.pdf %d-out@2@command@2@
+ would generate files @1@filename@1@01-out@2@filename@2@ through
+ @1@filename@1@12-out@2@filename@2@
+
+ - @1@command@1@qpdf --split-pages=2 infile.pdf
+ outfile.pdf@2@command@2@ would generate files
+ @1@filename@1@outfile-01-02.pdf@2@filename@2@ through
+ @1@filename@1@outfile-11-12.pdf@2@filename@2@
+
+ - @1@command@1@qpdf --split-pages infile.pdf
+ something.else@2@command@2@ would generate files
+ @1@filename@1@something.else-01@2@filename@2@ through
+ @1@filename@1@something.else-12@2@filename@2@
+
+ Note that outlines, threads, and other global features of the
+ original PDF file are not preserved. For each page of output, this
+ option creates an empty PDF and copies a single page from the output
+ into it. If you require the global data, you will have to run
+ @1@command@1@qpdf@2@command@2@ with the
+ @1@option@1@--pages@2@option@2@ option once for each file. Using
+ @1@option@1@--split-pages@2@option@2@ is much faster if you don't
+ require the global data.
+
+@1@option@1@--overlay options --@2@option@2@
+ Overlay pages from another file onto the output pages. See `Overlay
+ and Underlay Options <#ref.overlay-underlay>`__ for details on
+ overlay/underlay.
+
+@1@option@1@--underlay options --@2@option@2@
+ Overlay pages from another file onto the output pages. See `Overlay
+ and Underlay Options <#ref.overlay-underlay>`__ for details on
+ overlay/underlay.
+
+Password-protected files may be opened by specifying a password. By
+default, qpdf will preserve any encryption data associated with a file.
+If @1@option@1@--decrypt@2@option@2@ is specified, qpdf will attempt to
+remove any encryption information. If @1@option@1@--encrypt@2@option@2@
+is specified, qpdf will replace the document's encryption parameters
+with whatever is specified.
+
+Note that qpdf does not obey encryption restrictions already imposed on
+the file. Doing so would be meaningless since qpdf can be used to remove
+encryption from the file entirely. This functionality is not intended to
+be used for bypassing copyright restrictions or other restrictions
+placed on files by their producers.
+
+Prior to 8.4.0, in the case of passwords that contain characters that
+fall outside of 7-bit US-ASCII, qpdf left the burden of supplying
+properly encoded encryption and decryption passwords to the user.
+Starting in qpdf 8.4.0, qpdf does this automatically in most cases. For
+an in-depth discussion, please see `Unicode
+Passwords <#ref.unicode-passwords>`__. Previous versions of this manual
+described workarounds using the @1@command@1@iconv@2@command@2@ command.
+Such workarounds are no longer required or recommended with qpdf 8.4.0.
+However, for backward compatibility, qpdf attempts to detect those
+workarounds and do the right thing in most cases.
+
+.. _ref.encryption-options:
+
+Encryption Options
+------------------
+
+To change the encryption parameters of a file, use the --encrypt flag.
+The syntax is
+
+::
+
+ @1@option@1@--encrypt @1@replaceable@1@user-password@2@replaceable@2@ @1@replaceable@1@owner-password@2@replaceable@2@ @1@replaceable@1@key-length@2@replaceable@2@ [ @1@replaceable@1@restrictions@2@replaceable@2@ ] --@2@option@2@
+
+Note that "@1@option@1@--@2@option@2@" terminates parsing of encryption
+flags and must be present even if no restrictions are present.
+
+Either or both of the user password and the owner password may be empty
+strings. Starting in qpdf 10.2, qpdf defaults to not allowing creation
+of PDF files with a non-empty user password, an empty owner password,
+and a 256-bit key since such files can be opened with no password. If
+you want to create such files, specify the encryption option
+@1@option@1@--allow-insecure@2@option@2@, as described below.
+
+The value for
+@1@option@1@@1@replaceable@1@key-length@2@replaceable@2@@2@option@2@ may
+be 40, 128, or 256. The restriction flags are dependent upon key length.
+When no additional restrictions are given, the default is to be fully
+permissive.
+
+If @1@option@1@@1@replaceable@1@key-length@2@replaceable@2@@2@option@2@
+is 40, the following restriction options are available:
+
+@1@option@1@--print=[yn]@2@option@2@
+ Determines whether or not to allow printing.
+
+@1@option@1@--modify=[yn]@2@option@2@
+ Determines whether or not to allow document modification.
+
+@1@option@1@--extract=[yn]@2@option@2@
+ Determines whether or not to allow text/image extraction.
+
+@1@option@1@--annotate=[yn]@2@option@2@
+ Determines whether or not to allow comments and form fill-in and
+ signing.
+
+If @1@option@1@@1@replaceable@1@key-length@2@replaceable@2@@2@option@2@
+is 128, the following restriction options are available:
+
+@1@option@1@--accessibility=[yn]@2@option@2@
+ Determines whether or not to allow accessibility to visually
+ impaired. The qpdf library disregards this field when AES is used or
+ when 256-bit encryption is used. You should really never disable
+ accessibility, but qpdf lets you do it in case you need to configure
+ a file this way for testing purposes. The PDF spec says that
+ conforming readers should disregard this permission and always allow
+ accessibility.
+
+@1@option@1@--extract=[yn]@2@option@2@
+ Determines whether or not to allow text/graphic extraction.
+
+@1@option@1@--assemble=[yn]@2@option@2@
+ Determines whether document assembly (rotation and reordering of
+ pages) is allowed.
+
+@1@option@1@--annotate=[yn]@2@option@2@
+ Determines whether modifying annotations is allowed. This includes
+ adding comments and filling in form fields. Also allows editing of
+ form fields if @1@option@1@--modify-other=y@2@option@2@ is given.
+
+@1@option@1@--form=[yn]@2@option@2@
+ Determines whether filling form fields is allowed.
+
+@1@option@1@--modify-other=[yn]@2@option@2@
+ Allow all document editing except those controlled separately by the
+ @1@option@1@--assemble@2@option@2@,
+ @1@option@1@--annotate@2@option@2@, and
+ @1@option@1@--form@2@option@2@ options.
+
+@1@option@1@--print=@1@replaceable@1@print-opt@2@replaceable@2@@2@option@2@
+ Controls printing access.
+ @1@option@1@@1@replaceable@1@print-opt@2@replaceable@2@@2@option@2@
+ may be one of the following:
+
+ - @1@option@1@full@2@option@2@: allow full printing
+
+ - @1@option@1@low@2@option@2@: allow low-resolution printing only
+
+ - @1@option@1@none@2@option@2@: disallow printing
+
+@1@option@1@--modify=@1@replaceable@1@modify-opt@2@replaceable@2@@2@option@2@
+ Controls modify access. This way of controlling modify access has
+ less granularity than new options added in qpdf 8.4.
+ @1@option@1@@1@replaceable@1@modify-opt@2@replaceable@2@@2@option@2@
+ may be one of the following:
+
+ - @1@option@1@all@2@option@2@: allow full document modification
+
+ - @1@option@1@annotate@2@option@2@: allow comment authoring, form
+ operations, and document assembly
+
+ - @1@option@1@form@2@option@2@: allow form field fill-in and signing
+ and document assembly
+
+ - @1@option@1@assembly@2@option@2@: allow document assembly only
+
+ - @1@option@1@none@2@option@2@: allow no modifications
+
+ Using the @1@option@1@--modify@2@option@2@ option does not allow you
+ to create certain combinations of permissions such as allowing form
+ filling but not allowing document assembly. Starting with qpdf 8.4,
+ you can either just use the other options to control fields
+ individually, or you can use something like @1@option@1@--modify=form
+ --assembly=n@2@option@2@ to fine tune.
+
+@1@option@1@--cleartext-metadata@2@option@2@
+ If specified, any metadata stream in the document will be left
+ unencrypted even if the rest of the document is encrypted. This also
+ forces the PDF version to be at least 1.5.
+
+@1@option@1@--use-aes=[yn]@2@option@2@
+ If @1@option@1@--use-aes=y@2@option@2@ is specified, AES encryption
+ will be used instead of RC4 encryption. This forces the PDF version
+ to be at least 1.6.
+
+@1@option@1@--allow-insecure@2@option@2@
+ From qpdf 10.2, qpdf defaults to not allowing creation of PDF files
+ where the user password is non-empty, the owner password is empty,
+ and a 256-bit key is in use. Files created in this way are insecure
+ since they can be opened without a password. Users would ordinarily
+ never want to create such files. If you are using qpdf to
+ intentionally created strange files for testing (a definite valid use
+ of qpdf!), this option allows you to create such insecure files.
+
+@1@option@1@--force-V4@2@option@2@
+ Use of this option forces the ``/V`` and ``/R`` parameters in the
+ document's encryption dictionary to be set to the value ``4``. As
+ qpdf will automatically do this when required, there is no reason to
+ ever use this option. It exists primarily for use in testing qpdf
+ itself. This option also forces the PDF version to be at least 1.5.
+
+If @1@option@1@@1@replaceable@1@key-length@2@replaceable@2@@2@option@2@
+is 256, the minimum PDF version is 1.7 with extension level 8, and the
+AES-based encryption format used is the PDF 2.0 encryption method
+supported by Acrobat X. the same options are available as with 128 bits
+with the following exceptions:
+
+@1@option@1@--use-aes@2@option@2@
+ This option is not available with 256-bit keys. AES is always used
+ with 256-bit encryption keys.
+
+@1@option@1@--force-V4@2@option@2@
+ This option is not available with 256 keys.
+
+@1@option@1@--force-R5@2@option@2@
+ If specified, qpdf sets the minimum version to 1.7 at extension level
+ 3 and writes the deprecated encryption format used by Acrobat version
+ IX. This option should not be used in practice to generate PDF files
+ that will be in general use, but it can be useful to generate files
+ if you are trying to test proper support in another application for
+ PDF files encrypted in this way.
+
+The default for each permission option is to be fully permissive.
+
+.. _ref.page-selection:
+
+Page Selection Options
+----------------------
+
+Starting with qpdf 3.0, it is possible to split and merge PDF files by
+selecting pages from one or more input files. Whatever file is given as
+the primary input file is used as the starting point, but its pages are
+replaced with pages as specified.
+
+::
+
+ @1@option@1@--pages @1@replaceable@1@input-file@2@replaceable@2@ [ @1@replaceable@1@--password=password@2@replaceable@2@ ] [ @1@replaceable@1@page-range@2@replaceable@2@ ] [ ... ] --@2@option@2@
+
+Multiple input files may be specified. Each one is given as the name of
+the input file, an optional password (if required to open the file), and
+the range of pages. Note that "@1@option@1@--@2@option@2@" terminates
+parsing of page selection flags.
+
+Starting with qpf 8.4, the special input file name
+"@1@filename@1@.@2@filename@2@" can be used as a shortcut for the
+primary input filename.
+
+For each file that pages should be taken from, specify the file, a
+password needed to open the file (if any), and a page range. The
+password needs to be given only once per file. If any of the input files
+are the same as the primary input file or the file used to copy
+encryption parameters (if specified), you do not need to repeat the
+password here. The same file can be repeated multiple times. If a file
+that is repeated has a password, the password only has to be given the
+first time. All non-page data (info, outlines, page numbers, etc.) are
+taken from the primary input file. To discard these, use
+@1@option@1@--empty@2@option@2@ as the primary input.
+
+Starting with qpdf 5.0.0, it is possible to omit the page range. If qpdf
+sees a value in the place where it expects a page range and that value
+is not a valid range but is a valid file name, qpdf will implicitly use
+the range ``1-z``, meaning that it will include all pages in the file.
+This makes it possible to easily combine all pages in a set of files
+with a command like @1@command@1@qpdf --empty out.pdf --pages \*.pdf
+--@2@command@2@.
+
+The page range is a set of numbers separated by commas, ranges of
+numbers separated dashes, or combinations of those. The character "z"
+represents the last page. A number preceded by an "r" indicates to count
+from the end, so ``r3-r1`` would be the last three pages of the
+document. Pages can appear in any order. Ranges can appear with a high
+number followed by a low number, which causes the pages to appear in
+reverse. Numbers may be repeated in a page range. A page range may be
+optionally appended with ``:even`` or ``:odd`` to indicate only the even
+or odd pages in the given range. Note that even and odd refer to the
+positions within the specified, range, not whether the original number
+is even or odd.
+
+Example page ranges:
+
+- ``1,3,5-9,15-12``: pages 1, 3, 5, 6, 7, 8, 9, 15, 14, 13, and 12 in
+ that order.
+
+- ``z-1``: all pages in the document in reverse
+
+- ``r3-r1``: the last three pages of the document
+
+- ``r1-r3``: the last three pages of the document in reverse order
+
+- ``1-20:even``: even pages from 2 to 20
+
+- ``5,7-9,12:odd``: pages 5, 8, and, 12, which are the pages in odd
+ positions from among the original range, which represents pages 5, 7,
+ 8, 9, and 12.
+
+Starting in qpdf version 8.3, you can specify the
+@1@option@1@--collate@2@option@2@ option. Note that this option is
+specified outside of @1@option@1@--pages ... --@2@option@2@. When
+@1@option@1@--collate@2@option@2@ is specified, it changes the meaning
+of @1@option@1@--pages@2@option@2@ so that the specified files, as
+modified by page ranges, are collated rather than concatenated. For
+example, if you add the files @1@filename@1@odd.pdf@2@filename@2@ and
+@1@filename@1@even.pdf@2@filename@2@ containing odd and even pages of a
+document respectively, you could run @1@command@1@qpdf --collate odd.pdf
+--pages odd.pdf even.pdf -- all.pdf@2@command@2@ to collate the pages.
+This would pick page 1 from odd, page 1 from even, page 2 from odd, page
+2 from even, etc. until all pages have been included. Any number of
+files and page ranges can be specified. If any file has fewer pages,
+that file is just skipped when its pages have all been included. For
+example, if you ran @1@command@1@qpdf --collate --empty --pages a.pdf
+1-5 b.pdf 6-4 c.pdf r1 -- out.pdf@2@command@2@, you would get the
+following pages in this order:
+
+- a.pdf page 1
+
+- b.pdf page 6
+
+- c.pdf last page
+
+- a.pdf page 2
+
+- b.pdf page 5
+
+- a.pdf page 3
+
+- b.pdf page 4
+
+- a.pdf page 4
+
+- a.pdf page 5
+
+Starting in qpdf version 10.2, you may specify a numeric argument to
+@1@option@1@--collate@2@option@2@. With
+@1@option@1@--collate=@1@replaceable@1@n@2@replaceable@2@@2@option@2@,
+pull groups of @1@replaceable@1@n@2@replaceable@2@ pages from each file,
+again, stopping when there are no more pages. For example, if you ran
+@1@command@1@qpdf --collate=2 --empty --pages a.pdf 1-5 b.pdf 6-4 c.pdf
+r1 -- out.pdf@2@command@2@, you would get the following pages in this
+order:
+
+- a.pdf page 1
+
+- a.pdf page 2
+
+- b.pdf page 6
+
+- b.pdf page 5
+
+- c.pdf last page
+
+- a.pdf page 3
+
+- a.pdf page 4
+
+- b.pdf page 4
+
+- a.pdf page 5
+
+Starting in qpdf version 8.3, when you split and merge files, any page
+labels (page numbers) are preserved in the final file. It is expected
+that more document features will be preserved by splitting and merging.
+In the mean time, semantics of splitting and merging vary across
+features. For example, the document's outlines (bookmarks) point to
+actual page objects, so if you select some pages and not others,
+bookmarks that point to pages that are in the output file will work, and
+remaining bookmarks will not work. A future version of
+@1@command@1@qpdf@2@command@2@ may do a better job at handling these
+issues. (Note that the qpdf library already contains all of the APIs
+required in order to implement this in your own application if you need
+it.) In the mean time, you can always use
+@1@option@1@--empty@2@option@2@ as the primary input file to avoid
+copying all of that from the first file. For example, to take pages 1
+through 5 from a @1@filename@1@infile.pdf@2@filename@2@ while preserving
+all metadata associated with that file, you could use
+
+::
+
+ @1@command@1@qpdf@2@command@2@ @1@option@1@infile.pdf --pages . 1-5 -- outfile.pdf@2@option@2@
+
+If you wanted pages 1 through 5 from
+@1@filename@1@infile.pdf@2@filename@2@ but you wanted the rest of the
+metadata to be dropped, you could instead run
+
+::
+
+ @1@command@1@qpdf@2@command@2@ @1@option@1@--empty --pages infile.pdf 1-5 -- outfile.pdf@2@option@2@
+
+If you wanted to take pages 1 through 5 from
+@1@filename@1@file1.pdf@2@filename@2@ and pages 11 through 15 from
+@1@filename@1@file2.pdf@2@filename@2@ in reverse, taking document-level
+metadata from @1@filename@1@file2.pdf@2@filename@2@, you would run
+
+::
+
+ @1@command@1@qpdf@2@command@2@ @1@option@1@file2.pdf --pages file1.pdf 1-5 . 15-11 -- outfile.pdf@2@option@2@
+
+If, for some reason, you wanted to take the first page of an encrypted
+file called @1@filename@1@encrypted.pdf@2@filename@2@ with password
+``pass`` and repeat it twice in an output file, and if you wanted to
+drop document-level metadata but preserve encryption, you would use
+
+::
+
+ @1@command@1@qpdf@2@command@2@ @1@option@1@--empty --copy-encryption=encrypted.pdf --encryption-file-password=pass
+ --pages encrypted.pdf --password=pass 1 ./encrypted.pdf --password=pass 1 --
+ outfile.pdf@2@option@2@
+
+Note that we had to specify the password all three times because giving
+a password as @1@option@1@--encryption-file-password@2@option@2@ doesn't
+count for page selection, and as far as qpdf is concerned,
+@1@filename@1@encrypted.pdf@2@filename@2@ and
+@1@filename@1@./encrypted.pdf@2@filename@2@ are separated files. These
+are all corner cases that most users should hopefully never have to be
+bothered with.
+
+Prior to version 8.4, it was not possible to specify the same page from
+the same file directly more than once, and the workaround of specifying
+the same file in more than one way was required. Version 8.4 removes
+this limitation, but there is still a valid use case. When you specify
+the same page from the same file more than once, qpdf will share objects
+between the pages. If you are going to do further manipulation on the
+file and need the two instances of the same original page to be deep
+copies, then you can specify the file in two different ways. For example
+@1@command@1@qpdf in.pdf --pages . 1 ./in.pdf 1 -- out.pdf@2@command@2@
+would create a file with two copies of the first page of the input, and
+the two copies would share any objects in common. This includes fonts,
+images, and anything else the page references.
+
+.. _ref.overlay-underlay:
+
+Overlay and Underlay Options
+----------------------------
+
+Starting with qpdf 8.4, it is possible to overlay or underlay pages from
+other files onto the output generated by qpdf. Specify overlay or
+underlay as follows:
+
+::
+
+ { @1@option@1@--overlay@2@option@2@ | @1@option@1@--underlay@2@option@2@ } @1@replaceable@1@file@2@replaceable@2@ [ @1@option@1@options@2@option@2@ ] @1@option@1@--@2@option@2@
+
+Overlay and underlay options are processed late, so they can be combined
+with other like merging and will apply to the final output. The
+@1@option@1@--overlay@2@option@2@ and @1@option@1@--underlay@2@option@2@
+options work the same way, except underlay pages are drawn underneath
+the page to which they are applied, possibly obscured by the original
+page, and overlay files are drawn on top of the page to which they are
+applied, possibly obscuring the page. You can combine overlay and
+underlay.
+
+The default behavior of overlay and underlay is that pages are taken
+from the overlay/underlay file in sequence and applied to corresponding
+pages in the output until there are no more output pages. If the overlay
+or underlay file runs out of pages, remaining output pages are left
+alone. This behavior can be modified by options, which are provided
+between the @1@option@1@--overlay@2@option@2@ or
+@1@option@1@--underlay@2@option@2@ flag and the
+@1@option@1@--@2@option@2@ option. The following options are supported:
+
+- @1@option@1@--password=password@2@option@2@: supply a password if the
+ overlay/underlay file is encrypted.
+
+- @1@option@1@--to=page-range@2@option@2@: a range of pages in the same
+ form at described in `Page Selection Options <#ref.page-selection>`__
+ indicates which pages in the output should have the overlay/underlay
+ applied. If not specified, overlay/underlay are applied to all pages.
+
+- @1@option@1@--from=[page-range]@2@option@2@: a range of pages that
+ specifies which pages in the overlay/underlay file will be used for
+ overlay or underlay. If not specified, all pages will be used. This
+ can be explicitly specified to be empty if
+ @1@option@1@--repeat@2@option@2@ is used.
+
+- @1@option@1@--repeat=page-range@2@option@2@: an optional range of
+ pages that specifies which pages in the overlay/underlay file will be
+ repeated after the "from" pages are used up. If you want to repeat a
+ range of pages starting at the beginning, you can explicitly use
+ @1@option@1@--from=@2@option@2@.
+
+Here are some examples.
+
+- @1@command@1@--overlay o.pdf --to=1-5 --from=1-3 --repeat=4
+ --@2@command@2@: overlay the first three pages from file
+ @1@filename@1@o.pdf@2@filename@2@ onto the first three pages of the
+ output, then overlay page 4 from @1@filename@1@o.pdf@2@filename@2@
+ onto pages 4 and 5 of the output. Leave remaining output pages
+ untouched.
+
+- @1@command@1@--underlay footer.pdf --from= --repeat=1,2
+ --@2@command@2@: Underlay page 1 of
+ @1@filename@1@footer.pdf@2@filename@2@ on all odd output pages, and
+ underlay page 2 of @1@filename@1@footer.pdf@2@filename@2@ on all even
+ output pages.
+
+.. _ref.attachments:
+
+Embedded Files/Attachments Options
+----------------------------------
+
+Starting with qpdf 10.2, you can work with file attachments in PDF files
+from the command line. The following options are available:
+
+@1@option@1@--list-attachments@2@option@2@
+ Show the "key" and stream number for embedded files. With
+ @1@option@1@--verbose@2@option@2@, additional information, including
+ preferred file name, description, dates, and more are also displayed.
+ The key is usually but not always equal to the file name, and is
+ needed by some of the other options.
+
+@1@option@1@--show-attachment=@1@replaceable@1@key@2@replaceable@2@@2@option@2@
+ Write the contents of the specified attachment to standard output as
+ binary data. The key should match one of the keys shown by
+ @1@option@1@--list-attachments@2@option@2@. If specified multiple
+ times, only the last attachment will be shown.
+
+@1@option@1@--add-attachment @1@replaceable@1@file@2@replaceable@2@ @1@replaceable@1@options@2@replaceable@2@ --@2@option@2@
+ Add or replace an attachment with the contents of
+ @1@replaceable@1@file@2@replaceable@2@. This may be specified more
+ than once. The following additional options may appear before the
+ ``--`` that ends this option:
+
+ @1@option@1@--key=@1@replaceable@1@key@2@replaceable@2@@2@option@2@
+ The key to use to register the attachment in the embedded files
+ table. Defaults to the last path element of
+ @1@replaceable@1@file@2@replaceable@2@.
+
+ @1@option@1@--filename=@1@replaceable@1@name@2@replaceable@2@@2@option@2@
+ The file name to be used for the attachment. This is what is
+ usually displayed to the user and is the name most graphical PDF
+ viewers will use when saving a file. It defaults to the last path
+ element of @1@replaceable@1@file@2@replaceable@2@.
+
+ @1@option@1@--creationdate=@1@replaceable@1@date@2@replaceable@2@@2@option@2@
+ The attachment's creation date in PDF format; defaults to the
+ current time. The date format is explained below.
+
+ @1@option@1@--moddate=@1@replaceable@1@date@2@replaceable@2@@2@option@2@
+ The attachment's modification date in PDF format; defaults to the
+ current time. The date format is explained below.
+
+ @1@option@1@--mimetype=@1@replaceable@1@type/subtype@2@replaceable@2@@2@option@2@
+ The mime type for the attachment, e.g. ``text/plain`` or
+ ``application/pdf``. Note that the mimetype appears in a field
+ called ``/Subtype`` in the PDF but actually includes the full type
+ and subtype of the mime type.
+
+ @1@option@1@--description=@1@replaceable@1@"text"@2@replaceable@2@@2@option@2@
+ Descriptive text for the attachment, displayed by some PDF
+ viewers.
+
+ @1@option@1@--replace@2@option@2@
+ Indicates that any existing attachment with the same key should be
+ replaced by the new attachment. Otherwise,
+ @1@command@1@qpdf@2@command@2@ gives an error if an attachment
+ with that key is already present.
+
+@1@option@1@--remove-attachment=@1@replaceable@1@key@2@replaceable@2@@2@option@2@
+ Remove the specified attachment. This doesn't only remove the
+ attachment from the embedded files table but also clears out the file
+ specification. That means that any potential internal links to the
+ attachment will be broken. This option may be specified multiple
+ times. Run with @1@option@1@--verbose@2@option@2@ to see status of
+ the removal.
+
+@1@option@1@--copy-attachments-from @1@replaceable@1@file@2@replaceable@2@ @1@replaceable@1@options@2@replaceable@2@ --@2@option@2@
+ Copy attachments from another file. This may be specified more than
+ once. The following additional options may appear before the ``--``
+ that ends this option:
+
+ @1@option@1@--password=@1@replaceable@1@password@2@replaceable@2@@2@option@2@
+ If required, the password needed to open
+ @1@replaceable@1@file@2@replaceable@2@
+
+ @1@option@1@--prefix=@1@replaceable@1@prefix@2@replaceable@2@@2@option@2@
+ Only required if the file from which attachments are being copied
+ has attachments with keys that conflict with attachments already
+ in the file. In this case, the specified prefix will be prepended
+ to each key. This affects only the key in the embedded files
+ table, not the file name. The PDF specification doesn't preclude
+ multiple attachments having the same file name.
+
+When a date is required, the date should conform to the PDF date format
+specification, which is
+``D:``\ @1@replaceable@1@yyyymmddhhmmss<z>@2@replaceable@2@, where
+@1@replaceable@1@<z>@2@replaceable@2@ is either ``Z`` for UTC or a
+timezone offset in the form @1@replaceable@1@-hh'mm'@2@replaceable@2@ or
+@1@replaceable@1@+hh'mm'@2@replaceable@2@. Examples:
+``D:20210207161528-05'00'``, ``D:20210207211528Z``.
+
+.. _ref.advanced-parsing:
+
+Advanced Parsing Options
+------------------------
+
+These options control aspects of how qpdf reads PDF files. Mostly these
+are of use to people who are working with damaged files. There is little
+reason to use these options unless you are trying to solve specific
+problems. The following options are available:
+
+@1@option@1@--suppress-recovery@2@option@2@
+ Prevents qpdf from attempting to recover damaged files.
+
+@1@option@1@--ignore-xref-streams@2@option@2@
+ Tells qpdf to ignore any cross-reference streams.
+
+Ordinarily, qpdf will attempt to recover from certain types of errors in
+PDF files. These include errors in the cross-reference table, certain
+types of object numbering errors, and certain types of stream length
+errors. Sometimes, qpdf may think it has recovered but may not have
+actually recovered, so care should be taken when using this option as
+some data loss is possible. The
+@1@option@1@--suppress-recovery@2@option@2@ option will prevent qpdf
+from attempting recovery. In this case, it will fail on the first error
+that it encounters.
+
+Ordinarily, qpdf reads cross-reference streams when they are present in
+a PDF file. If @1@option@1@--ignore-xref-streams@2@option@2@ is
+specified, qpdf will ignore any cross-reference streams for hybrid PDF
+files. The purpose of hybrid files is to make some content available to
+viewers that are not aware of cross-reference streams. It is almost
+never desirable to ignore them. The only time when you might want to use
+this feature is if you are testing creation of hybrid PDF files and wish
+to see how a PDF consumer that doesn't understand object and
+cross-reference streams would interpret such a file.
+
+.. _ref.advanced-transformation:
+
+Advanced Transformation Options
+-------------------------------
+
+These transformation options control fine points of how qpdf creates the
+output file. Mostly these are of use only to people who are very
+familiar with the PDF file format or who are PDF developers. The
+following options are available:
+
+@1@option@1@--compress-streams=@1@replaceable@1@[yn]@2@replaceable@2@@2@option@2@
+ By default, or with @1@option@1@--compress-streams=y@2@option@2@,
+ qpdf will compress any stream with no other filters applied to it
+ with the ``/FlateDecode`` filter when it writes it. To suppress this
+ behavior and preserve uncompressed streams as uncompressed, use
+ @1@option@1@--compress-streams=n@2@option@2@.
+
+@1@option@1@--decode-level=@1@replaceable@1@option@2@replaceable@2@@2@option@2@
+ Controls which streams qpdf tries to decode. The default is
+ @1@option@1@generalized@2@option@2@. The following options are
+ available:
+
+ - @1@option@1@none@2@option@2@: do not attempt to decode any streams
+
+ - @1@option@1@generalized@2@option@2@: decode streams filtered with
+ supported generalized filters: ``/LZWDecode``, ``/FlateDecode``,
+ ``/ASCII85Decode``, and ``/ASCIIHexDecode``. We define generalized
+ filters as those to be used for general-purpose compression or
+ encoding, as opposed to filters specifically designed for image
+ data. Note that, by default, streams already compressed with
+ ``/FlateDecode`` are not uncompressed and recompressed unless you
+ also specify @1@option@1@--recompress-flate@2@option@2@.
+
+ - @1@option@1@specialized@2@option@2@: in addition to generalized,
+ decode streams with supported non-lossy specialized filters;
+ currently this is just ``/RunLengthDecode``
+
+ - @1@option@1@all@2@option@2@: in addition to generalized and
+ specialized, decode streams with supported lossy filters;
+ currently this is just ``/DCTDecode`` (JPEG)
+
+@1@option@1@--stream-data=@1@replaceable@1@option@2@replaceable@2@@2@option@2@
+ Controls transformation of stream data. This option predates the
+ @1@option@1@--compress-streams@2@option@2@ and
+ @1@option@1@--decode-level@2@option@2@ options. Those options can be
+ used to achieve the same affect with more control. The value of
+ @1@option@1@@1@replaceable@1@option@2@replaceable@2@@2@option@2@ may
+ be one of the following:
+
+ - @1@option@1@compress@2@option@2@: recompress stream data when
+ possible (default); equivalent to
+ @1@option@1@--compress-streams=y@2@option@2@
+ @1@option@1@--decode-level=generalized@2@option@2@. Does not
+ recompress streams already compressed with ``/FlateDecode`` unless
+ @1@option@1@--recompress-flate@2@option@2@ is also specified.
+
+ - @1@option@1@preserve@2@option@2@: leave all stream data as is;
+ equivalent to @1@option@1@--compress-streams=n@2@option@2@
+ @1@option@1@--decode-level=none@2@option@2@
+
+ - @1@option@1@uncompress@2@option@2@: uncompress stream data
+ compressed with generalized filters when possible; equivalent to
+ @1@option@1@--compress-streams=n@2@option@2@
+ @1@option@1@--decode-level=generalized@2@option@2@
+
+@1@option@1@--recompress-flate@2@option@2@
+ By default, streams already compressed with ``/FlateDecode`` are left
+ alone rather than being uncompressed and recompressed. This option
+ causes qpdf to uncompress and recompress the streams. There is a
+ significant performance cost to using this option, but you probably
+ want to use it if you specify
+ @1@option@1@--compression-level@2@option@2@.
+
+@1@option@1@--compression-level=@1@replaceable@1@level@2@replaceable@2@@2@option@2@
+ When writing new streams that are compressed with ``/FlateDecode``,
+ use the specified compression level. The value of
+ @1@option@1@level@2@option@2@ should be a number from 1 to 9 and is
+ passed directly to zlib, which implements deflate compression. Note
+ that qpdf doesn't uncompress and recompress streams by default. To
+ have this option apply to already compressed streams, you should also
+ specify @1@option@1@--recompress-flate@2@option@2@. If your goal is
+ to shrink the size of PDF files, you should also use
+ @1@option@1@--object-streams=generate@2@option@2@.
+
+@1@option@1@--normalize-content=[yn]@2@option@2@
+ Enables or disables normalization of content streams. Content
+ normalization is enabled by default in QDF mode. Please see `QDF
+ Mode <#ref.qdf>`__ for additional discussion of QDF mode.
+
+@1@option@1@--object-streams=@1@replaceable@1@mode@2@replaceable@2@@2@option@2@
+ Controls handling of object streams. The value of
+ @1@option@1@@1@replaceable@1@mode@2@replaceable@2@@2@option@2@ may be
+ one of the following:
+
+ - @1@option@1@preserve@2@option@2@: preserve original object streams
+ (default)
+
+ - @1@option@1@disable@2@option@2@: don't write any object streams
+
+ - @1@option@1@generate@2@option@2@: use object streams wherever
+ possible
+
+@1@option@1@--preserve-unreferenced@2@option@2@
+ Tells qpdf to preserve objects that are not referenced when writing
+ the file. Ordinarily any object that is not referenced in a traversal
+ of the document from the trailer dictionary will be discarded. This
+ may be useful in working with some damaged files or inspecting files
+ with known unreferenced objects.
+
+ This flag is ignored for linearized files and has the effect of
+ causing objects in the new file to be written in order by object ID
+ from the original file. This does not mean that object numbers will
+ be the same since qpdf may create stream lengths as direct or
+ indirect differently from the original file, and the original file
+ may have gaps in its numbering.
+
+ See also @1@option@1@--preserve-unreferenced-resources@2@option@2@,
+ which does something completely different.
+
+@1@option@1@--remove-unreferenced-resources=@1@replaceable@1@option@2@replaceable@2@@2@option@2@
+ The @1@replaceable@1@option@2@replaceable@2@ may be ``auto``,
+ ``yes``, or ``no``. The default is ``auto``.
+
+ Starting with qpdf 8.1, when splitting pages, qpdf is able to attempt
+ to remove images and fonts that are not used by a page even if they
+ are referenced in the page's resources dictionary. When shared
+ resources are in use, this behavior can greatly reduce the file sizes
+ of split pages, but the analysis is very slow. In versions from 8.1
+ through 9.1.1, qpdf did this analysis by default. Starting in qpdf
+ 10.0.0, if ``auto`` is used, qpdf does a quick analysis of the file
+ to determine whether the file is likely to have unreferenced objects
+ on pages, a pattern that frequently occurs when resource dictionaries
+ are shared across multiple pages and rarely occurs otherwise. If it
+ discovers this pattern, then it will attempt to remove unreferenced
+ resources. Usually this means you get the slower splitting speed only
+ when it's actually going to create smaller files. You can suppress
+ removal of unreferenced resources altogether by specifying ``no`` or
+ force it to do the full algorithm by specifying ``yes``.
+
+ Other than cases in which you don't care about file size and care a
+ lot about runtime, there are few reasons to use this option,
+ especially now that ``auto`` mode is supported. One reason to use
+ this is if you suspect that qpdf is removing resources it shouldn't
+ be removing. If you encounter that case, please report it as bug at
+ https://github.com/qpdf/qpdf/issues/.
+
+@1@option@1@--preserve-unreferenced-resources@2@option@2@
+ This is a synonym for
+ @1@option@1@--remove-unreferenced-resources=no@2@option@2@.
+
+ See also @1@option@1@--preserve-unreferenced@2@option@2@, which does
+ something completely different.
+
+@1@option@1@--newline-before-endstream@2@option@2@
+ Tells qpdf to insert a newline before the ``endstream`` keyword, not
+ counted in the length, after any stream content even if the last
+ character of the stream was a newline. This may result in two
+ newlines in some cases. This is a requirement of PDF/A. While qpdf
+ doesn't specifically know how to generate PDF/A-compliant PDFs, this
+ at least prevents it from removing compliance on already compliant
+ files.
+
+@1@option@1@--linearize-pass1=@1@replaceable@1@file@2@replaceable@2@@2@option@2@
+ Write the first pass of linearization to the named file. The
+ resulting file is not a valid PDF file. This option is useful only
+ for debugging ``QPDFWriter``'s linearization code. When qpdf
+ linearizes files, it writes the file in two passes, using the first
+ pass to calculate sizes and offsets that are required for hint tables
+ and the linearization dictionary. Ordinarily, the first pass is
+ discarded. This option enables it to be captured.
+
+@1@option@1@--coalesce-contents@2@option@2@
+ When a page's contents are split across multiple streams, this option
+ causes qpdf to combine them into a single stream. Use of this option
+ is never necessary for ordinary usage, but it can help when working
+ with some files in some cases. For example, this can also be combined
+ with QDF mode or content normalization to make it easier to look at
+ all of a page's contents at once.
+
+@1@option@1@--flatten-annotations=@1@replaceable@1@option@2@replaceable@2@@2@option@2@
+ This option collapses annotations into the pages' contents with
+ special handling for form fields. Ordinarily, an annotation is
+ rendered separately and on top of the page. Combining annotations
+ into the page's contents effectively freezes the placement of the
+ annotations, making them look right after various page
+ transformations. The library functionality backing this option was
+ added for the benefit of programs that want to create *n-up* page
+ layouts and other similar things that don't work well with
+ annotations. The @1@replaceable@1@option@2@replaceable@2@ parameter
+ may be any of the following:
+
+ - @1@option@1@all@2@option@2@: include all annotations that are not
+ marked invisible or hidden
+
+ - @1@option@1@print@2@option@2@: only include annotations that
+ indicate that they should appear when the page is printed
+
+ - @1@option@1@screen@2@option@2@: omit annotations that indicate
+ they should not appear on the screen
+
+ Note that form fields are special because the annotations that are
+ used to render filled-in form fields may become out of date from the
+ fields' values if the form is filled in by a program that doesn't
+ know how to update the appearances. If qpdf detects this case, its
+ default behavior is not to flatten those annotations because doing so
+ would cause the value of the form field to be lost. This gives you a
+ chance to go back and resave the form with a program that knows how
+ to generate appearances. QPDF itself can generate appearances with
+ some limitations. See the
+ @1@option@1@--generate-appearances@2@option@2@ option below.
+
+@1@option@1@--generate-appearances@2@option@2@
+ If a file contains interactive form fields and indicates that the
+ appearances are out of date with the values of the form, this flag
+ will regenerate appearances, subject to a few limitations. Note that
+ there is not usually a reason to do this, but it can be necessary
+ before using the @1@option@1@--flatten-annotations@2@option@2@
+ option. Most of these are not a problem with well-behaved PDF files.
+ The limitations are as follows:
+
+ - Radio button and checkbox appearances use the pre-set values in
+ the PDF file. QPDF just makes sure that the correct appearance is
+ displayed based on the value of the field. This is fine for PDF
+ files that create their forms properly. Some PDF writers save
+ appearances for fields when they change, which could cause some
+ controls to have inconsistent appearances.
+
+ - For text fields and list boxes, any characters that fall outside
+ of US-ASCII or, if detected, "Windows ANSI" or "Mac Roman"
+ encoding, will be replaced by the ``?`` character.
+
+ - Quadding is ignored. Quadding is used to specify whether the
+ contents of a field should be left, center, or right aligned with
+ the field.
+
+ - Rich text, multi-line, and other more elaborate formatting
+ directives are ignored.
+
+ - There is no support for multi-select fields or signature fields.
+
+ If qpdf doesn't do a good enough job with your form, use an external
+ application to save your filled-in form before processing it with
+ qpdf.
+
+@1@option@1@--optimize-images@2@option@2@
+ This flag causes qpdf to recompress all images that are not
+ compressed with DCT (JPEG) using DCT compression as long as doing so
+ decreases the size in bytes of the image data and the image does not
+ fall below minimum specified dimensions. Useful information is
+ provided when used in combination with
+ @1@option@1@--verbose@2@option@2@. See also the
+ @1@option@1@--oi-min-width@2@option@2@,
+ @1@option@1@--oi-min-height@2@option@2@, and
+ @1@option@1@--oi-min-area@2@option@2@ options. By default, starting
+ in qpdf 8.4, inline images are converted to regular images and
+ optimized as well. Use @1@option@1@--keep-inline-images@2@option@2@
+ to prevent inline images from being included.
+
+@1@option@1@--oi-min-width=@1@replaceable@1@width@2@replaceable@2@@2@option@2@
+ Avoid optimizing images whose width is below the specified amount. If
+ omitted, the default is 128 pixels. Use 0 for no minimum.
+
+@1@option@1@--oi-min-height=@1@replaceable@1@height@2@replaceable@2@@2@option@2@
+ Avoid optimizing images whose height is below the specified amount.
+ If omitted, the default is 128 pixels. Use 0 for no minimum.
+
+@1@option@1@--oi-min-area=@1@replaceable@1@area-in-pixels@2@replaceable@2@@2@option@2@
+ Avoid optimizing images whose pixel count (width × height) is below
+ the specified amount. If omitted, the default is 16,384 pixels. Use 0
+ for no minimum.
+
+@1@option@1@--externalize-inline-images@2@option@2@
+ Convert inline images to regular images. By default, images whose
+ data is at least 1,024 bytes are converted when this option is
+ selected. Use @1@option@1@--ii-min-bytes@2@option@2@ to change the
+ size threshold. This option is implicitly selected when
+ @1@option@1@--optimize-images@2@option@2@ is selected. Use
+ @1@option@1@--keep-inline-images@2@option@2@ to exclude inline images
+ from image optimization.
+
+@1@option@1@--ii-min-bytes=@1@replaceable@1@bytes@2@replaceable@2@@2@option@2@
+ Avoid converting inline images whose size is below the specified
+ minimum size to regular images. If omitted, the default is 1,024
+ bytes. Use 0 for no minimum.
+
+@1@option@1@--keep-inline-images@2@option@2@
+ Prevent inline images from being included in image optimization. This
+ option has no affect when @1@option@1@--optimize-images@2@option@2@
+ is not specified.
+
+@1@option@1@--remove-page-labels@2@option@2@
+ Remove page labels from the output file.
+
+@1@option@1@--qdf@2@option@2@
+ Turns on QDF mode. For additional information on QDF, please see `QDF
+ Mode <#ref.qdf>`__. Note that @1@option@1@--linearize@2@option@2@
+ disables QDF mode.
+
+@1@option@1@--min-version=@1@replaceable@1@version@2@replaceable@2@@2@option@2@
+ Forces the PDF version of the output file to be at least
+ @1@replaceable@1@version@2@replaceable@2@. In other words, if the
+ input file has a lower version than the specified version, the
+ specified version will be used. If the input file has a higher
+ version, the input file's original version will be used. It is seldom
+ necessary to use this option since qpdf will automatically increase
+ the version as needed when adding features that require newer PDF
+ readers.
+
+ The version number may be expressed in the form
+ @1@replaceable@1@major.minor.extension-level@2@replaceable@2@, in
+ which case the version is interpreted as
+ @1@replaceable@1@major.minor@2@replaceable@2@ at extension level
+ @1@replaceable@1@extension-level@2@replaceable@2@. For example,
+ version ``1.7.8`` represents version 1.7 at extension level 8. Note
+ that minimal syntax checking is done on the command line.
+
+@1@option@1@--force-version=@1@replaceable@1@version@2@replaceable@2@@2@option@2@
+ This option forces the PDF version to be the exact version specified
+ *even when the file may have content that is not supported in that
+ version*. The version number is interpreted in the same way as with
+ @1@option@1@--min-version@2@option@2@ so that extension levels can be
+ set. In some cases, forcing the output file's PDF version to be lower
+ than that of the input file will cause qpdf to disable certain
+ features of the document. Specifically, 256-bit keys are disabled if
+ the version is less than 1.7 with extension level 8 (except R5 is
+ disabled if less than 1.7 with extension level 3), AES encryption is
+ disabled if the version is less than 1.6, cleartext metadata and
+ object streams are disabled if less than 1.5, 128-bit encryption keys
+ are disabled if less than 1.4, and all encryption is disabled if less
+ than 1.3. Even with these precautions, qpdf won't be able to do
+ things like eliminate use of newer image compression schemes,
+ transparency groups, or other features that may have been added in
+ more recent versions of PDF.
+
+ As a general rule, with the exception of big structural things like
+ the use of object streams or AES encryption, PDF viewers are supposed
+ to ignore features in files that they don't support from newer
+ versions. This means that forcing the version to a lower version may
+ make it possible to open your PDF file with an older version, though
+ bear in mind that some of the original document's functionality may
+ be lost.
+
+By default, when a stream is encoded using non-lossy filters that qpdf
+understands and is not already compressed using a good compression
+scheme, qpdf will uncompress and recompress streams. Assuming proper
+filter implements, this is safe and generally results in smaller files.
+This behavior may also be explicitly requested with
+@1@option@1@--stream-data=compress@2@option@2@.
+
+When @1@option@1@--normalize-content=y@2@option@2@ is specified, qpdf
+will attempt to normalize whitespace and newlines in page content
+streams. This is generally safe but could, in some cases, cause damage
+to the content streams. This option is intended for people who wish to
+study PDF content streams or to debug PDF content. You should not use
+this for "production" PDF files.
+
+When normalizing content, if qpdf runs into any lexical errors, it will
+print a warning indicating that content may be damaged. The only
+situation in which qpdf is known to cause damage during content
+normalization is when a page's contents are split across multiple
+streams and streams are split in the middle of a lexical token such as a
+string, name, or inline image. Note that files that do this are invalid
+since the PDF specification states that content streams are not to be
+split in the middle of a token. If you want to inspect the original
+content streams in an uncompressed format, you can always run with
+@1@option@1@--qdf --normalize-content=n@2@option@2@ for a QDF file
+without content normalization, or alternatively
+@1@option@1@--stream-data=uncompress@2@option@2@ for a regular non-QDF
+mode file with uncompressed streams. These will both uncompress all the
+streams but will not attempt to normalize content. Please note that if
+you are using content normalization or QDF mode for the purpose of
+manually inspecting files, you don't have to care about this.
+
+Object streams, also known as compressed objects, were introduced into
+the PDF specification at version 1.5, corresponding to Acrobat 6. Some
+older PDF viewers may not support files with object streams. qpdf can be
+used to transform files with object streams to files without object
+streams or vice versa. As mentioned above, there are three object stream
+modes: @1@option@1@preserve@2@option@2@,
+@1@option@1@disable@2@option@2@, and @1@option@1@generate@2@option@2@.
+
+In @1@option@1@preserve@2@option@2@ mode, the relationship to objects
+and the streams that contain them is preserved from the original file.
+In @1@option@1@disable@2@option@2@ mode, all objects are written as
+regular, uncompressed objects. The resulting file should be readable by
+older PDF viewers. (Of course, the content of the files may include
+features not supported by older viewers, but at least the structure will
+be supported.) In @1@option@1@generate@2@option@2@ mode, qpdf will
+create its own object streams. This will usually result in more compact
+PDF files, though they may not be readable by older viewers. In this
+mode, qpdf will also make sure the PDF version number in the header is
+at least 1.5.
+
+The @1@option@1@--qdf@2@option@2@ flag turns on QDF mode, which changes
+some of the defaults described above. Specifically, in QDF mode, by
+default, stream data is uncompressed, content streams are normalized,
+and encryption is removed. These defaults can still be overridden by
+specifying the appropriate options as described above. Additionally, in
+QDF mode, stream lengths are stored as indirect objects, objects are
+laid out in a less efficient but more readable fashion, and the
+documents are interspersed with comments that make it easier for the
+user to find things and also make it possible for
+@1@command@1@fix-qdf@2@command@2@ to work properly. QDF mode is intended
+for people, mostly developers, who wish to inspect or modify PDF files
+in a text editor. For details, please see `QDF Mode <#ref.qdf>`__.
+
+.. _ref.testing-options:
+
+Testing, Inspection, and Debugging Options
+------------------------------------------
+
+These options can be useful for digging into PDF files or for use in
+automated test suites for software that uses the qpdf library. When any
+of the options in this section are specified, no output file should be
+given. The following options are available:
+
+@1@option@1@--deterministic-id@2@option@2@
+ Causes generation of a deterministic value for /ID. This prevents use
+ of timestamp and output file name information in the /ID generation.
+ Instead, at some slight additional runtime cost, the /ID field is
+ generated to include a digest of the significant parts of the content
+ of the output PDF file. This means that a given qpdf operation should
+ generate the same /ID each time it is run, which can be useful when
+ caching results or for generation of some test data. Use of this flag
+ is not compatible with creation of encrypted files.
+
+@1@option@1@--static-id@2@option@2@
+ Causes generation of a fixed value for /ID. This is intended for
+ testing only. Never use it for production files. If you are trying to
+ get the same /ID each time for a given file and you are not
+ generating encrypted files, consider using the
+ @1@option@1@--deterministic-id@2@option@2@ option.
+
+@1@option@1@--static-aes-iv@2@option@2@
+ Causes use of a static initialization vector for AES-CBC. This is
+ intended for testing only so that output files can be reproducible.
+ Never use it for production files. This option in particular is not
+ secure since it significantly weakens the encryption.
+
+@1@option@1@--no-original-object-ids@2@option@2@
+ Suppresses inclusion of original object ID comments in QDF files.
+ This can be useful when generating QDF files for test purposes,
+ particularly when comparing them to determine whether two PDF files
+ have identical content.
+
+@1@option@1@--show-encryption@2@option@2@
+ Shows document encryption parameters. Also shows the document's user
+ password if the owner password is given.
+
+@1@option@1@--show-encryption-key@2@option@2@
+ When encryption information is being displayed, as when
+ @1@option@1@--check@2@option@2@ or
+ @1@option@1@--show-encryption@2@option@2@ is given, display the
+ computed or retrieved encryption key as a hexadecimal string. This
+ value is not ordinarily useful to users, but it can be used as the
+ argument to @1@option@1@--password@2@option@2@ if the
+ @1@option@1@--password-is-hex-key@2@option@2@ is specified. Note
+ that, when PDF files are encrypted, passwords and other metadata are
+ used only to compute an encryption key, and the encryption key is
+ what is actually used for encryption. This enables retrieval of that
+ key.
+
+@1@option@1@--check-linearization@2@option@2@
+ Checks file integrity and linearization status.
+
+@1@option@1@--show-linearization@2@option@2@
+ Checks and displays all data in the linearization hint tables.
+
+@1@option@1@--show-xref@2@option@2@
+ Shows the contents of the cross-reference table in a human-readable
+ form. This is especially useful for files with cross-reference
+ streams which are stored in a binary format.
+
+@1@option@1@--show-object=trailer|obj[,gen]@2@option@2@
+ Show the contents of the given object. This is especially useful for
+ inspecting objects that are inside of object streams (also known as
+ "compressed objects").
+
+@1@option@1@--raw-stream-data@2@option@2@
+ When used along with the @1@option@1@--show-object@2@option@2@
+ option, if the object is a stream, shows the raw stream data instead
+ of object's contents.
+
+@1@option@1@--filtered-stream-data@2@option@2@
+ When used along with the @1@option@1@--show-object@2@option@2@
+ option, if the object is a stream, shows the filtered stream data
+ instead of object's contents. If the stream is filtered using filters
+ that qpdf does not support, an error will be issued.
+
+@1@option@1@--show-npages@2@option@2@
+ Prints the number of pages in the input file on a line by itself.
+ Since the number of pages appears by itself on a line, this option
+ can be useful for scripting if you need to know the number of pages
+ in a file.
+
+@1@option@1@--show-pages@2@option@2@
+ Shows the object and generation number for each page dictionary
+ object and for each content stream associated with the page. Having
+ this information makes it more convenient to inspect objects from a
+ particular page.
+
+@1@option@1@--with-images@2@option@2@
+ When used along with @1@option@1@--show-pages@2@option@2@, also shows
+ the object and generation numbers for the image objects on each page.
+ (At present, information about images in shared resource dictionaries
+ are not output by this command. This is discussed in a comment in the
+ source code.)
+
+@1@option@1@--json@2@option@2@
+ Generate a JSON representation of the file. This is described in
+ depth in `QPDF JSON <#ref.json>`__
+
+@1@option@1@--json-help@2@option@2@
+ Describe the format of the JSON output.
+
+@1@option@1@--json-key=key@2@option@2@
+ This option is repeatable. If specified, only top-level keys
+ specified will be included in the JSON output. If not specified, all
+ keys will be shown.
+
+@1@option@1@--json-object=trailer|obj[,gen]@2@option@2@
+ This option is repeatable. If specified, only specified objects will
+ be shown in the "``objects``" key of the JSON output. If absent, all
+ objects will be shown.
+
+@1@option@1@--check@2@option@2@
+ Checks file structure and well as encryption, linearization, and
+ encoding of stream data. A file for which
+ @1@option@1@--check@2@option@2@ reports no errors may still have
+ errors in stream data content but should otherwise be structurally
+ sound. If @1@option@1@--check@2@option@2@ any errors, qpdf will exit
+ with a status of 2. There are some recoverable conditions that
+ @1@option@1@--check@2@option@2@ detects. These are issued as warnings
+ instead of errors. If qpdf finds no errors but finds warnings, it
+ will exit with a status of 3 (as of version 2.0.4). When
+ @1@option@1@--check@2@option@2@ is combined with other options,
+ checks are always performed before any other options are processed.
+ For erroneous files, @1@option@1@--check@2@option@2@ will cause qpdf
+ to attempt to recover, after which other options are effectively
+ operating on the recovered file. Combining
+ @1@option@1@--check@2@option@2@ with other options in this way can be
+ useful for manually recovering severely damaged files. Note that
+ @1@option@1@--check@2@option@2@ produces no output to standard output
+ when everything is valid, so if you are using this to
+ programmatically validate files in bulk, it is safe to run without
+ output redirected to @1@filename@1@/dev/null@2@filename@2@ and just
+ check for a 0 exit code.
+
+The @1@option@1@--raw-stream-data@2@option@2@ and
+@1@option@1@--filtered-stream-data@2@option@2@ options are ignored
+unless @1@option@1@--show-object@2@option@2@ is given. Either of these
+options will cause the stream data to be written to standard output. In
+order to avoid commingling of stream data with other output, it is
+recommend that these objects not be combined with other test/inspection
+options.
+
+If @1@option@1@--filtered-stream-data@2@option@2@ is given and
+@1@option@1@--normalize-content=y@2@option@2@ is also given, qpdf will
+attempt to normalize the stream data as if it is a page content stream.
+This attempt will be made even if it is not a page content stream, in
+which case it will produce unusable results.
+
+.. _ref.unicode-passwords:
+
+Unicode Passwords
+-----------------
+
+At the library API level, all methods that perform encryption and
+decryption interpret passwords as strings of bytes. It is up to the
+caller to ensure that they are appropriately encoded. Starting with qpdf
+version 8.4.0, qpdf will attempt to make this easier for you when
+interact with qpdf via its command line interface. The PDF specification
+requires passwords used to encrypt files with 40-bit or 128-bit
+encryption to be encoded with PDF Doc encoding. This encoding is a
+single-byte encoding that supports ISO-Latin-1 and a handful of other
+commonly used characters. It has a large overlap with Windows ANSI but
+is not exactly the same. There is generally not a way to provide PDF Doc
+encoded strings on the command line. As such, qpdf versions prior to
+8.4.0 would often create PDF files that couldn't be opened with other
+software when given a password with non-ASCII characters to encrypt a
+file with 40-bit or 128-bit encryption. Starting with qpdf 8.4.0, qpdf
+recognizes the encoding of the parameter and transcodes it as needed.
+The rest of this section provides the details about exactly how qpdf
+behaves. Most users will not need to know this information, but it might
+be useful if you have been working around qpdf's old behavior or if you
+are using qpdf to generate encrypted files for testing other PDF
+software.
+
+A note about Windows: when qpdf builds, it attempts to determine what it
+has to do to use ``wmain`` instead of ``main`` on Windows. The ``wmain``
+function is an alternative entry point that receives all arguments as
+UTF-16-encoded strings. When qpdf starts up this way, it converts all
+the strings to UTF-8 encoding and then invokes the regular main. This
+means that, as far as qpdf is concerned, it receives its command-line
+arguments with UTF-8 encoding, just as it would in any modern Linux or
+UNIX environment.
+
+If a file is being encrypted with 40-bit or 128-bit encryption and the
+supplied password is not a valid UTF-8 string, qpdf will fall back to
+the behavior of interpreting the password as a string of bytes. If you
+have old scripts that encrypt files by passing the output of
+@1@command@1@iconv@2@command@2@ to qpdf, you no longer need to do that,
+but if you do, qpdf should still work. The only exception would be for
+the extremely unlikely case of a password that is encoded with a
+single-byte encoding but also happens to be valid UTF-8. Such a password
+would contain strings of even numbers of characters that alternate
+between accented letters and symbols. In the extremely unlikely event
+that you are intentionally using such passwords and qpdf is thwarting
+you by interpreting them as UTF-8, you can use
+@1@option@1@--password-mode=bytes@2@option@2@ to suppress qpdf's
+automatic behavior.
+
+The @1@option@1@--password-mode@2@option@2@ option, as described earlier
+in this chapter, can be used to change qpdf's interpretation of supplied
+passwords. There are very few reasons to use this option. One would be
+the unlikely case described in the previous paragraph in which the
+supplied password happens to be valid UTF-8 but isn't supposed to be
+UTF-8. Your best bet would be just to provide the password as a valid
+UTF-8 string, but you could also use
+@1@option@1@--password-mode=bytes@2@option@2@. Another reason to use
+@1@option@1@--password-mode=bytes@2@option@2@ would be to intentionally
+generate PDF files encrypted with passwords that are not properly
+encoded. The qpdf test suite does this to generate invalid files for the
+purpose of testing its password recovery capability. If you were trying
+to create intentionally incorrect files for a similar purposes, the
+@1@option@1@bytes@2@option@2@ password mode can enable you to do this.
+
+When qpdf attempts to decrypt a file with a password that contains
+non-ASCII characters, it will generate a list of alternative passwords
+by attempting to interpret the password as each of a handful of
+different coding systems and then transcode them to the required format.
+This helps to compensate for the supplied password being given in the
+wrong coding system, such as would happen if you used the
+@1@command@1@iconv@2@command@2@ workaround that was previously needed.
+It also generates passwords by doing the reverse operation: translating
+from correct in incorrect encoding of the password. This would enable
+qpdf to decrypt files using passwords that were improperly encoded by
+whatever software encrypted the files, including older versions of qpdf
+invoked without properly encoded passwords. The combination of these two
+recovery methods should make qpdf transparently open most encrypted
+files with the password supplied correctly but in the wrong coding
+system. There are no real downsides to this behavior, but if you don't
+want qpdf to do this, you can use the
+@1@option@1@--suppress-password-recovery@2@option@2@ option. One reason
+to do that is to ensure that you know the exact password that was used
+to encrypt the file.
+
+With these changes, qpdf now generates compliant passwords in most
+cases. There are still some exceptions. In particular, the PDF
+specification directs compliant writers to normalize Unicode passwords
+and to perform certain transformations on passwords with bidirectional
+text. Implementing this functionality requires using a real Unicode
+library like ICU. If a client application that uses qpdf wants to do
+this, the qpdf library will accept the resulting passwords, but qpdf
+will not perform these transformations itself. It is possible that this
+will be addressed in a future version of qpdf. The ``QPDFWriter``
+methods that enable encryption on the output file accept passwords as
+strings of bytes.
+
+Please note that the @1@option@1@--password-is-hex-key@2@option@2@
+option is unrelated to all this. This flag bypasses the normal process
+of going from password to encryption string entirely, allowing the raw
+encryption key to be specified directly. This is useful for forensic
+purposes or for brute-force recovery of files with unknown passwords.
+
+.. _ref.qdf:
+
+QDF Mode
+========
+
+In QDF mode, qpdf creates PDF files in what we call @1@firstterm@1@QDF
+form@2@firstterm@2@. A PDF file in QDF form, sometimes called a QDF
+file, is a completely valid PDF file that has ``%QDF-1.0`` as its third
+line (after the pdf header and binary characters) and has certain other
+characteristics. The purpose of QDF form is to make it possible to edit
+PDF files, with some restrictions, in an ordinary text editor. This can
+be very useful for experimenting with different PDF constructs or for
+making one-off edits to PDF files (though there are other reasons why
+this may not always work). Note that QDF mode does not support
+linearized files. If you enable linearization, QDF mode is automatically
+disabled.
+
+It is ordinarily very difficult to edit PDF files in a text editor for
+two reasons: most meaningful data in PDF files is compressed, and PDF
+files are full of offset and length information that makes it hard to
+add or remove data. A QDF file is organized in a manner such that, if
+edits are kept within certain constraints, the
+@1@command@1@fix-qdf@2@command@2@ program, distributed with qpdf, is
+able to restore edited files to a correct state. The
+@1@command@1@fix-qdf@2@command@2@ program takes no command-line
+arguments. It reads a possibly edited QDF file from standard input and
+writes a repaired file to standard output.
+
+The following attributes characterize a QDF file:
+
+- All objects appear in numerical order in the PDF file, including when
+ objects appear in object streams.
+
+- Objects are printed in an easy-to-read format, and all line endings
+ are normalized to UNIX line endings.
+
+- Unless specifically overridden, streams appear uncompressed (when
+ qpdf supports the filters and they are compressed with a non-lossy
+ compression scheme), and most content streams are normalized (line
+ endings are converted to just a UNIX-style linefeeds).
+
+- All streams lengths are represented as indirect objects, and the
+ stream length object is always the next object after the stream. If
+ the stream data does not end with a newline, an extra newline is
+ inserted, and a special comment appears after the stream indicating
+ that this has been done.
+
+- If the PDF file contains object streams, if object stream *n*
+ contains *k* objects, those objects are numbered from *n+1* through
+ *n+k*, and the object number/offset pairs appear on a separate line
+ for each object. Additionally, each object in the object stream is
+ preceded by a comment indicating its object number and index. This
+ makes it very easy to find objects in object streams.
+
+- All beginnings of objects, ``stream`` tokens, ``endstream`` tokens,
+ and ``endobj`` tokens appear on lines by themselves. A blank line
+ follows every ``endobj`` token.
+
+- If there is a cross-reference stream, it is unfiltered.
+
+- Page dictionaries and page content streams are marked with special
+ comments that make them easy to find.
+
+- Comments precede each object indicating the object number of the
+ corresponding object in the original file.
+
+When editing a QDF file, any edits can be made as long as the above
+constraints are maintained. This means that you can freely edit a page's
+content without worrying about messing up the QDF file. It is also
+possible to add new objects so long as those objects are added after the
+last object in the file or subsequent objects are renumbered. If a QDF
+file has object streams in it, you can always add the new objects before
+the xref stream and then change the number of the xref stream, since
+nothing generally ever references it by number.
+
+It is not generally practical to remove objects from QDF files without
+messing up object numbering, but if you remove all references to an
+object, you can run qpdf on the file (after running
+@1@command@1@fix-qdf@2@command@2@), and qpdf will omit the now-orphaned
+object.
+
+When @1@command@1@fix-qdf@2@command@2@ is run, it goes through the file
+and recomputes the following parts of the file:
+
+- the ``/N``, ``/W``, and ``/First`` keys of all object stream
+ dictionaries
+
+- the pairs of numbers representing object numbers and offsets of
+ objects in object streams
+
+- all stream lengths
+
+- the cross-reference table or cross-reference stream
+
+- the offset to the cross-reference table or cross-reference stream
+ following the ``startxref`` token
+
+.. _ref.using-library:
+
+Using the QPDF Library
+======================
+
+.. _ref.using.from-cxx:
+
+Using QPDF from C++
+-------------------
+
+The source tree for the qpdf package has an
+@1@filename@1@examples@2@filename@2@ directory that contains a few
+example programs. The @1@filename@1@qpdf/qpdf.cc@2@filename@2@ source
+file also serves as a useful example since it exercises almost all of
+the qpdf library's public interface. The best source of documentation on
+the library itself is reading comments in
+@1@filename@1@include/qpdf/QPDF.hh@2@filename@2@,
+@1@filename@1@include/qpdf/QPDFWriter.hh@2@filename@2@, and
+@1@filename@1@include/qpdf/QPDFObjectHandle.hh@2@filename@2@.
+
+All header files are installed in the
+@1@filename@1@include/qpdf@2@filename@2@ directory. It is recommend that
+you use ``#include
+ <qpdf/QPDF.hh>`` rather than adding
+@1@filename@1@include/qpdf@2@filename@2@ to your include path.
+
+When linking against the qpdf static library, you may also need to
+specify ``-lz -ljpeg`` on your link command. If your system understands
+how to read libtool @1@filename@1@.la@2@filename@2@ files, this may not
+be necessary.
+
+The qpdf library is safe to use in a multithreaded program, but no
+individual ``QPDF`` object instance (including ``QPDF``,
+``QPDFObjectHandle``, or ``QPDFWriter``) can be used in more than one
+thread at a time. Multiple threads may simultaneously work with
+different instances of these and all other QPDF objects.
+
+.. _ref.using.other-languages:
+
+Using QPDF from other languages
+-------------------------------
+
+The qpdf library is implemented in C++, which makes it hard to use
+directly in other languages. There are a few things that can help.
+
+"C"
+ The qpdf library includes a "C" language interface that provides a
+ subset of the overall capabilities. The header file
+ @1@filename@1@qpdf/qpdf-c.h@2@filename@2@ includes information about
+ its use. As long as you use a C++ linker, you can link C programs
+ with qpdf and use the C API. For languages that can directly load
+ methods from a shared library, the C API can also be useful. People
+ have reported success using the C API from other languages on Windows
+ by directly calling functions in the DLL.
+
+Python
+ A Python module called
+ `pikepdf <https://pypi.org/project/pikepdf/>`__ provides a clean and
+ highly functional set of Python bindings to the qpdf library. Using
+ pikepdf, you can work with PDF files in a natural way and combine
+ qpdf's capabilities with other functionality provided by Python's
+ rich standard library and available modules.
+
+Other Languages
+ Starting with version 8.3.0, the @1@command@1@qpdf@2@command@2@
+ command-line tool can produce a JSON representation of the PDF file's
+ non-content data. This can facilitate interacting programmatically
+ with PDF files through qpdf's command line interface. For more
+ information, please see `QPDF JSON <#ref.json>`__.
+
+.. _ref.unicode-files:
+
+A Note About Unicode File Names
+-------------------------------
+
+When strings are passed to qpdf library routines either as ``char*`` or
+as ``std::string``, they are treated as byte arrays except where
+otherwise noted. When Unicode is desired, qpdf wants UTF-8 unless
+otherwise noted in comments in header files. In modern UNIX/Linux
+environments, this generally does the right thing. In Windows, it's a
+bit more complicated. Starting in qpdf 8.4.0, passwords that contain
+Unicode characters are handled much better, and starting in qpdf 8.4.1,
+the library attempts to properly handle Unicode characters in filenames.
+In particular, in Windows, if a UTF-8 encoded string is used as a
+filename in either ``QPDF`` or ``QPDFWriter``, it is internally
+converted to ``wchar_t*``, and Unicode-aware Windows APIs are used. As
+such, qpdf will generally operate properly on files with non-ASCII
+characters in their names as long as the filenames are UTF-8 encoded for
+passing into the qpdf library API, but there are still some rough edges,
+such as the encoding of the filenames in error messages our CLI output
+messages. Patches or bug reports are welcome for any continuing issues
+with Unicode file names in Windows.
+
+.. _ref.weak-crypto:
+
+Weak Cryptography
+=================
+
+Start with version 10.4, qpdf is taking steps to reduce the likelihood
+of a user *accidentally* creating PDF files with insecure cryptography
+but will continue to allow creation of such files indefinitely with
+explicit acknowledgment.
+
+The PDF file format makes use of RC4, which is known to be a weak
+cryptography algorithm, and MD5, which is a weak hashing algorithm. In
+version 10.4, qpdf generates warnings for some (but not all) cases of
+writing files with weak cryptography when invoked from the command-line.
+These warnings can be suppressed using the
+@1@option@1@--allow-weak-crypto@2@option@2@ option.
+
+It is planned for qpdf version 11 to be stricter, making it an error to
+write files with insecure cryptography from the command-line tool in
+most cases without specifying the
+@1@option@1@--allow-weak-crypto@2@option@2@ flag and also to require
+explicit steps when using the C++ library to enable use of insecure
+cryptography.
+
+Note that qpdf must always retain support for weak cryptographic
+algorithms since this is required for reading older PDF files that use
+it. Additionally, qpdf will always retain the ability to create files
+using weak cryptographic algorithms since, as a development tool, qpdf
+explicitly supports creating older or deprecated types of PDF files
+since these are sometimes needed to test or work with older versions of
+software. Even if other cryptography libraries drop support for RC4 or
+MD5, qpdf can always fall back to its internal implementations of those
+algorithms, so they are not going to disappear from qpdf.
+
+.. _ref.json:
+
+QPDF JSON
+=========
+
+.. _ref.json-overview:
+
+Overview
+--------
+
+Beginning with qpdf version 8.3.0, the @1@command@1@qpdf@2@command@2@
+command-line program can produce a JSON representation of the
+non-content data in a PDF file. It includes a dump in JSON format of all
+objects in the PDF file excluding the content of streams. This JSON
+representation makes it very easy to look in detail at the structure of
+a given PDF file, and it also provides a great way to work with PDF
+files programmatically from the command-line in languages that can't
+call or link with the qpdf library directly. Note that stream data can
+be extracted from PDF files using other qpdf command-line options.
+
+.. _ref.json-guarantees:
+
+JSON Guarantees
+---------------
+
+The qpdf JSON representation includes a JSON serialization of the raw
+objects in the PDF file as well as some computed information in a more
+easily extracted format. QPDF provides some guarantees about its JSON
+format. These guarantees are designed to simplify the experience of a
+developer working with the JSON format.
+
+Compatibility
+ The top-level JSON object output is a dictionary. The JSON output
+ contains various nested dictionaries and arrays. With the exception
+ of dictionaries that are populated by the fields of objects from the
+ file, all instances of a dictionary are guaranteed to have exactly
+ the same keys. Future versions of qpdf are free to add additional
+ keys but not to remove keys or change the type of object that a key
+ points to. The qpdf program validates this guarantee, and in the
+ unlikely event that a bug in qpdf should cause it to generate data
+ that doesn't conform to this rule, it will ask you to file a bug
+ report.
+
+ The top-level JSON structure contains a "``version``" key whose value
+ is simple integer. The value of the ``version`` key will be
+ incremented if a non-compatible change is made. A non-compatible
+ change would be any change that involves removal of a key, a change
+ to the format of data pointed to by a key, or a semantic change that
+ requires a different interpretation of a previously existing key. A
+ strong effort will be made to avoid breaking compatibility.
+
+Documentation
+ The @1@command@1@qpdf@2@command@2@ command can be invoked with the
+ @1@option@1@--json-help@2@option@2@ option. This will output a JSON
+ structure that has the same structure as the JSON output that qpdf
+ generates, except that each field in the help output is a description
+ of the corresponding field in the JSON output. The specific
+ guarantees are as follows:
+
+ - A dictionary in the help output means that the corresponding
+ location in the actual JSON output is also a dictionary with
+ exactly the same keys; that is, no keys present in help are absent
+ in the real output, and no keys will be present in the real output
+ that are not in help. As a special case, if the dictionary has a
+ single key whose name starts with ``<`` and ends with ``>``, it
+ means that the JSON output is a dictionary that can have any keys,
+ each of which conforms to the value of the special key. This is
+ used for cases in which the keys of the dictionary are things like
+ object IDs.
+
+ - A string in the help output is a description of the item that
+ appears in the corresponding location of the actual output. The
+ corresponding output can have any format.
+
+ - An array in the help output always contains a single element. It
+ indicates that the corresponding location in the actual output is
+ also an array, and that each element of the array has whatever
+ format is implied by the single element of the help output's
+ array.
+
+ For example, the help output indicates includes a "``pagelabels``"
+ key whose value is an array of one element. That element is a
+ dictionary with keys "``index``" and "``label``". In addition to
+ describing the meaning of those keys, this tells you that the actual
+ JSON output will contain a ``pagelabels`` array, each of whose
+ elements is a dictionary that contains an ``index`` key, a ``label``
+ key, and no other keys.
+
+Directness and Simplicity
+ The JSON output contains the value of every object in the file, but
+ it also contains some processed data. This is analogous to how qpdf's
+ library interface works. The processed data is similar to the helper
+ functions in that it allows you to look at certain aspects of the PDF
+ file without having to understand all the nuances of the PDF
+ specification, while the raw objects allow you to mine the PDF for
+ anything that the higher-level interfaces are lacking.
+
+.. _json.limitations:
+
+Limitations of JSON Representation
+----------------------------------
+
+There are a few limitations to be aware of with the JSON structure:
+
+- Strings, names, and indirect object references in the original PDF
+ file are all converted to strings in the JSON representation. In the
+ case of a "normal" PDF file, you can tell the difference because a
+ name starts with a slash (``/``), and an indirect object reference
+ looks like ``n n R``, but if there were to be a string that looked
+ like a name or indirect object reference, there would be no way to
+ tell this from the JSON output. Note that there are certain cases
+ where you know for sure what something is, such as knowing that
+ dictionary keys in objects are always names and that certain things
+ in the higher-level computed data are known to contain indirect
+ object references.
+
+- The JSON format doesn't support binary data very well. Mostly the
+ details are not important, but they are presented here for
+ information. When qpdf outputs a string in the JSON representation,
+ it converts the string to UTF-8, assuming usual PDF string semantics.
+ Specifically, if the original string is UTF-16, it is converted to
+ UTF-8. Otherwise, it is assumed to have PDF doc encoding, and is
+ converted to UTF-8 with that assumption. This causes strange things
+ to happen to binary strings. For example, if you had the binary
+ string ``<038051>``, this would be output to the JSON as ``\u0003•Q``
+ because ``03`` is not a printable character and ``80`` is the bullet
+ character in PDF doc encoding and is mapped to the Unicode value
+ ``2022``. Since ``51`` is ``Q``, it is output as is. If you wanted to
+ convert back from here to a binary string, would have to recognize
+ Unicode values whose code points are higher than ``0xFF`` and map
+ those back to their corresponding PDF doc encoding characters. There
+ is no way to tell the difference between a Unicode string that was
+ originally encoded as UTF-16 or one that was converted from PDF doc
+ encoding. In other words, it's best if you don't try to use the JSON
+ format to extract binary strings from the PDF file, but if you really
+ had to, it could be done. Note that qpdf's
+ @1@option@1@--show-object@2@option@2@ option does not have this
+ limitation and will reveal the string as encoded in the original
+ file.
+
+.. _json.considerations:
+
+JSON: Special Considerations
+----------------------------
+
+For the most part, the built-in JSON help tells you everything you need
+to know about the JSON format, but there are a few non-obvious things to
+be aware of:
+
+- While qpdf guarantees that keys present in the help will be present
+ in the output, those fields may be null or empty if the information
+ is not known or absent in the file. Also, if you specify
+ @1@option@1@--json-keys@2@option@2@, the keys that are not listed
+ will be excluded entirely except for those that
+ @1@option@1@--json-help@2@option@2@ says are always present.
+
+- In a few places, there are keys with names containing
+ ``pageposfrom1``. The values of these keys are null or an integer. If
+ an integer, they point to a page index within the file numbering from
+ 1. Note that JSON indexes from 0, and you would also use 0-based
+ indexing using the API. However, 1-based indexing is easier in this
+ case because the command-line syntax for specifying page ranges is
+ 1-based. If you were going to write a program that looked through the
+ JSON for information about specific pages and then use the
+ command-line to extract those pages, 1-based indexing is easier.
+ Besides, it's more convenient to subtract 1 from a program in a real
+ programming language than it is to add 1 from shell code.
+
+- The image information included in the ``page`` section of the JSON
+ output includes the key "``filterable``". Note that the value of this
+ field may depend on the @1@option@1@--decode-level@2@option@2@ that
+ you invoke qpdf with. The JSON output includes a top-level key
+ "``parameters``" that indicates the decode level used for computing
+ whether a stream was filterable. For example, jpeg images will be
+ shown as not filterable by default, but they will be shown as
+ filterable if you run @1@command@1@qpdf --json
+ --decode-level=all@2@command@2@.
+
+.. _ref.design:
+
+Design and Library Notes
+========================
+
+.. _ref.design.intro:
+
+Introduction
+------------
+
+This section was written prior to the implementation of the qpdf package
+and was subsequently modified to reflect the implementation. In some
+cases, for purposes of explanation, it may differ slightly from the
+actual implementation. As always, the source code and test suite are
+authoritative. Even if there are some errors, this document should serve
+as a road map to understanding how this code works.
+
+In general, one should adhere strictly to a specification when writing
+but be liberal in reading. This way, the product of our software will be
+accepted by the widest range of other programs, and we will accept the
+widest range of input files. This library attempts to conform to that
+philosophy whenever possible but also aims to provide strict checking
+for people who want to validate PDF files. If you don't want to see
+warnings and are trying to write something that is tolerant, you can
+call ``setSuppressWarnings(true)``. If you want to fail on the first
+error, you can call ``setAttemptRecovery(false)``. The default behavior
+is to generating warnings for recoverable problems. Note that recovery
+will not always produce the desired results even if it is able to get
+through the file. Unlike most other PDF files that produce generic
+warnings such as "This file is damaged,", qpdf generally issues a
+detailed error message that would be most useful to a PDF developer.
+This is by design as there seems to be a shortage of PDF validation
+tools out there. This was, in fact, one of the major motivations behind
+the initial creation of qpdf.
+
+.. _ref.design-goals:
+
+Design Goals
+------------
+
+The QPDF package includes support for reading and rewriting PDF files.
+It aims to hide from the user details involving object locations,
+modified (appended) PDF files, the directness/indirectness of objects,
+and stream filters including encryption. It does not aim to hide
+knowledge of the object hierarchy or content stream contents. Put
+another way, a user of the qpdf library is expected to have knowledge
+about how PDF files work, but is not expected to have to keep track of
+bookkeeping details such as file positions.
+
+A user of the library never has to care whether an object is direct or
+indirect, though it is possible to determine whether an object is direct
+or not if this information is needed. All access to objects deals with
+this transparently. All memory management details are also handled by
+the library.
+
+The ``PointerHolder`` object is used internally by the library to deal
+with memory management. This is basically a smart pointer object very
+similar in spirit to C++-11's ``std::shared_ptr`` object, but predating
+it by several years. This library also makes use of a technique for
+giving fine-grained access to methods in one class to other classes by
+using public subclasses with friends and only private members that in
+turn call private methods of the containing class. See
+``QPDFObjectHandle::Factory`` as an example.
+
+The top-level qpdf class is ``QPDF``. A ``QPDF`` object represents a PDF
+file. The library provides methods for both accessing and mutating PDF
+files.
+
+The primary class for interacting with PDF objects is
+``QPDFObjectHandle``. Instances of this class can be passed around by
+value, copied, stored in containers, etc. with very low overhead.
+Instances of ``QPDFObjectHandle`` created by reading from a file will
+always contain a reference back to the ``QPDF`` object from which they
+were created. A ``QPDFObjectHandle`` may be direct or indirect. If
+indirect, the ``QPDFObject`` the ``PointerHolder`` initially points to
+is a null pointer. In this case, the first attempt to access the
+underlying ``QPDFObject`` will result in the ``QPDFObject`` being
+resolved via a call to the referenced ``QPDF`` instance. This makes it
+essentially impossible to make coding errors in which certain things
+will work for some PDF files and not for others based on which objects
+are direct and which objects are indirect.
+
+Instances of ``QPDFObjectHandle`` can be directly created and modified
+using static factory methods in the ``QPDFObjectHandle`` class. There
+are factory methods for each type of object as well as a convenience
+method ``QPDFObjectHandle::parse`` that creates an object from a string
+representation of the object. Existing instances of ``QPDFObjectHandle``
+can also be modified in several ways. See comments in
+@1@filename@1@QPDFObjectHandle.hh@2@filename@2@ for details.
+
+An instance of ``QPDF`` is constructed by using the class's default
+constructor. If desired, the ``QPDF`` object may be configured with
+various methods that change its default behavior. Then the
+``QPDF::processFile()`` method is passed the name of a PDF file, which
+permanently associates the file with that QPDF object. A password may
+also be given for access to password-protected files. QPDF does not
+enforce encryption parameters and will treat user and owner passwords
+equivalently. Either password may be used to access an encrypted file.
+``QPDF`` will allow recovery of a user password given an owner password.
+The input PDF file must be seekable. (Output files written by
+``QPDFWriter`` need not be seekable, even when creating linearized
+files.) During construction, ``QPDF`` validates the PDF file's header,
+and then reads the cross reference tables and trailer dictionaries. The
+``QPDF`` class keeps only the first trailer dictionary though it does
+read all of them so it can check the ``/Prev`` key. ``QPDF`` class users
+may request the root object and the trailer dictionary specifically. The
+cross reference table is kept private. Objects may then be requested by
+number of by walking the object tree.
+
+When a PDF file has a cross-reference stream instead of a
+cross-reference table and trailer, requesting the document's trailer
+dictionary returns the stream dictionary from the cross-reference stream
+instead.
+
+There are some convenience routines for very common operations such as
+walking the page tree and returning a vector of all page objects. For
+full details, please see the header files
+@1@filename@1@QPDF.hh@2@filename@2@ and
+@1@filename@1@QPDFObjectHandle.hh@2@filename@2@. There are also some
+additional helper classes that provide higher level API functions for
+certain document constructions. These are discussed in `Helper
+Classes <#ref.helper-classes>`__.
+
+.. _ref.helper-classes:
+
+Helper Classes
+--------------
+
+QPDF version 8.1 introduced the concept of helper classes. Helper
+classes are intended to contain higher level APIs that allow developers
+to work with certain document constructs at an abstraction level above
+that of ``QPDFObjectHandle`` while staying true to qpdf's philosophy of
+not hiding document structure from the developer. As with qpdf in
+general, the goal is take away some of the more tedious bookkeeping
+aspects of working with PDF files, not to remove the need for the
+developer to understand how the PDF construction in question works. The
+driving factor behind the creation of helper classes was to allow the
+evolution of higher level interfaces in qpdf without polluting the
+interfaces of the main top-level classes ``QPDF`` and
+``QPDFObjectHandle``.
+
+There are two kinds of helper classes: *document* helpers and *object*
+helpers. Document helpers are constructed with a reference to a ``QPDF``
+object and provide methods for working with structures that are at the
+document level. Object helpers are constructed with an instance of a
+``QPDFObjectHandle`` and provide methods for working with specific types
+of objects.
+
+Examples of document helpers include ``QPDFPageDocumentHelper``, which
+contains methods for operating on the document's page trees, such as
+enumerating all pages of a document and adding and removing pages; and
+``QPDFAcroFormDocumentHelper``, which contains document-level methods
+related to interactive forms, such as enumerating form fields and
+creating mappings between form fields and annotations.
+
+Examples of object helpers include ``QPDFPageObjectHelper`` for
+performing operations on pages such as page rotation and some operations
+on content streams, ``QPDFFormFieldObjectHelper`` for performing
+operations related to interactive form fields, and
+``QPDFAnnotationObjectHelper`` for working with annotations.
+
+It is always possible to retrieve the underlying ``QPDF`` reference from
+a document helper and the underlying ``QPDFObjectHandle`` reference from
+an object helper. Helpers are designed to be helpers, not wrappers. The
+intention is that, in general, it is safe to freely intermix operations
+that use helpers with operations that use the underlying objects.
+Document and object helpers do not attempt to provide a complete
+interface for working with the things they are helping with, nor do they
+attempt to encapsulate underlying structures. They just provide a few
+methods to help with error-prone, repetitive, or complex tasks. In some
+cases, a helper object may cache some information that is expensive to
+gather. In such cases, the helper classes are implemented so that their
+own methods keep the cache consistent, and the header file will provide
+a method to invalidate the cache and a description of what kinds of
+operations would make the cache invalid. If in doubt, you can always
+discard a helper class and create a new one with the same underlying
+objects, which will ensure that you have discarded any stale
+information.
+
+By Convention, document helpers are called
+``QPDFSomethingDocumentHelper`` and are derived from
+``QPDFDocumentHelper``, and object helpers are called
+``QPDFSomethingObjectHelper`` and are derived from ``QPDFObjectHelper``.
+For details on specific helpers, please see their header files. You can
+find them by looking at
+@1@filename@1@include/qpdf/QPDF*DocumentHelper.hh@2@filename@2@ and
+@1@filename@1@include/qpdf/QPDF*ObjectHelper.hh@2@filename@2@.
+
+In order to avoid creation of circular dependencies, the following
+general guidelines are followed with helper classes:
+
+- Core class interfaces do not know about helper classes. For example,
+ no methods of ``QPDF`` or ``QPDFObjectHandle`` will include helper
+ classes in their interfaces.
+
+- Interfaces of object helpers will usually not use document helpers in
+ their interfaces. This is because it is much more useful for document
+ helpers to have methods that return object helpers. Most operations
+ in PDF files start at the document level and go from there to the
+ object level rather than the other way around. It can sometimes be
+ useful to map back from object-level structures to document-level
+ structures. If there is a desire to do this, it will generally be
+ provided by a method in the document helper class.
+
+- Most of the time, object helpers don't know about other object
+ helpers. However, in some cases, one type of object may be a
+ container for another type of object, in which case it may make sense
+ for the outer object to know about the inner object. For example,
+ there are methods in the ``QPDFPageObjectHelper`` that know
+ ``QPDFAnnotationObjectHelper`` because references to annotations are
+ contained in page dictionaries.
+
+- Any helper or core library class may use helpers in their
+ implementations.
+
+Prior to qpdf version 8.1, higher level interfaces were added as
+"convenience functions" in either ``QPDF`` or ``QPDFObjectHandle``. For
+compatibility, older convenience functions for operating with pages will
+remain in those classes even as alternatives are provided in helper
+classes. Going forward, new higher level interfaces will be provided
+using helper classes.
+
+.. _ref.implementation-notes:
+
+Implementation Notes
+--------------------
+
+This section contains a few notes about QPDF's internal implementation,
+particularly around what it does when it first processes a file. This
+section is a bit of a simplification of what it actually does, but it
+could serve as a starting point to someone trying to understand the
+implementation. There is nothing in this section that you need to know
+to use the qpdf library.
+
+``QPDFObject`` is the basic PDF Object class. It is an abstract base
+class from which are derived classes for each type of PDF object.
+Clients do not interact with Objects directly but instead interact with
+``QPDFObjectHandle``.
+
+When the ``QPDF`` class creates a new object, it dynamically allocates
+the appropriate type of ``QPDFObject`` and immediately hands the pointer
+to an instance of ``QPDFObjectHandle``. The parser reads a token from
+the current file position. If the token is a not either a dictionary or
+array opener, an object is immediately constructed from the single token
+and the parser returns. Otherwise, the parser iterates in a special mode
+in which it accumulates objects until it finds a balancing closer.
+During this process, the "``R``" keyword is recognized and an indirect
+``QPDFObjectHandle`` may be constructed.
+
+The ``QPDF::resolve()`` method, which is used to resolve an indirect
+object, may be invoked from the ``QPDFObjectHandle`` class. It first
+checks a cache to see whether this object has already been read. If not,
+it reads the object from the PDF file and caches it. It the returns the
+resulting ``QPDFObjectHandle``. The calling object handle then replaces
+its ``PointerHolder<QDFObject>`` with the one from the newly returned
+``QPDFObjectHandle``. In this way, only a single copy of any direct
+object need exist and clients can access objects transparently without
+knowing caring whether they are direct or indirect objects.
+Additionally, no object is ever read from the file more than once. That
+means that only the portions of the PDF file that are actually needed
+are ever read from the input file, thus allowing the qpdf package to
+take advantage of this important design goal of PDF files.
+
+If the requested object is inside of an object stream, the object stream
+itself is first read into memory. Then the tokenizer reads objects from
+the memory stream based on the offset information stored in the stream.
+Those individual objects are cached, after which the temporary buffer
+holding the object stream contents are discarded. In this way, the first
+time an object in an object stream is requested, all objects in the
+stream are cached.
+
+The following example should clarify how ``QPDF`` processes a simple
+file.
+
+- Client constructs ``QPDF`` ``pdf`` and calls
+ ``pdf.processFile("a.pdf");``.
+
+- The ``QPDF`` class checks the beginning of
+ @1@filename@1@a.pdf@2@filename@2@ for a PDF header. It then reads the
+ cross reference table mentioned at the end of the file, ensuring that
+ it is looking before the last ``%%EOF``. After getting to ``trailer``
+ keyword, it invokes the parser.
+
+- The parser sees "``<<``", so it calls itself recursively in
+ dictionary creation mode.
+
+- In dictionary creation mode, the parser keeps accumulating objects
+ until it encounters "``>>``". Each object that is read is pushed onto
+ a stack. If "``R``" is read, the last two objects on the stack are
+ inspected. If they are integers, they are popped off the stack and
+ their values are used to construct an indirect object handle which is
+ then pushed onto the stack. When "``>>``" is finally read, the stack
+ is converted into a ``QPDF_Dictionary`` which is placed in a
+ ``QPDFObjectHandle`` and returned.
+
+- The resulting dictionary is saved as the trailer dictionary.
+
+- The ``/Prev`` key is searched. If present, ``QPDF`` seeks to that
+ point and repeats except that the new trailer dictionary is not
+ saved. If ``/Prev`` is not present, the initial parsing process is
+ complete.
+
+ If there is an encryption dictionary, the document's encryption
+ parameters are initialized.
+
+- The client requests root object. The ``QPDF`` class gets the value of
+ root key from trailer dictionary and returns it. It is an unresolved
+ indirect ``QPDFObjectHandle``.
+
+- The client requests the ``/Pages`` key from root
+ ``QPDFObjectHandle``. The ``QPDFObjectHandle`` notices that it is
+ indirect so it asks ``QPDF`` to resolve it. ``QPDF`` looks in the
+ object cache for an object with the root dictionary's object ID and
+ generation number. Upon not seeing it, it checks the cross reference
+ table, gets the offset, and reads the object present at that offset.
+ It stores the result in the object cache and returns the cached
+ result. The calling ``QPDFObjectHandle`` replaces its object pointer
+ with the one from the resolved ``QPDFObjectHandle``, verifies that it
+ a valid dictionary object, and returns the (unresolved indirect)
+ ``QPDFObject`` handle to the top of the Pages hierarchy.
+
+ As the client continues to request objects, the same process is
+ followed for each new requested object.
+
+.. _ref.casting:
+
+Casting Policy
+--------------
+
+This section describes the casting policy followed by qpdf's
+implementation. This is no concern to qpdf's end users and largely of no
+concern to people writing code that uses qpdf, but it could be of
+interest to people who are porting qpdf to a new platform or who are
+making modifications to the code.
+
+The C++ code in qpdf is free of old-style casts except where unavoidable
+(e.g. where the old-style cast is in a macro provided by a third-party
+header file). When there is a need for a cast, it is handled, in order
+of preference, by rewriting the code to avoid the need for a cast,
+calling ``const_cast``, calling ``static_cast``, calling
+``reinterpret_cast``, or calling some combination of the above. As a
+last resort, a compiler-specific ``#pragma`` may be used to suppress a
+warning that we don't want to fix. Examples may include suppressing
+warnings about the use of old-style casts in code that is shared between
+C and C++ code.
+
+The ``QIntC`` namespace, provided by
+@1@filename@1@include/qpdf/QIntC.hh@2@filename@2@, implements safe
+functions for converting between integer types. These functions do range
+checking and throw a ``std::range_error``, which is subclass of
+``std::runtime_error``, if conversion from one integer type to another
+results in loss of information. There are many cases in which we have to
+move between different integer types because of incompatible integer
+types used in interoperable interfaces. Some are unavoidable, such as
+moving between sizes and offsets, and others are there because of old
+code that is too in entrenched to be fixable without breaking source
+compatibility and causing pain for users. QPDF is compiled with extra
+warnings to detect conversions with potential data loss, and all such
+cases should be fixed by either using a function from ``QIntC`` or a
+``static_cast``.
+
+When the intention is just to switch the type because of exchanging data
+between incompatible interfaces, use ``QIntC``. This is the usual case.
+However, there are some cases in which we are explicitly intending to
+use the exact same bit pattern with a different type. This is most
+common when switching between signed and unsigned characters. A lot of
+qpdf's code uses unsigned characters internally, but ``std::string`` and
+``char`` are signed. Using ``QIntC::to_char`` would be wrong for
+converting from unsigned to signed characters because a negative
+``char`` value and the corresponding ``unsigned
+ char`` value greater than 127 *mean the same thing*. There are also
+cases in which we use ``static_cast`` when working with bit fields where
+we are not representing a numerical value but rather a bunch of bits
+packed together in some integer type. Also note that ``size_t`` and
+``long`` both typically differ between 32-bit and 64-bit environments,
+so sometimes an explicit cast may not be needed to avoid warnings on one
+platform but may be needed on another. A conversion with ``QIntC``
+should always be used when the types are different even if the
+underlying size is the same. QPDF's CI build builds on 32-bit and 64-bit
+platforms, and the test suite is very thorough, so it is hard to make
+any of the potential errors here without being caught in build or test.
+
+Non-const ``unsigned char*`` is used in the ``Pipeline`` interface. The
+pipeline interface has a ``write`` call that uses ``unsigned
+ char*`` without a ``const`` qualifier. The main reason for this is
+to support pipelines that make calls to third-party libraries, such as
+zlib, that don't include ``const`` in their interfaces. Unfortunately,
+there are many places in the code where it is desirable to have ``const
+ char*`` with pipelines. None of the pipeline implementations in qpdf
+currently modify the data passed to write, and doing so would be counter
+to the intent of ``Pipeline``, but there is nothing in the code to
+prevent this from being done. There are places in the code where
+``const_cast`` is used to remove the const-ness of pointers going into
+``Pipeline``\ s. This could theoretically be unsafe, but there is
+adequate testing to assert that it is safe and will remain safe in
+qpdf's code.
+
+.. _ref.encryption:
+
+Encryption
+----------
+
+Encryption is supported transparently by qpdf. When opening a PDF file,
+if an encryption dictionary exists, the ``QPDF`` object processes this
+dictionary using the password (if any) provided. The primary decryption
+key is computed and cached. No further access is made to the encryption
+dictionary after that time. When an object is read from a file, the
+object ID and generation of the object in which it is contained is
+always known. Using this information along with the stored encryption
+key, all stream and string objects are transparently decrypted. Raw
+encrypted objects are never stored in memory. This way, nothing in the
+library ever has to know or care whether it is reading an encrypted
+file.
+
+An interface is also provided for writing encrypted streams and strings
+given an encryption key. This is used by ``QPDFWriter`` when it rewrites
+encrypted files.
+
+When copying encrypted files, unless otherwise directed, qpdf will
+preserve any encryption in force in the original file. qpdf can do this
+with either the user or the owner password. There is no difference in
+capability based on which password is used. When 40 or 128 bit
+encryption keys are used, the user password can be recovered with the
+owner password. With 256 keys, the user and owner passwords are used
+independently to encrypt the actual encryption key, so while either can
+be used, the owner password can no longer be used to recover the user
+password.
+
+Starting with version 4.0.0, qpdf can read files that are not encrypted
+but that contain encrypted attachments, but it cannot write such files.
+qpdf also requires the password to be specified in order to open the
+file, not just to extract attachments, since once the file is open, all
+decryption is handled transparently. When copying files like this while
+preserving encryption, qpdf will apply the file's encryption to
+everything in the file, not just to the attachments. When decrypting the
+file, qpdf will decrypt the attachments. In general, when copying PDF
+files with multiple encryption formats, qpdf will choose the newest
+format. The only exception to this is that clear-text metadata will be
+preserved as clear-text if it is that way in the original file.
+
+One point of confusion some people have about encrypted PDF files is
+that encryption is not the same as password protection. Password
+protected files are always encrypted, but it is also possible to create
+encrypted files that do not have passwords. Internally, such files use
+the empty string as a password, and most readers try the empty string
+first to see if it works and prompt for a password only if the empty
+string doesn't work. Normally such files have an empty user password and
+a non-empty owner password. In that way, if the file is opened by an
+ordinary reader without specification of password, the restrictions
+specified in the encryption dictionary can be enforced. Most users
+wouldn't even realize such a file was encrypted. Since qpdf always
+ignores the restrictions (except for the purpose of reporting what they
+are), qpdf doesn't care which password you use. QPDF will allow you to
+create PDF files with non-empty user passwords and empty owner
+passwords. Some readers will require a password when you open these
+files, and others will open the files without a password and not enforce
+restrictions. Having a non-empty user password and an empty owner
+password doesn't really make sense because it would mean that opening
+the file with the user password would be more restrictive than not
+supplying a password at all. QPDF also allows you to create PDF files
+with the same password as both the user and owner password. Some readers
+will not ever allow such files to be accessed without restrictions
+because they never try the password as the owner password if it works as
+the user password. Nonetheless, one of the powerful aspects of qpdf is
+that it allows you to finely specify the way encrypted files are
+created, even if the results are not useful to some readers. One use
+case for this would be for testing a PDF reader to ensure that it
+handles odd configurations of input files.
+
+.. _ref.random-numbers:
+
+Random Number Generation
+------------------------
+
+QPDF generates random numbers to support generation of encrypted data.
+Starting in qpdf 10.0.0, qpdf uses the crypto provider as its source of
+random numbers. Older versions used the OS-provided source of secure
+random numbers or, if allowed at build time, insecure random numbers
+from stdlib. Starting with version 5.1.0, you can disable use of
+OS-provided secure random numbers at build time. This is especially
+useful on Windows if you want to avoid a dependency on Microsoft's
+cryptography API. You can also supply your own random data provider. For
+details on how to do this, please refer to the top-level README.md file
+in the source distribution and to comments in
+@1@filename@1@QUtil.hh@2@filename@2@.
+
+.. _ref.adding-and-remove-pages:
+
+Adding and Removing Pages
+-------------------------
+
+While qpdf's API has supported adding and modifying objects for some
+time, version 3.0 introduces specific methods for adding and removing
+pages. These are largely convenience routines that handle two tricky
+issues: pushing inheritable resources from the ``/Pages`` tree down to
+individual pages and manipulation of the ``/Pages`` tree itself. For
+details, see ``addPage`` and surrounding methods in
+@1@filename@1@QPDF.hh@2@filename@2@.
+
+.. _ref.reserved-objects:
+
+Reserving Object Numbers
+------------------------
+
+Version 3.0 of qpdf introduced the concept of reserved objects. These
+are seldom needed for ordinary operations, but there are cases in which
+you may want to add a series of indirect objects with references to each
+other to a ``QPDF`` object. This causes a problem because you can't
+determine the object ID that a new indirect object will have until you
+add it to the ``QPDF`` object with ``QPDF::makeIndirectObject``. The
+only way to add two mutually referential objects to a ``QPDF`` object
+prior to version 3.0 would be to add the new objects first and then make
+them refer to each other after adding them. Now it is possible to create
+a @1@firstterm@1@reserved object@2@firstterm@2@ using
+``QPDFObjectHandle::newReserved``. This is an indirect object that stays
+"unresolved" even if it is queried for its type. So now, if you want to
+create a set of mutually referential objects, you can create
+reservations for each one of them and use those reservations to
+construct the references. When finished, you can call
+``QPDF::replaceReserved`` to replace the reserved objects with the real
+ones. This functionality will never be needed by most applications, but
+it is used internally by QPDF when copying objects from other PDF files,
+as discussed in `Copying Objects From Other PDF
+Files <#ref.foreign-objects>`__. For an example of how to use reserved
+objects, search for ``newReserved`` in
+@1@filename@1@test_driver.cc@2@filename@2@ in qpdf's sources.
+
+.. _ref.foreign-objects:
+
+Copying Objects From Other PDF Files
+------------------------------------
+
+Version 3.0 of qpdf introduced the ability to copy objects into a
+``QPDF`` object from a different ``QPDF`` object, which we refer to as
+@1@firstterm@1@foreign objects@2@firstterm@2@. This allows arbitrary
+merging of PDF files. The "from" ``QPDF`` object must remain valid after
+the copy as discussed in the note below. The
+@1@command@1@qpdf@2@command@2@ command-line tool provides limited
+support for basic page selection, including merging in pages from other
+files, but the library's API makes it possible to implement arbitrarily
+complex merging operations. The main method for copying foreign objects
+is ``QPDF::copyForeignObject``. This takes an indirect object from
+another ``QPDF`` and copies it recursively into this object while
+preserving all object structure, including circular references. This
+means you can add a direct object that you create from scratch to a
+``QPDF`` object with ``QPDF::makeIndirectObject``, and you can add an
+indirect object from another file with ``QPDF::copyForeignObject``. The
+fact that ``QPDF::makeIndirectObject`` does not automatically detect a
+foreign object and copy it is an explicit design decision. Copying a
+foreign object seems like a sufficiently significant thing to do that it
+should be done explicitly.
+
+The other way to copy foreign objects is by passing a page from one
+``QPDF`` to another by calling ``QPDF::addPage``. In contrast to
+``QPDF::makeIndirectObject``, this method automatically distinguishes
+between indirect objects in the current file, foreign objects, and
+direct objects.
+
+Please note: when you copy objects from one ``QPDF`` to another, the
+source ``QPDF`` object must remain valid until you have finished with
+the destination object. This is because the original object is still
+used to retrieve any referenced stream data from the copied object.
+
+.. _ref.rewriting:
+
+Writing PDF Files
+-----------------
+
+The qpdf library supports file writing of ``QPDF`` objects to PDF files
+through the ``QPDFWriter`` class. The ``QPDFWriter`` class has two
+writing modes: one for non-linearized files, and one for linearized
+files. See `Linearization <#ref.linearization>`__ for a description of
+linearization is implemented. This section describes how we write
+non-linearized files including the creation of QDF files (see `QDF
+Mode <#ref.qdf>`__.
+
+This outline was written prior to implementation and is not exactly
+accurate, but it provides a correct "notional" idea of how writing
+works. Look at the code in ``QPDFWriter`` for exact details.
+
+- Initialize state:
+
+ - next object number = 1
+
+ - object queue = empty
+
+ - renumber table: old object id/generation to new id/0 = empty
+
+ - xref table: new id -> offset = empty
+
+- Create a QPDF object from a file.
+
+- Write header for new PDF file.
+
+- Request the trailer dictionary.
+
+- For each value that is an indirect object, grab the next object
+ number (via an operation that returns and increments the number). Map
+ object to new number in renumber table. Push object onto queue.
+
+- While there are more objects on the queue:
+
+ - Pop queue.
+
+ - Look up object's new number *n* in the renumbering table.
+
+ - Store current offset into xref table.
+
+ - Write ``@1@replaceable@1@n@2@replaceable@2@ 0 obj``.
+
+ - If object is null, whether direct or indirect, write out null,
+ thus eliminating unresolvable indirect object references.
+
+ - If the object is a stream stream, write stream contents, piped
+ through any filters as required, to a memory buffer. Use this
+ buffer to determine the stream length.
+
+ - If object is not a stream, array, or dictionary, write out its
+ contents.
+
+ - If object is an array or dictionary (including stream), traverse
+ its elements (for array) or values (for dictionaries), handling
+ recursive dictionaries and arrays, looking for indirect objects.
+ When an indirect object is found, if it is not resolvable, ignore.
+ (This case is handled when writing it out.) Otherwise, look it up
+ in the renumbering table. If not found, grab the next available
+ object number, assign to the referenced object in the renumbering
+ table, and push the referenced object onto the queue. As a special
+ case, when writing out a stream dictionary, replace length,
+ filters, and decode parameters as required.
+
+ Write out dictionary or array, replacing any unresolvable indirect
+ object references with null (pdf spec says reference to
+ non-existent object is legal and resolves to null) and any
+ resolvable ones with references to the renumbered objects.
+
+ - If the object is a stream, write ``stream\n``, the stream contents
+ (from the memory buffer), and ``\nendstream\n``.
+
+ - When done, write ``endobj``.
+
+Once we have finished the queue, all referenced objects will have been
+written out and all deleted objects or unreferenced objects will have
+been skipped. The new cross-reference table will contain an offset for
+every new object number from 1 up to the number of objects written. This
+can be used to write out a new xref table. Finally we can write out the
+trailer dictionary with appropriately computed /ID (see spec, 8.3, File
+Identifiers), the cross reference table offset, and ``%%EOF``.
+
+.. _ref.filtered-streams:
+
+Filtered Streams
+----------------
+
+Support for streams is implemented through the ``Pipeline`` interface
+which was designed for this package.
+
+When reading streams, create a series of ``Pipeline`` objects. The
+``Pipeline`` abstract base requires implementation ``write()`` and
+``finish()`` and provides an implementation of ``getNext()``. Each
+pipeline object, upon receiving data, does whatever it is going to do
+and then writes the data (possibly modified) to its successor.
+Alternatively, a pipeline may be an end-of-the-line pipeline that does
+something like store its output to a file or a memory buffer ignoring a
+successor. For additional details, look at
+@1@filename@1@Pipeline.hh@2@filename@2@.
+
+``QPDF`` can read raw or filtered streams. When reading a filtered
+stream, the ``QPDF`` class creates a ``Pipeline`` object for one of each
+appropriate filter object and chains them together. The last filter
+should write to whatever type of output is required. The ``QPDF`` class
+has an interface to write raw or filtered stream contents to a given
+pipeline.
+
+.. _ref.object-accessors:
+
+Object Accessor Methods
+-----------------------
+
+@1@comment: This section is referenced in QPDFObjectHandle.hh @1@
+
+For general information about how to access instances of
+``QPDFObjectHandle``, please see the comments in
+@1@filename@1@QPDFObjectHandle.hh@2@filename@2@. Search for "Accessor
+methods". This section provides a more in-depth discussion of the
+behavior and the rationale for the behavior.
+
+*Why were type errors made into warnings?* When type checks were
+introduced into qpdf in the early days, it was expected that type errors
+would only occur as a result of programmer error. However, in practice,
+type errors would occur with malformed PDF files because of assumptions
+made in code, including code within the qpdf library and code written by
+library users. The most common case would be chaining calls to
+``getKey()`` to access keys deep within a dictionary. In many cases,
+qpdf would be able to recover from these situations, but the old
+behavior often resulted in crashes rather than graceful recovery. For
+this reason, the errors were changed to warnings.
+
+*Why even warn about type errors when the user can't usually do anything
+about them?* Type warnings are extremely valuable during development.
+Since it's impossible to catch at compile time things like typos in
+dictionary key names or logic errors around what the structure of a PDF
+file might be, the presence of type warnings can save lots of developer
+time. They have also proven useful in exposing issues in qpdf itself
+that would have otherwise gone undetected.
+
+*Can there be a type-safe ``QPDFObjectHandle``?* It would be great if
+``QPDFObjectHandle`` could be more strongly typed so that you'd have to
+have check that something was of a particular type before calling
+type-specific accessor methods. However, implementing this at this stage
+of the library's history would be quite difficult, and it would make a
+the common pattern of drilling into an object no longer work. While it
+would be possible to have a parallel interface, it would create a lot of
+extra code. If qpdf were written in a language like rust, an interface
+like this would make a lot of sense, but, for a variety of reasons, the
+qpdf API is consistent with other APIs of its time, relying on exception
+handling to catch errors. The underlying PDF objects are inherently not
+type-safe. Forcing stronger type safety in ``QPDFObjectHandle`` would
+ultimately cause a lot more code to have to be written and would like
+make software that uses qpdf more brittle, and even so, checks would
+have to occur at runtime.
+
+*Why do type errors sometimes raise exceptions?* The way warnings work
+in qpdf requires a ``QPDF`` object to be associated with an object
+handle for a warning to be issued. It would be nice if this could be
+fixed, but it would require major changes to the API. Rather than
+throwing away these conditions, we convert them to exceptions. It's not
+that bad though. Since any object handle that was read from a file has
+an associated ``QPDF`` object, it would only be type errors on objects
+that were created explicitly that would cause exceptions, and in that
+case, type errors are much more likely to be the result of a coding
+error than invalid input.
+
+*Why does the behavior of a type exception differ between the C and C++
+API?* There is no way to throw and catch exceptions in C short of
+something like ``setjmp`` and ``longjmp``, and that approach is not
+portable across language barriers. Since the C API is often used from
+other languages, it's important to keep things as simple as possible.
+Starting in qpdf 10.5, exceptions that used to crash code using the C
+API will be written to stderr by default, and it is possible to register
+an error handler. There's no reason that the error handler can't
+simulate exception handling in some way, such as by using ``setjmp`` and
+``longjmp`` or by setting some variable that can be checked after
+library calls are made. In retrospect, it might have been better if the
+C API object handle methods returned error codes like the other methods
+and set return values in passed-in pointers, but this would complicate
+both the implementation and the use of the library for a case that is
+actually quite rare and largely avoidable.
+
+.. _ref.linearization:
+
+Linearization
+=============
+
+This chapter describes how ``QPDF`` and ``QPDFWriter`` implement
+creation and processing of linearized PDFS.
+
+.. _ref.linearization-strategy:
+
+Basic Strategy for Linearization
+--------------------------------
+
+To avoid the incestuous problem of having the qpdf library validate its
+own linearized files, we have a special linearized file checking mode
+which can be invoked via @1@command@1@qpdf
+--check-linearization@2@command@2@ (or @1@command@1@qpdf
+--check@2@command@2@). This mode reads the linearization parameter
+dictionary and the hint streams and validates that object ordering,
+parameters, and hint stream contents are correct. The validation code
+was first tested against linearized files created by external tools
+(Acrobat and pdlin) and then used to validate files created by
+``QPDFWriter`` itself.
+
+.. _ref.linearized.preparation:
+
+Preparing For Linearization
+---------------------------
+
+Before creating a linearized PDF file from any other PDF file, the PDF
+file must be altered such that all page attributes are propagated down
+to the page level (and not inherited from parents in the ``/Pages``
+tree). We also have to know which objects refer to which other objects,
+being concerned with page boundaries and a few other cases. We refer to
+this part of preparing the PDF file as
+@1@firstterm@1@optimization@2@firstterm@2@, discussed in
+`Optimization <#ref.optimization>`__. Note the, in this context, the
+term @1@firstterm@1@optimization@2@firstterm@2@ is a qpdf term, and the
+term @1@firstterm@1@linearization@2@firstterm@2@ is a term from the PDF
+specification. Do not be confused by the fact that many applications
+refer to linearization as optimization or web optimization.
+
+When creating linearized PDF files from optimized PDF files, there are
+really only a few issues that need to be dealt with:
+
+- Creation of hints tables
+
+- Placing objects in the correct order
+
+- Filling in offsets and byte sizes
+
+.. _ref.optimization:
+
+Optimization
+------------
+
+In order to perform various operations such as linearization and
+splitting files into pages, it is necessary to know which objects are
+referenced by which pages, page thumbnails, and root and trailer
+dictionary keys. It is also necessary to ensure that all page-level
+attributes appear directly at the page level and are not inherited from
+parents in the pages tree.
+
+We refer to the process of enforcing these constraints as
+@1@firstterm@1@optimization@2@firstterm@2@. As mentioned above, note
+that some applications refer to linearization as optimization. Although
+this optimization was initially motivated by the need to create
+linearized files, we are using these terms separately.
+
+PDF file optimization is implemented in the
+@1@filename@1@QPDF_optimization.cc@2@filename@2@ source file. That file
+is richly commented and serves as the primary reference for the
+optimization process.
+
+After optimization has been completed, the private member variables
+``obj_user_to_objects`` and ``object_to_obj_users`` in ``QPDF`` have
+been populated. Any object that has more than one value in the
+``object_to_obj_users`` table is shared. Any object that has exactly one
+value in the ``object_to_obj_users`` table is private. To find all the
+private objects in a page or a trailer or root dictionary key, one
+merely has make this determination for each element in the
+``obj_user_to_objects`` table for the given page or key.
+
+Note that pages and thumbnails have different object user types, so the
+above test on a page will not include objects referenced by the page's
+thumbnail dictionary and nothing else.
+
+.. _ref.linearization.writing:
+
+Writing Linearized Files
+------------------------
+
+We will create files with only primary hint streams. We will never write
+overflow hint streams. (As of PDF version 1.4, Acrobat doesn't either,
+and they are never necessary.) The hint streams contain offset
+information to objects that point to where they would be if the hint
+stream were not present. This means that we have to calculate all object
+positions before we can generate and write the hint table. This means
+that we have to generate the file in two passes. To make this reliable,
+``QPDFWriter`` in linearization mode invokes exactly the same code twice
+to write the file to a pipeline.
+
+In the first pass, the target pipeline is a count pipeline chained to a
+discard pipeline. The count pipeline simply passes its data through to
+the next pipeline in the chain but can return the number of bytes passed
+through it at any intermediate point. The discard pipeline is an end of
+line pipeline that just throws its data away. The hint stream is not
+written and dummy values with adequate padding are stored in the first
+cross reference table, linearization parameter dictionary, and /Prev key
+of the first trailer dictionary. All the offset, length, object
+renumbering information, and anything else we need for the second pass
+is stored.
+
+At the end of the first pass, this information is passed to the ``QPDF``
+class which constructs a compressed hint stream in a memory buffer and
+returns it. ``QPDFWriter`` uses this information to write a complete
+hint stream object into a memory buffer. At this point, the length of
+the hint stream is known.
+
+In the second pass, the end of the pipeline chain is a regular file
+instead of a discard pipeline, and we have known values for all the
+offsets and lengths that we didn't have in the first pass. We have to
+adjust offsets that appear after the start of the hint stream by the
+length of the hint stream, which is known. Anything that is of variable
+length is padded, with the padding code surrounding any writing code
+that differs in the two passes. This ensures that changes to the way
+things are represented never results in offsets that were gathered
+during the first pass becoming incorrect for the second pass.
+
+Using this strategy, we can write linearized files to a non-seekable
+output stream with only a single pass to disk or wherever the output is
+going.
+
+.. _ref.linearization-data:
+
+Calculating Linearization Data
+------------------------------
+
+Once a file is optimized, we have information about which objects access
+which other objects. We can then process these tables to decide which
+part (as described in "Linearized PDF Document Structure" in the PDF
+specification) each object is contained within. This tells us the exact
+order in which objects are written. The ``QPDFWriter`` class asks for
+this information and enqueues objects for writing in the proper order.
+It also turns on a check that causes an exception to be thrown if an
+object is encountered that has not already been queued. (This could
+happen only if there were a bug in the traversal code used to calculate
+the linearization data.)
+
+.. _ref.linearization-issues:
+
+Known Issues with Linearization
+-------------------------------
+
+There are a handful of known issues with this linearization code. These
+issues do not appear to impact the behavior of linearized files which
+still work as intended: it is possible for a web browser to begin to
+display them before they are fully downloaded. In fact, it seems that
+various other programs that create linearized files have many of these
+same issues. These items make reference to terminology used in the
+linearization appendix of the PDF specification.
+
+- Thread Dictionary information keys appear in part 4 with the rest of
+ Threads instead of in part 9. Objects in part 9 are not grouped
+ together functionally.
+
+- We are not calculating numerators for shared object positions within
+ content streams or interleaving them within content streams.
+
+- We generate only page offset, shared object, and outline hint tables.
+ It would be relatively easy to add some additional tables. We gather
+ most of the information needed to create thumbnail hint tables. There
+ are comments in the code about this.
+
+.. _ref.linearization-debugging:
+
+Debugging Note
+--------------
+
+The @1@command@1@qpdf --show-linearization@2@command@2@ command can show
+the complete contents of linearization hint streams. To look at the raw
+data, you can extract the filtered contents of the linearization hint
+tables using @1@command@1@qpdf --show-object=n
+--filtered-stream-data@2@command@2@. Then, to convert this into a bit
+stream (since linearization tables are bit streams written without
+regard to byte boundaries), you can pipe the resulting data through the
+following perl code:
+
+::
+
+ use bytes;
+ binmode STDIN;
+ undef $/;
+ my $a = <STDIN>;
+ my @ch = split(//, $a);
+ map { printf("%08b", ord($_)) } @ch;
+ print "\n";
+
+.. _ref.object-and-xref-streams:
+
+Object and Cross-Reference Streams
+==================================
+
+This chapter provides information about the implementation of object
+stream and cross-reference stream support in qpdf.
+
+.. _ref.object-streams:
+
+Object Streams
+--------------
+
+Object streams can contain any regular object except the following:
+
+- stream objects
+
+- objects with generation > 0
+
+- the encryption dictionary
+
+- objects containing the /Length of another stream
+
+In addition, Adobe reader (at least as of version 8.0.0) appears to not
+be able to handle having the document catalog appear in an object stream
+if the file is encrypted, though this is not specifically disallowed by
+the specification.
+
+There are additional restrictions for linearized files. See
+`Implications for Linearized
+Files <#ref.object-streams-linearization>`__\ for details.
+
+The PDF specification refers to objects in object streams as "compressed
+objects" regardless of whether the object stream is compressed.
+
+The generation number of every object in an object stream must be zero.
+It is possible to delete and replace an object in an object stream with
+a regular object.
+
+The object stream dictionary has the following keys:
+
+- ``/N``: number of objects
+
+- ``/First``: byte offset of first object
+
+- ``/Extends``: indirect reference to stream that this extends
+
+Stream collections are formed with ``/Extends``. They must form a
+directed acyclic graph. These can be used for semantic information and
+are not meaningful to the PDF document's syntactic structure. Although
+qpdf preserves stream collections, it never generates them and doesn't
+make use of this information in any way.
+
+The specification recommends limiting the number of objects in object
+stream for efficiency in reading and decoding. Acrobat 6 uses no more
+than 100 objects per object stream for linearized files and no more 200
+objects per stream for non-linearized files. ``QPDFWriter``, in object
+stream generation mode, never puts more than 100 objects in an object
+stream.
+
+Object stream contents consists of *N* pairs of integers, each of which
+is the object number and the byte offset of the object relative to the
+first object in the stream, followed by the objects themselves,
+concatenated.
+
+.. _ref.xref-streams:
+
+Cross-Reference Streams
+-----------------------
+
+For non-hybrid files, the value following ``startxref`` is the byte
+offset to the xref stream rather than the word ``xref``.
+
+For hybrid files (files containing both xref tables and cross-reference
+streams), the xref table's trailer dictionary contains the key
+``/XRefStm`` whose value is the byte offset to a cross-reference stream
+that supplements the xref table. A PDF 1.5-compliant application should
+read the xref table first. Then it should replace any object that it has
+already seen with any defined in the xref stream. Then it should follow
+any ``/Prev`` pointer in the original xref table's trailer dictionary.
+The specification is not clear about what should be done, if anything,
+with a ``/Prev`` pointer in the xref stream referenced by an xref table.
+The ``QPDF`` class ignores it, which is probably reasonable since, if
+this case were to appear for any sensible PDF file, the previous xref
+table would probably have a corresponding ``/XRefStm`` pointer of its
+own. For example, if a hybrid file were appended, the appended section
+would have its own xref table and ``/XRefStm``. The appended xref table
+would point to the previous xref table which would point the
+``/XRefStm``, meaning that the new ``/XRefStm`` doesn't have to point to
+it.
+
+Since xref streams must be read very early, they may not be encrypted,
+and the may not contain indirect objects for keys required to read them,
+which are these:
+
+- ``/Type``: value ``/XRef``
+
+- ``/Size``: value *n+1*: where *n* is highest object number (same as
+ ``/Size`` in the trailer dictionary)
+
+- ``/Index`` (optional): value
+ ``[@1@replaceable@1@n count@2@replaceable@2@ ...]`` used to determine
+ which objects' information is stored in this stream. The default is
+ ``[0 /Size]``.
+
+- ``/Prev``: value @1@replaceable@1@offset@2@replaceable@2@: byte
+ offset of previous xref stream (same as ``/Prev`` in the trailer
+ dictionary)
+
+- ``/W [...]``: sizes of each field in the xref table
+
+The other fields in the xref stream, which may be indirect if desired,
+are the union of those from the xref table's trailer dictionary.
+
+.. _ref.xref-stream-data:
+
+Cross-Reference Stream Data
+~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The stream data is binary and encoded in big-endian byte order. Entries
+are concatenated, and each entry has a length equal to the total of the
+entries in ``/W`` above. Each entry consists of one or more fields, the
+first of which is the type of the field. The number of bytes for each
+field is given by ``/W`` above. A 0 in ``/W`` indicates that the field
+is omitted and has the default value. The default value for the field
+type is "``1``". All other default values are "``0``".
+
+PDF 1.5 has three field types:
+
+- 0: for free objects. Format: ``0 obj
+ next-generation``, same as the free table in a traditional
+ cross-reference table
+
+- 1: regular non-compressed object. Format: ``1 offset
+ generation``
+
+- 2: for objects in object streams. Format: ``2
+ object-stream-number index``, the number of object stream
+ containing the object and the index within the object stream of the
+ object.
+
+It seems standard to have the first entry in the table be ``0 0 0``
+instead of ``0 0 ffff`` if there are no deleted objects.
+
+.. _ref.object-streams-linearization:
+
+Implications for Linearized Files
+---------------------------------
+
+For linearized files, the linearization dictionary, document catalog,
+and page objects may not be contained in object streams.
+
+Objects stored within object streams are given the highest range of
+object numbers within the main and first-page cross-reference sections.
+
+It is okay to use cross-reference streams in place of regular xref
+tables. There are on special considerations.
+
+Hint data refers to object streams themselves, not the objects in the
+streams. Shared object references should also be made to the object
+streams. There are no reference in any hint tables to the object numbers
+of compressed objects (objects within object streams).
+
+When numbering objects, all shared objects within both the first and
+second halves of the linearized files must be numbered consecutively
+after all normal uncompressed objects in that half.
+
+.. _ref.object-stream-implementation:
+
+Implementation Notes
+--------------------
+
+There are three modes for writing object streams:
+@1@option@1@disable@2@option@2@, @1@option@1@preserve@2@option@2@, and
+@1@option@1@generate@2@option@2@. In disable mode, we do not generate
+any object streams, and we also generate an xref table rather than xref
+streams. This can be used to generate PDF files that are viewable with
+older readers. In preserve mode, we write object streams such that
+written object streams contain the same objects and ``/Extends``
+relationships as in the original file. This is equal to disable if the
+file has no object streams. In generate, we create object streams
+ourselves by grouping objects that are allowed in object streams
+together in sets of no more than 100 objects. We also ensure that the
+PDF version is at least 1.5 in generate mode, but we preserve the
+version header in the other modes. The default is
+@1@option@1@preserve@2@option@2@.
+
+We do not support creation of hybrid files. When we write files, even in
+preserve mode, we will lose any xref tables and merge any appended
+sections.
+
+.. _ref.release-notes:
+
+Release Notes
+=============
+
+For a detailed list of changes, please see the file
+@1@filename@1@ChangeLog@2@filename@2@ in the source distribution.
+
+10.5.0: XXX Month dd, YYYY
+ - Library Enhancements
+
+ - Since qpdf version 8, using object accessor methods on an
+ instance of ``QPDFObjectHandle`` may create warnings if the
+ object is not of the expected type. These warnings now have an
+ error code of ``qpdf_e_object`` instead of
+ ``qpdf_e_damaged_pdf``. Also, comments have been added to
+ @1@filename@1@QPDFObjectHandle.hh@2@filename@2@ to explain in
+ more detail what the behavior is. See `Object Accessor
+ Methods <#ref.object-accessors>`__ for a more in-depth
+ discussion.
+
+ - Overhaul error handling for the object handle functions in the
+ C API. See comments in the "Object handling" section of
+ @1@filename@1@include/qpdf/qpdf-c.h@2@filename@2@ for details.
+ In particular, exceptions thrown by the underlying C++ code
+ when calling object accessors are caught and converted into
+ errors. The errors can be trapped by registering an error
+ handler with ``qpdf_register_oh_error_handler`` or will be
+ written to stderr if no handler is registered.
+
+ - Add ``qpdf_get_last_string_length`` to the C API to get the
+ length of the last string that was returned. This is needed to
+ handle strings that contain embedded null characters.
+
+ - Add ``qpdf_oh_is_initialized`` and
+ ``qpdf_oh_new_uninitialized`` to the C API to make it possible
+ to work with uninitialized objects.
+
+ - Add ``qpdf_oh_new_object`` to the C API. This allows you to
+ clone an object handle.
+
+ - Add ``qpdf_get_object_by_id``, ``qpdf_make_indirect_object``,
+ and ``qpdf_replace_object``, exposing the corresponding methods
+ in ``QPDF`` and ``QPDFObjectHandle``.
+
+10.4.0: November 16, 2021
+ - Handling of Weak Cryptography Algorithms
+
+ - From the qpdf CLI, the
+ @1@option@1@--allow-weak-crypto@2@option@2@ is now required to
+ suppress a warning when explicitly creating PDF files using RC4
+ encryption. While qpdf will always retain the ability to read
+ and write such files, doing so will require explicit
+ acknowledgment moving forward. For qpdf 10.4, this change only
+ affects the command-line tool. Starting in qpdf 11, there will
+ be small API changes to require explicit acknowledgment in
+ those cases as well. For additional information, see `Weak
+ Cryptography <#ref.weak-crypto>`__.
+
+ - Bug Fixes
+
+ - Fix potential bounds error when handling shell completion that
+ could occur when given bogus input.
+
+ - Properly handle overlay/underlay on completely empty pages
+ (with no resource dictionary).
+
+ - Fix crash that could occur under certain conditions when using
+ @1@option@1@--pages@2@option@2@ with files that had form
+ fields.
+
+ - Library Enhancements
+
+ - Make ``QPDF::findPage`` functions public.
+
+ - Add methods to ``Pl_Flate`` to be able to receive warnings on
+ certain recoverable conditions.
+
+ - Add an extra check to the library to detect when foreign
+ objects are inserted directly (instead of using
+ ``QPDF::copyForeignObject``) at the time of insertion rather
+ than when the file is written. Catching the error sooner makes
+ it much easier to locate the incorrect code.
+
+ - CLI Enhancements
+
+ - Improve diagnostics around parsing
+ @1@option@1@--pages@2@option@2@ command-line options
+
+ - Packaging Changes
+
+ - The Windows binary distribution is now built with crypto
+ provided by OpenSSL 3.0.
+
+10.3.2: May 8, 2021
+ - Bug Fixes
+
+ - When generating a file while preserving object streams,
+ unreferenced objects are correctly removed unless
+ @1@option@1@--preserve-unreferenced@2@option@2@ is specified.
+
+ - Library Enhancements
+
+ - When adding a page that already exists, make a shallow copy
+ instead of throwing an exception. This makes the library
+ behavior consistent with the CLI behavior. See
+ @1@filename@1@ChangeLog@2@filename@2@ for additional notes.
+
+10.3.1: March 11, 2021
+ - Bug Fixes
+
+ - Form field copying failed on files where /DR was a direct
+ object in the document-level form dictionary.
+
+10.3.0: March 4, 2021
+ - Bug Fixes
+
+ - The code for handling form fields when copying pages from
+ 10.2.0 was not quite right and didn't work in a number of
+ situations, such as when the same page was copied multiple
+ times or when there were conflicting resource or field names
+ across multiple copies. The 10.3.0 code has been much more
+ thoroughly tested with more complex cases and with a multitude
+ of readers and should be much closer to correct. The 10.2.0
+ code worked well enough for page splitting or for copying pages
+ with form fields into documents that didn't already have them
+ but was still not quite correct in handling of field-level
+ resources.
+
+ - When ``QPDF::replaceObject`` or ``QPDF::swapObjects`` is
+ called, existing ``QPDFObjectHandle`` instances no longer point
+ to the old objects. The next time they are accessed, they
+ automatically notice the change to the underlying object and
+ update themselves. This resolves a very longstanding source of
+ confusion, albeit in a very rarely used method call.
+
+ - Fix form field handling code to look for default appearances,
+ quadding, and default resources in the right places. The code
+ was not looking for things in the document-level interactive
+ form dictionary that it was supposed to be finding there. This
+ required adding a few new methods to
+ ``QPDFFormFieldObjectHelper``.
+
+ - Library Enhancements
+
+ - Reworked the code that handles copying annotations and form
+ fields during page operations. There were additional methods
+ added to the public API from 10.2.0 and a one deprecation of a
+ method added in 10.2.0. The majority of the API changes are in
+ methods most people would never call and that will hopefully be
+ superseded by higher-level interfaces for handling page copies.
+ Please see the @1@filename@1@ChangeLog@2@filename@2@ file for
+ details.
+
+ - The method ``QPDF::numWarnings`` was added so that you can tell
+ whether any warnings happened during a specific block of code.
+
+10.2.0: February 23, 2021
+ - CLI Behavior Changes
+
+ - Operations that work on combining pages are much better about
+ protecting form fields. In particular,
+ @1@option@1@--split-pages@2@option@2@ and
+ @1@option@1@--pages@2@option@2@ now preserve interaction form
+ functionality by copying the relevant form field information
+ from the original files. Additionally, if you use
+ @1@option@1@--pages@2@option@2@ to select only some pages from
+ the original input file, unused form fields are removed, which
+ prevents lots of unused annotations from being retained.
+
+ - By default, @1@command@1@qpdf@2@command@2@ no longer allows
+ creation of encrypted PDF files whose user password is
+ non-empty and owner password is empty when a 256-bit key is in
+ use. The @1@option@1@--allow-insecure@2@option@2@ option,
+ specified inside the @1@option@1@--encrypt@2@option@2@ options,
+ allows creation of such files. Behavior changes in the CLI are
+ avoided when possible, but an exception was made here because
+ this is security-related. qpdf must always allow creation of
+ weird files for testing purposes, but it should not default to
+ letting users unknowingly create insecure files.
+
+ - Library Behavior Changes
+
+ - Note: the changes in this section cause differences in output
+ in some cases. These differences change the syntax of the PDF
+ but do not change the semantics (meaning). I make a strong
+ effort to avoid gratuitous changes in qpdf's output so that
+ qpdf changes don't break people's tests. In this case, the
+ changes significantly improve the readability of the generated
+ PDF and don't affect any output that's generated by simple
+ transformation. If you are annoyed by having to update test
+ files, please rest assured that changes like this have been and
+ will continue to be rare events.
+
+ - ``QPDFObjectHandle::newUnicodeString`` now uses whichever of
+ ASCII, PDFDocEncoding, of UTF-16 is sufficient to encode all
+ the characters in the string. This reduces needless encoding in
+ UTF-16 of strings that can be encoded in ASCII. This change may
+ cause qpdf to generate different output than before when form
+ field values are set using ``QPDFFormFieldObjectHelper`` but
+ does not change the meaning of the output.
+
+ - The code that places form XObjects and also the code that
+ flattens rotations trim trailing zeroes from real numbers that
+ they calculate. This causes slight (but semantically
+ equivalent) differences in generated appearance streams and
+ form XObject invocations in overlay/underlay code or in user
+ code that calls the methods that place form XObjects on a page.
+
+ - CLI Enhancements
+
+ - Add new command line options for listing, saving, adding,
+ removing, and and copying file attachments. See `Embedded
+ Files/Attachments Options <#ref.attachments>`__ for details.
+
+ - Page splitting and merging operations, as well as
+ @1@option@1@--flatten-rotation@2@option@2@, are better behaved
+ with respect to annotations and interactive form fields. In
+ most cases, interactive form field functionality and proper
+ formatting and functionality of annotations is preserved by
+ these operations. There are still some cases that aren't
+ perfect, such as when functionality of annotations depends on
+ document-level data that qpdf doesn't yet understand or when
+ there are problems with referential integrity among form fields
+ and annotations (e.g., when a single form field object or its
+ associated annotations are shared across multiple pages, a case
+ that is out of spec but that works in most viewers anyway).
+
+ - The option
+ @1@option@1@--password-file=@1@replaceable@1@filename@2@replaceable@2@@2@option@2@
+ can now be used to read the decryption password from a file.
+ You can use ``-`` as the file name to read the password from
+ standard input. This is an easier/more obvious way to read
+ passwords from files or standard input than using
+ @1@option@1@@file@2@option@2@ for this purpose.
+
+ - Add some information about attachments to the json output, and
+ added ``attachments`` as an additional json key. The
+ information included here is limited to the preferred name and
+ content stream and a reference to the file spec object. This is
+ enough detail for clients to avoid the hassle of navigating a
+ name tree and provides what is needed for basic enumeration and
+ extraction of attachments. More detailed information can be
+ obtained by following the reference to the file spec object.
+
+ - Add numeric option to @1@option@1@--collate@2@option@2@. If
+ @1@option@1@--collate=@1@replaceable@1@n@2@replaceable@2@@2@option@2@
+ is given, take pages in groups of
+ @1@replaceable@1@n@2@replaceable@2@ from the given files.
+
+ - It is now valid to provide @1@option@1@--rotate=0@2@option@2@
+ to clear rotation from a page.
+
+ - Library Enhancements
+
+ - This release includes numerous additions to the API. Not all
+ changes are listed here. Please see the
+ @1@filename@1@ChangeLog@2@filename@2@ file in the source
+ distribution for a comprehensive list. Highlights appear below.
+
+ - Add ``QPDFObjectHandle::ditems()`` and
+ ``QPDFObjectHandle::aitems()`` that enable C++-style iteration,
+ including range-for iteration, over dictionary and array
+ QPDFObjectHandles. See comments in
+ @1@filename@1@include/qpdf/QPDFObjectHandle.hh@2@filename@2@
+ and
+ @1@filename@1@examples/pdf-name-number-tree.cc@2@filename@2@
+ for details.
+
+ - Add ``QPDFObjectHandle::copyStream`` for making a copy of a
+ stream within the same ``QPDF`` instance.
+
+ - Add new helper classes for supporting file attachments, also
+ known as embedded files. New classes are
+ ``QPDFEmbeddedFileDocumentHelper``,
+ ``QPDFFileSpecObjectHelper``, and ``QPDFEFStreamObjectHelper``.
+ See their respective headers for details and
+ @1@filename@1@examples/pdf-attach-file.cc@2@filename@2@ for an
+ example.
+
+ - Add a version of ``QPDFObjectHandle::parse`` that takes a
+ ``QPDF`` pointer as context so that it can parse strings
+ containing indirect object references. This is illustrated in
+ @1@filename@1@examples/pdf-attach-file.cc@2@filename@2@.
+
+ - Re-implement ``QPDFNameTreeObjectHelper`` and
+ ``QPDFNumberTreeObjectHelper`` to be more efficient, add an
+ iterator-based API, give them the capability to repair broken
+ trees, and create methods for modifying the trees. With this
+ change, qpdf has a robust read/write implementation of name and
+ number trees.
+
+ - Add new versions of ``QPDFObjectHandle::replaceStreamData``
+ that take ``std::function`` objects for cases when you need
+ something between a static string and a full-fledged
+ StreamDataProvider. Using this with ``QUtil::file_provider`` is
+ a very easy way to create a stream from the contents of a file.
+
+ - The ``QPDFMatrix`` class, formerly a private, internal class,
+ has been added to the public API. See
+ @1@filename@1@include/qpdf/QPDFMatrix.hh@2@filename@2@ for
+ details. This class is for working with transformation
+ matrices. Some methods in ``QPDFPageObjectHelper`` make use of
+ this to make information about transformation matrices
+ available. For an example, see
+ @1@filename@1@examples/pdf-overlay-page.cc@2@filename@2@.
+
+ - Several new methods were added to
+ ``QPDFAcroFormDocumentHelper`` for adding, removing, getting
+ information about, and enumerating form fields.
+
+ - Add method
+ ``QPDFAcroFormDocumentHelper::transformAnnotations``, which
+ applies a transformation to each annotation on a page.
+
+ - Add ``QPDFPageObjectHelper::copyAnnotations``, which copies
+ annotations and, if applicable, associated form fields, from
+ one page to another, possibly transforming the rectangles.
+
+ - Build Changes
+
+ - A C++-14 compiler is now required to build qpdf. There is no
+ intention to require anything newer than that for a while.
+ C++-14 includes modest enhancements to C++-11 and appears to be
+ supported about as widely as C++-11.
+
+ - Bug Fixes
+
+ - The @1@option@1@--flatten-rotation@2@option@2@ option applies
+ transformations to any annotations that may be on the page.
+
+ - If a form XObject lacks a resources dictionary, consider any
+ names in that form XObject to be referenced from the containing
+ page. This is compliant with older PDF versions. Also detect if
+ any form XObjects have any unresolved names and, if so, don't
+ remove unreferenced resources from them or from the page that
+ contains them. Unfortunately this has the side effect of
+ preventing removal of unreferenced resources in some cases
+ where names appear that don't refer to resources, such as with
+ tagged PDF. This is a bit of a corner case that is not likely
+ to cause a significant problem in practice, but the only side
+ effect would be lack of removal of shared resources. A future
+ version of qpdf may be more sophisticated in its detection of
+ names that refer to resources.
+
+ - Properly handle strings if they appear in inline image
+ dictionaries while externalizing inline images.
+
+10.1.0: January 5, 2021
+ - CLI Enhancements
+
+ - Add @1@option@1@--flatten-rotation@2@option@2@ command-line
+ option, which causes all pages that are rotated using
+ parameters in the page's dictionary to instead be identically
+ rotated in the page's contents. The change is not user-visible
+ for compliant PDF readers but can be used to work around broken
+ PDF applications that don't properly handle page rotation.
+
+ - Library Enhancements
+
+ - Support for user-provided (pluggable, modular) stream filters.
+ It is now possible to derive a class from ``QPDFStreamFilter``
+ and register it with ``QPDF`` so that regular library methods,
+ including those used by ``QPDFWriter``, can decode streams with
+ filters not directly supported by the library. The example
+ @1@filename@1@examples/pdf-custom-filter.cc@2@filename@2@
+ illustrates how to use this capability.
+
+ - Add methods to ``QPDFPageObjectHelper`` to iterate through
+ XObjects on a page or form XObjects, possibly recursing into
+ nested form XObjects: ``forEachXObject``, ``ForEachImage``,
+ ``forEachFormXObject``.
+
+ - Enhance several methods in ``QPDFPageObjectHelper`` to work
+ with form XObjects as well as pages, as noted in comments. See
+ @1@filename@1@ChangeLog@2@filename@2@ for a full list.
+
+ - Rename some functions in ``QPDFPageObjectHelper``, while
+ keeping old names for compatibility:
+
+ - ``getPageImages`` to ``getImages``
+
+ - ``filterPageContents`` to ``filterContents``
+
+ - ``pipePageContents`` to ``pipeContents``
+
+ - ``parsePageContents`` to ``parseContents``
+
+ - Add method ``QPDFPageObjectHelper::getFormXObjects`` to return
+ a map of form XObjects directly on a page or form XObject
+
+ - Add new helper methods to ``QPDFObjectHandle``:
+ ``isFormXObject``, ``isImage``
+
+ - Add the optional ``allow_streams`` parameter
+ ``QPDFObjectHandle::makeDirect``. When
+ ``QPDFObjectHandle::makeDirect`` is called in this way, it
+ preserves references to streams rather than throwing an
+ exception.
+
+ - Add ``QPDFObjectHandle::setFilterOnWrite`` method. Calling this
+ on a stream prevents ``QPDFWriter`` from attempting to
+ uncompress, recompress, or otherwise filter a stream even if it
+ could. Developers can use this to protect streams that are
+ optimized should be protected from ``QPDFWriter``'s default
+ behavior for any other reason.
+
+ - Add ``ostream`` ``<<`` operator for ``QPDFObjGen``. This is
+ useful to have for debugging.
+
+ - Add method ``QPDFPageObjectHelper::flattenRotation``, which
+ replaces a page's ``/Rotate`` keyword by rotating the page
+ within the content stream and altering the page's bounding
+ boxes so the rendering is the same. This can be used to work
+ around buggy PDF readers that can't properly handle page
+ rotation.
+
+ - C API Enhancements
+
+ - Add several new functions to the C API for working with
+ objects. These are wrappers around many of the methods in
+ ``QPDFObjectHandle``. Their inclusion adds considerable new
+ capability to the C API.
+
+ - Add ``qpdf_register_progress_reporter`` to the C API,
+ corresponding to ``QPDFWriter::registerProgressReporter``.
+
+ - Performance Enhancements
+
+ - Improve steps ``QPDFWriter`` takes to prepare a ``QPDF`` object
+ for writing, resulting in about an 8% improvement in write
+ performance while allowing indirect objects to appear in
+ ``/DecodeParms``.
+
+ - When extracting pages, the @1@command@1@qpdf@2@command@2@ CLI
+ only removes unreferenced resources from the pages that are
+ being kept, resulting in a significant performance improvement
+ when extracting small numbers of pages from large, complex
+ documents.
+
+ - Bug Fixes
+
+ - ``QPDFPageObjectHelper::externalizeInlineImages`` was not
+ externalizing images referenced from form XObjects that
+ appeared on the page.
+
+ - ``QPDFObjectHandle::filterPageContents`` was broken for pages
+ with multiple content streams.
+
+ - Tweak zsh completion code to behave a little better with
+ respect to path completion.
+
+10.0.4: November 21, 2020
+ - Bug Fixes
+
+ - Fix a handful of integer overflows. This includes cases found
+ by fuzzing as well as having qpdf not do range checking on
+ unused values in the xref stream.
+
+10.0.3: October 31, 2020
+ - Bug Fixes
+
+ - The fix to the bug involving copying streams with indirect
+ filters was incorrect and introduced a new, more serious bug.
+ The original bug has been fixed correctly, as has the bug
+ introduced in 10.0.2.
+
+10.0.2: October 27, 2020
+ - Bug Fixes
+
+ - When concatenating content streams, as with
+ @1@option@1@--coalesce-contents@2@option@2@, there were cases
+ in which qpdf would merge two lexical tokens together, creating
+ invalid results. A newline is now inserted between merged
+ content streams if one is not already present.
+
+ - Fix an internal error that could occur when copying foreign
+ streams whose stream data had been replaced using a stream data
+ provider if those streams had indirect filters or decode
+ parameters. This is a rare corner case.
+
+ - Ensure that the caller's locale settings do not change the
+ results of numeric conversions performed internally by the qpdf
+ library. Note that the problem here could only be caused when
+ the qpdf library was used programmatically. Using the qpdf CLI
+ already ignored the user's locale for numeric conversion.
+
+ - Fix several instances in which warnings were not suppressed in
+ spite of @1@option@1@--no-warn@2@option@2@ and/or errors or
+ warnings were written to standard output rather than standard
+ error.
+
+ - Fixed a memory leak that could occur under specific
+ circumstances when
+ @1@option@1@--object-streams=generate@2@option@2@ was used.
+
+ - Fix various integer overflows and similar conditions found by
+ the OSS-Fuzz project.
+
+ - Enhancements
+
+ - New option @1@option@1@--warning-exit-0@2@option@2@ causes qpdf
+ to exit with a status of ``0`` rather than ``3`` if there are
+ warnings but no errors. Combine with
+ @1@option@1@--no-warn@2@option@2@ to completely ignore
+ warnings.
+
+ - Performance improvements have been made to
+ ``QPDF::processMemoryFile``.
+
+ - The OpenSSL crypto provider produces more detailed error
+ messages.
+
+ - Build Changes
+
+ - The option @1@option@1@--disable-rpath@2@option@2@ is now
+ supported by qpdf's @1@command@1@./configure@2@command@2@
+ script. Some distributions' packaging standards recommended the
+ use of this option.
+
+ - Selection of a printf format string for ``long
+ long`` has been moved from ``ifdefs`` to an autoconf
+ test. If you are using your own build system, you will need to
+ provide a value for ``LL_FMT`` in
+ @1@filename@1@libqpdf/qpdf/qpdf-config.h@2@filename@2@, which
+ would typically be ``"%lld"`` or, for some Windows compilers,
+ ``"%I64d"``.
+
+ - Several improvements were made to build-time configuration of
+ the OpenSSL crypto provider.
+
+ - A nearly stand-alone Linux binary zip file is now included with
+ the qpdf release. This is built on an older (but supported)
+ Ubuntu LTS release, but would work on most reasonably recent
+ Linux distributions. It contains only the executables and
+ required shared libraries that would not be present on a
+ minimal system. It can be used for including qpdf in a minimal
+ environment, such as a docker container. The zip file is also
+ known to work as a layer in AWS Lambda.
+
+ - QPDF's automated build has been migrated from Azure Pipelines
+ to GitHub Actions.
+
+ - Windows-specific Changes
+
+ - The Windows executables distributed with qpdf releases now use
+ the OpenSSL crypto provider by default. The native crypto
+ provider is also compiled in and can be selected at runtime
+ with the ``QPDF_CRYPTO_PROVIDER`` environment variable.
+
+ - Improvements have been made to how a cryptographic provider is
+ obtained in the native Windows crypto implementation. However
+ mostly this is shadowed by OpenSSL being used by default.
+
+10.0.1: April 9, 2020
+ - Bug Fixes
+
+ - 10.0.0 introduced a bug in which calling
+ ``QPDFObjectHandle::getStreamData`` on a stream that can't be
+ filtered was returning the raw data instead of throwing an
+ exception. This is now fixed.
+
+ - Fix a bug that was preventing qpdf from linking with some
+ versions of clang on some platforms.
+
+ - Enhancements
+
+ - Improve the @1@filename@1@pdf-invert-images@2@filename@2@
+ example to avoid having to load all the images into RAM at the
+ same time.
+
+10.0.0: April 6, 2020
+ - Performance Enhancements
+
+ - The qpdf library and executable should run much faster in this
+ version than in the last several releases. Several internal
+ library optimizations have been made, and there has been
+ improved behavior on page splitting as well. This version of
+ qpdf should outperform any of the 8.x or 9.x versions.
+
+ - Incompatible API (source-level) Changes (minor)
+
+ - The ``QUtil::srandom`` method was removed. It didn't do
+ anything unless insecure random numbers were compiled in, and
+ they have been off by default for a long time. If you were
+ calling it, just remove the call since it wasn't doing anything
+ anyway.
+
+ - Build/Packaging Changes
+
+ - Add a ``openssl`` crypto provider, which is implemented with
+ OpenSSL and also works with BoringSSL. Thanks to Dean Scarff
+ for this contribution. If you maintain qpdf for a distribution,
+ pay special attention to make sure that you are including
+ support for the crypto providers you want. Package maintainers
+ will have to weigh the advantages of allowing users to pick a
+ crypto provider at runtime against the disadvantages of adding
+ more dependencies to qpdf.
+
+ - Allow qpdf to built on stripped down systems whose C/C++
+ libraries lack the ``wchar_t`` type. Search for ``wchar_t`` in
+ qpdf's README.md for details. This should be very rare, but it
+ is known to be helpful in some embedded environments.
+
+ - CLI Enhancements
+
+ - Add ``objectinfo`` key to the JSON output. This will be a place
+ to put computed metadata or other information about PDF objects
+ that are not immediately evident in other ways or that seem
+ useful for some other reason. In this version, information is
+ provided about each object indicating whether it is a stream
+ and, if so, what its length and filters are. Without this, it
+ was not possible to tell conclusively from the JSON output
+ alone whether or not an object was a stream. Run
+ @1@command@1@qpdf --json-help@2@command@2@ for details.
+
+ - Add new option
+ @1@option@1@--remove-unreferenced-resources@2@option@2@ which
+ takes ``auto``, ``yes``, or ``no`` as arguments. The new
+ ``auto`` mode, which is the default, performs a fast heuristic
+ over a PDF file when splitting pages to determine whether the
+ expensive process of finding and removing unreferenced
+ resources is likely to be of benefit. For most files, this new
+ default will result in a significant performance improvement
+ for splitting pages. See `Advanced Transformation
+ Options <#ref.advanced-transformation>`__ for a more detailed
+ discussion.
+
+ - The @1@option@1@--preserve-unreferenced-resources@2@option@2@
+ is now just a synonym for
+ @1@option@1@--remove-unreferenced-resources=no@2@option@2@.
+
+ - If the ``QPDF_EXECUTABLE`` environment variable is set when
+ invoking @1@command@1@qpdf --bash-completion@2@command@2@ or
+ @1@command@1@qpdf --zsh-completion@2@command@2@, the completion
+ command that it outputs will refer to qpdf using the value of
+ that variable rather than what @1@command@1@qpdf@2@command@2@
+ determines its executable path to be. This can be useful when
+ wrapping @1@command@1@qpdf@2@command@2@ with a script, working
+ with a version in the source tree, using an AppImage, or other
+ situations where there is some indirection.
+
+ - Library Enhancements
+
+ - Random number generation is now delegated to the crypto
+ provider. The old behavior is still used by the native crypto
+ provider. It is still possible to provide your own random
+ number generator.
+
+ - Add a new version of
+ ``QPDFObjectHandle::StreamDataProvider::provideStreamData``
+ that accepts the ``suppress_warnings`` and ``will_retry``
+ options and allows a success code to be returned. This makes it
+ possible to implement a ``StreamDataProvider`` that calls
+ ``pipeStreamData`` on another stream and to pass the response
+ back to the caller, which enables better error handling on
+ those proxied streams.
+
+ - Update ``QPDFObjectHandle::pipeStreamData`` to return an
+ overall success code that goes beyond whether or not filtered
+ data was written successfully. This allows better error
+ handling of cases that were not filtering errors. You have to
+ call this explicitly. Methods in previously existing APIs have
+ the same semantics as before.
+
+ - The ``QPDFPageObjectHelper::placeFormXObject`` method now
+ allows separate control over whether it should be willing to
+ shrink or expand objects to fit them better into the
+ destination rectangle. The previous behavior was that shrinking
+ was allowed but expansion was not. The previous behavior is
+ still the default.
+
+ - When calling the C API, any non-zero value passed to a boolean
+ parameter is treated as ``TRUE``. Previously only the value
+ ``1`` was accepted. This makes the C API behave more like most
+ C interfaces and is known to improve compatibility with some
+ Windows environments that dynamically load the DLL and call
+ functions from it.
+
+ - Add ``QPDFObjectHandle::unsafeShallowCopy`` for copying only
+ top-level dictionary keys or array items. This is unsafe
+ because it creates a situation in which changing a lower-level
+ item in one object may also change it in another object, but
+ for cases in which you *know* you are only inserting or
+ replacing top-level items, it is much faster than
+ ``QPDFObjectHandle::shallowCopy``.
+
+ - Add ``QPDFObjectHandle::filterAsContents``, which filter's a
+ stream's data as a content stream. This is useful for parsing
+ the contents for form XObjects in the same way as parsing page
+ content streams.
+
+ - Bug Fixes
+
+ - When detecting and removing unreferenced resources during page
+ splitting, traverse into form XObjects and handle their
+ resources dictionaries as well.
+
+ - The same error recovery is applied to streams in other than the
+ primary input file when merging or splitting pages.
+
+9.1.1: January 26, 2020
+ - Build/Packaging Changes
+
+ - The fix-qdf program was converted from perl to C++. As such,
+ qpdf no longer has a runtime dependency on perl.
+
+ - Library Enhancements
+
+ - Added new helper routine ``QUtil::call_main_from_wmain`` which
+ converts ``wchar_t`` arguments to UTF-8 encoded strings. This
+ is useful for qpdf because library methods expect file names to
+ be UTF-8 encoded, even on Windows
+
+ - Added new ``QUtil::read_lines_from_file`` methods that take
+ ``FILE*`` arguments and that allow preservation of end-of-line
+ characters. This also fixes a bug where
+ ``QUtil::read_lines_from_file`` wouldn't work properly with
+ Unicode filenames.
+
+ - CLI Enhancements
+
+ - Added options @1@option@1@--is-encrypted@2@option@2@ and
+ @1@option@1@--requires-password@2@option@2@ for testing whether
+ a file is encrypted or requires a password other than the
+ supplied (or empty) password. These communicate via exit
+ status, making them useful for shell scripts. They also work on
+ encrypted files with unknown passwords.
+
+ - Added ``encrypt`` key to JSON options. With the exception of
+ the reconstructed user password for older encryption formats,
+ this provides the same information as
+ @1@option@1@--show-encryption@2@option@2@ but in a consistent,
+ parseable format. See output of @1@command@1@qpdf
+ --json-help@2@command@2@ for details.
+
+ - Bug Fixes
+
+ - In QDF mode, be sure not to write more than one XRef stream to
+ a file, even when
+ @1@option@1@--preserve-unreferenced@2@option@2@ is used.
+ @1@command@1@fix-qdf@2@command@2@ assumes that there is only
+ one XRef stream, and that it appears at the end of the file.
+
+ - When externalizing inline images, properly handle images whose
+ color space is a reference to an object in the page's resource
+ dictionary.
+
+ - Windows-specific fix for acquiring crypt context with a new
+ keyset.
+
+9.1.0: November 17, 2019
+ - Build Changes
+
+ - A C++-11 compiler is now required to build qpdf.
+
+ - A new crypto provider that uses gnutls for crypto functions is
+ now available and can be enabled at build time. See `Crypto
+ Providers <#ref.crypto>`__ for more information about crypto
+ providers and `Build Support For Crypto
+ Providers <#ref.crypto.build>`__ for specific information about
+ the build.
+
+ - Library Enhancements
+
+ - Incorporate contribution from Masamichi Hosoda to properly
+ handle signature dictionaries by not including them in object
+ streams, formatting the ``Contents`` key has a hexadecimal
+ string, and excluding the ``/Contents`` key from encryption and
+ decryption.
+
+ - Incorporate contribution from Masamichi Hosoda to provide new
+ API calls for getting file-level information about input and
+ output files, enabling certain operations on the files at the
+ file level rather than the object level. New methods include
+ ``QPDF::getXRefTable()``,
+ ``QPDFObjectHandle::getParsedOffset()``,
+ ``QPDFWriter::getRenumberedObjGen(QPDFObjGen)``, and
+ ``QPDFWriter::getWrittenXRefTable()``.
+
+ - Support build-time and runtime selectable crypto providers.
+ This includes the addition of new classes
+ ``QPDFCryptoProvider`` and ``QPDFCryptoImpl`` and the
+ recognition of the ``QPDF_CRYPTO_PROVIDER`` environment
+ variable. Crypto providers are described in depth in `Crypto
+ Providers <#ref.crypto>`__.
+
+ - CLI Enhancements
+
+ - Addition of the @1@option@1@--show-crypto@2@option@2@ option in
+ support of selectable crypto providers, as described in `Crypto
+ Providers <#ref.crypto>`__.
+
+ - Allow ``:even`` or ``:odd`` to be appended to numeric ranges
+ for specification of the even or odd pages from among the pages
+ specified in the range.
+
+ - Fix shell wildcard expansion behavior (``*`` and ``?``) of the
+ @1@command@1@qpdf.exe@2@command@2@ as built my MSVC.
+
+9.0.2: October 12, 2019
+ - Bug Fix
+
+ - Fix the name of the temporary file used by
+ @1@option@1@--replace-input@2@option@2@ so that it doesn't
+ require path splitting and works with paths include
+ directories.
+
+9.0.1: September 20, 2019
+ - Bug Fixes/Enhancements
+
+ - Fix some build and test issues on big-endian systems and
+ compilers with characters that are unsigned by default. The
+ problems were in build and test only. There were no actual bugs
+ in the qpdf library itself relating to endianness or unsigned
+ characters.
+
+ - When a dictionary has a duplicated key, report this with a
+ warning. The behavior of the library in this case is unchanged,
+ but the error condition is no longer silently ignored.
+
+ - When a form field's display rectangle is erroneously specified
+ with inverted coordinates, detect and correct this situation.
+ This avoids some form fields from being flipped when flattening
+ annotations on files with this condition.
+
+9.0.0: August 31, 2019
+ - Incompatible API (source-level) Changes (minor)
+
+ - The method ``QUtil::strcasecmp`` has been renamed to
+ ``QUtil::str_compare_nocase``. This incompatible change is
+ necessary to enable qpdf to build on platforms that define
+ ``strcasecmp`` as a macro.
+
+ - The ``QPDF::copyForeignObject`` method had an overloaded
+ version that took a boolean parameter that was not used. If you
+ were using this version, just omit the extra parameter.
+
+ - There was a version ``QPDFTokenizer::expectInlineImage`` that
+ took no arguments. This version has been removed since it
+ caused the tokenizer to return incorrect inline images. A new
+ version was added some time ago that produces correct output.
+ This is a very low level method that doesn't make sense to call
+ outside of qpdf's lexical engine. There are higher level
+ methods for tokenizing content streams.
+
+ - Change ``QPDFOutlineDocumentHelper::getTopLevelOutlines`` and
+ ``QPDFOutlineObjectHelper::getKids`` to return a
+ ``std::vector`` instead of a ``std::list`` of
+ ``QPDFOutlineObjectHelper`` objects.
+
+ - Remove method ``QPDFTokenizer::allowPoundAnywhereInName``. This
+ function would allow creation of name tokens whose value would
+ change when unparsed, which is never the correct behavior.
+
+ - CLI Enhancements
+
+ - The @1@option@1@--replace-input@2@option@2@ option may be given
+ in place of an output file name. This causes qpdf to overwrite
+ the input file with the output. See the description of
+ @1@option@1@--replace-input@2@option@2@ in `Basic
+ Options <#ref.basic-options>`__ for more details.
+
+ - The @1@option@1@--recompress-flate@2@option@2@ instructs
+ @1@command@1@qpdf@2@command@2@ to recompress streams that are
+ already compressed with ``/FlateDecode``. Useful with
+ @1@option@1@--compression-level@2@option@2@.
+
+ - The
+ @1@option@1@--compression-level=@1@replaceable@1@level@2@replaceable@2@@2@option@2@
+ sets the zlib compression level used for any streams compressed
+ by ``/FlateDecode``. Most effective when combined with
+ @1@option@1@--recompress-flate@2@option@2@.
+
+ - Library Enhancements
+
+ - A new namespace ``QIntC``, provided by
+ @1@filename@1@qpdf/QIntC.hh@2@filename@2@, provides safe
+ conversion methods between different integer types. These
+ conversion methods do range checking to ensure that the cast
+ can be performed with no loss of information. Every use of
+ ``static_cast`` in the library was inspected to see if it could
+ use one of these safe converters instead. See `Casting
+ Policy <#ref.casting>`__ for additional details.
+
+ - Method ``QPDF::anyWarnings`` tells whether there have been any
+ warnings without clearing the list of warnings.
+
+ - Method ``QPDF::closeInputSource`` closes or otherwise releases
+ the input source. This enables the input file to be deleted or
+ renamed.
+
+ - New methods have been added to ``QUtil`` for converting back
+ and forth between strings and unsigned integers:
+ ``uint_to_string``, ``uint_to_string_base``,
+ ``string_to_uint``, and ``string_to_ull``.
+
+ - New methods have been added to ``QPDFObjectHandle`` that return
+ the value of ``Integer`` objects as ``int`` or ``unsigned int``
+ with range checking and sensible fallback values, and a new
+ method was added to return an unsigned value. This makes it
+ easier to write code that is safe from unintentional data loss.
+ Functions: ``getUIntValue``, ``getIntValueAsInt``,
+ ``getUIntValueAsUInt``.
+
+ - When parsing content streams with
+ ``QPDFObjectHandle::ParserCallbacks``, in place of the method
+ ``handleObject(QPDFObjectHandle)``, the developer may override
+ ``handleObject(QPDFObjectHandle, size_t offset,
+ size_t length)``. If this method is defined, it will
+ be invoked with the object along with its offset and length
+ within the overall contents being parsed. Intervening spaces
+ and comments are not included in offset and length.
+ Additionally, a new method ``contentSize(size_t)`` may be
+ implemented. If present, it will be called prior to the first
+ call to ``handleObject`` with the total size in bytes of the
+ combined contents.
+
+ - New methods ``QPDF::userPasswordMatched`` and
+ ``QPDF::ownerPasswordMatched`` have been added to enable a
+ caller to determine whether the supplied password was the user
+ password, the owner password, or both. This information is also
+ displayed by @1@command@1@qpdf --show-encryption@2@command@2@
+ and @1@command@1@qpdf --check@2@command@2@.
+
+ - Static method ``Pl_Flate::setCompressionLevel`` can be called
+ to set the zlib compression level globally used by all
+ instances of Pl_Flate in deflate mode.
+
+ - The method ``QPDFWriter::setRecompressFlate`` can be called to
+ tell ``QPDFWriter`` to uncompress and recompress streams
+ already compressed with ``/FlateDecode``.
+
+ - The underlying implementation of QPDF arrays has been enhanced
+ to be much more memory efficient when dealing with arrays with
+ lots of nulls. This enables qpdf to use drastically less memory
+ for certain types of files.
+
+ - When traversing the pages tree, if nodes are encountered with
+ invalid types, the types are fixed, and a warning is issued.
+
+ - A new helper method ``QUtil::read_file_into_memory`` was added.
+
+ - All conditions previously reported by
+ ``QPDF::checkLinearization()`` as errors are now presented as
+ warnings.
+
+ - Name tokens containing the ``#`` character not preceded by two
+ hexadecimal digits, which is invalid in PDF 1.2 and above, are
+ properly handled by the library: a warning is generated, and
+ the name token is properly preserved, even if invalid, in the
+ output. See @1@filename@1@ChangeLog@2@filename@2@ for a more
+ complete description of this change.
+
+ - Bug Fixes
+
+ - A small handful of memory issues, assertion failures, and
+ unhandled exceptions that could occur on badly mangled input
+ files have been fixed. Most of these problems were found by
+ Google's OSS-Fuzz project.
+
+ - When @1@command@1@qpdf --check@2@command@2@ or
+ @1@command@1@qpdf --check-linearization@2@command@2@ encounters
+ a file with linearization warnings but not errors, it now
+ properly exits with exit code 3 instead of 2.
+
+ - The @1@option@1@--completion-bash@2@option@2@ and
+ @1@option@1@--completion-zsh@2@option@2@ options now work
+ properly when qpdf is invoked as an AppImage.
+
+ - Calling ``QPDFWriter::set*EncryptionParameters`` on a
+ ``QPDFWriter`` object whose output filename has not yet been
+ set no longer produces a segmentation fault.
+
+ - When reading encrypted files, follow the spec more closely
+ regarding encryption key length. This allows qpdf to open
+ encrypted files in most cases when they have invalid or missing
+ /Length keys in the encryption dictionary.
+
+ - Build Changes
+
+ - On platforms that support it, qpdf now builds with
+ @1@option@1@-fvisibility=hidden@2@option@2@. If you build qpdf
+ with your own build system, this is now safe to use. This
+ prevents methods that are not part of the public API from being
+ exported by the shared library, and makes qpdf's ELF shared
+ libraries (used on Linux, MacOS, and most other UNIX flavors)
+ behave more like the Windows DLL. Since the DLL already behaves
+ in much this way, it is unlikely that there are any methods
+ that were accidentally not exported. However, with ELF shared
+ libraries, typeinfo for some classes has to be explicitly
+ exported. If there are problems in dynamically linked code
+ catching exceptions or subclassing, this could be the reason.
+ If you see this, please report a bug at
+ https://github.com/qpdf/qpdf/issues/.
+
+ - QPDF is now compiled with integer conversion and sign
+ conversion warnings enabled. Numerous changes were made to the
+ library to make this safe.
+
+ - QPDF's @1@command@1@make install@2@command@2@ target explicitly
+ specifies the mode to use when installing files instead of
+ relying the user's umask. It was previously doing this for some
+ files but not others.
+
+ - If @1@command@1@pkg-config@2@command@2@ is available, use it to
+ locate @1@filename@1@libjpeg@2@filename@2@ and
+ @1@filename@1@zlib@2@filename@2@ dependencies, falling back on
+ old behavior if unsuccessful.
+
+ - Other Notes
+
+ - QPDF has been fully integrated into `Google's OSS-Fuzz
+ project <https://github.com/google/oss-fuzz>`__. This project
+ exercises code with randomly mutated inputs and is great for
+ discovering hidden security crashes and security issues.
+ Several bugs found by oss-fuzz have already been fixed in qpdf.
+
+8.4.2: May 18, 2019
+ This release has just one change: correction of a buffer overrun in
+ the Windows code used to open files. Windows users should take this
+ update. There are no code changes that affect non-Windows releases.
+
+8.4.1: April 27, 2019
+ - Enhancements
+
+ - When @1@command@1@qpdf --version@2@command@2@ is run, it will
+ detect if the qpdf CLI was built with a different version of
+ qpdf than the library, which may indicate a problem with the
+ installation.
+
+ - New option @1@option@1@--remove-page-labels@2@option@2@ will
+ remove page labels before generating output. This used to
+ happen if you ran @1@command@1@qpdf --empty --pages ..
+ --@2@command@2@, but the behavior changed in qpdf 8.3.0. This
+ option enables people who were relying on the old behavior to
+ get it again.
+
+ - New option
+ @1@option@1@--keep-files-open-threshold=@1@replaceable@1@count@2@replaceable@2@@2@option@2@
+ can be used to override number of files that qpdf will use to
+ trigger the behavior of not keeping all files open when merging
+ files. This may be necessary if your system allows fewer than
+ the default value of 200 files to be open at the same time.
+
+ - Bug Fixes
+
+ - Handle Unicode characters in filenames on Windows. The changes
+ to support Unicode on the CLI in Windows broke Unicode
+ filenames for Windows.
+
+ - Slightly tighten logic that determines whether an object is a
+ page. This should resolve problems in some rare files where
+ some non-page objects were passing qpdf's test for whether
+ something was a page, thus causing them to be erroneously lost
+ during page splitting operations.
+
+ - Revert change that included preservation of outlines
+ (bookmarks) in @1@option@1@--split-pages@2@option@2@. The way
+ it was implemented in 8.3.0 and 8.4.0 caused a very significant
+ degradation of performance for splitting certain files. A
+ future release of qpdf may re-introduce the behavior in a more
+ performant and also more correct fashion.
+
+ - In JSON mode, add missing leading 0 to decimal values between
+ -1 and 1 even if not present in the input. The JSON
+ specification requires the leading 0. The PDF specification
+ does not.
+
+8.4.0: February 1, 2019
+ - Command-line Enhancements
+
+ - *Non-compatible CLI change:* The qpdf command-line tool
+ interprets passwords given at the command-line differently from
+ previous releases when the passwords contain non-ASCII
+ characters. In some cases, the behavior differs from previous
+ releases. For a discussion of the current behavior, please see
+ `Unicode Passwords <#ref.unicode-passwords>`__. The
+ incompatibilities are as follows:
+
+ - On Windows, qpdf now receives all command-line options as
+ Unicode strings if it can figure out the appropriate
+ compile/link options. This is enabled at least for MSVC and
+ mingw builds. That means that if non-ASCII strings are
+ passed to the qpdf CLI in Windows, qpdf will now correctly
+ receive them. In the past, they would have either been
+ encoded as Windows code page 1252 (also known as "Windows
+ ANSI" or as something unintelligible. In almost all cases,
+ qpdf is able to properly interpret Unicode arguments now,
+ whereas in the past, it would almost never interpret them
+ properly. The result is that non-ASCII passwords given to
+ the qpdf CLI on Windows now have a much greater chance of
+ creating PDF files that can be opened by a variety of
+ readers. In the past, usually files encrypted from the
+ Windows CLI using non-ASCII passwords would not be readable
+ by most viewers. Note that the current version of qpdf is
+ able to decrypt files that it previously created using the
+ previously supplied password.
+
+ - The PDF specification requires passwords to be encoded as
+ UTF-8 for 256-bit encryption and with PDF Doc encoding for
+ 40-bit or 128-bit encryption. Older versions of qpdf left it
+ up to the user to provide passwords with the correct
+ encoding. The qpdf CLI now detects when a password is given
+ with UTF-8 encoding and automatically transcodes it to what
+ the PDF spec requires. While this is almost always the
+ correct behavior, it is possible to override the behavior if
+ there is some reason to do so. This is discussed in more
+ depth in `Unicode Passwords <#ref.unicode-passwords>`__.
+
+ - New options
+ @1@option@1@--externalize-inline-images@2@option@2@,
+ @1@option@1@--ii-min-bytes@2@option@2@, and
+ @1@option@1@--keep-inline-images@2@option@2@ control qpdf's
+ handling of inline images and possible conversion of them to
+ regular images. By default,
+ @1@option@1@--optimize-images@2@option@2@ now also applies to
+ inline images. These options are discussed in `Advanced
+ Transformation Options <#ref.advanced-transformation>`__.
+
+ - Add options @1@option@1@--overlay@2@option@2@ and
+ @1@option@1@--underlay@2@option@2@ for overlaying or
+ underlaying pages of other files onto output pages. See
+ `Overlay and Underlay Options <#ref.overlay-underlay>`__ for
+ details.
+
+ - When opening an encrypted file with a password, if the
+ specified password doesn't work and the password contains any
+ non-ASCII characters, qpdf will try a number of alternative
+ passwords to try to compensate for possible character encoding
+ errors. This behavior can be suppressed with the
+ @1@option@1@--suppress-password-recovery@2@option@2@ option.
+ See `Unicode Passwords <#ref.unicode-passwords>`__ for a full
+ discussion.
+
+ - Add the @1@option@1@--password-mode@2@option@2@ option to
+ fine-tune how qpdf interprets password arguments, especially
+ when they contain non-ASCII characters. See `Unicode
+ Passwords <#ref.unicode-passwords>`__ for more information.
+
+ - In the @1@option@1@--pages@2@option@2@ option, it is now
+ possible to copy the same page more than once from the same
+ file without using the previous workaround of specifying two
+ different paths to the same file.
+
+ - In the @1@option@1@--pages@2@option@2@ option, allow use of "."
+ as a shortcut for the primary input file. That way, you can do
+ @1@command@1@qpdf in.pdf --pages . 1-2 -- out.pdf@2@command@2@
+ instead of having to repeat @1@filename@1@in.pdf@2@filename@2@
+ in the command.
+
+ - When encrypting with 128-bit and 256-bit encryption, new
+ encryption options @1@option@1@--assemble@2@option@2@,
+ @1@option@1@--annotate@2@option@2@,
+ @1@option@1@--form@2@option@2@, and
+ @1@option@1@--modify-other@2@option@2@ allow more fine-grained
+ granularity in configuring options. Before, the
+ @1@option@1@--modify@2@option@2@ option only configured certain
+ predefined groups of permissions.
+
+ - Bug Fixes and Enhancements
+
+ - *Potential data-loss bug:* Versions of qpdf between 8.1.0 and
+ 8.3.0 had a bug that could cause page splitting and merging
+ operations to drop some font or image resources if the PDF
+ file's internal structure shared these resource lists across
+ pages and if some but not all of the pages in the output did
+ not reference all the fonts and images. Using the
+ @1@option@1@--preserve-unreferenced-resources@2@option@2@
+ option would work around the incorrect behavior. This bug was
+ the result of a typo in the code and a deficiency in the test
+ suite. The case that triggered the error was known, just not
+ handled properly. This case is now exercised in qpdf's test
+ suite and properly handled.
+
+ - When optimizing images, detect and refuse to optimize images
+ that can't be converted to JPEG because of bit depth or color
+ space.
+
+ - Linearization and page manipulation APIs now detect and recover
+ from files that have duplicate Page objects in the pages tree.
+
+ - Using older option
+ @1@option@1@--stream-data=compress@2@option@2@ with object
+ streams, object streams and xref streams were not compressed.
+
+ - When the tokenizer returns inline image tokens, delimiters
+ following ``ID`` and ``EI`` operators are no longer excluded.
+ This makes it possible to reliably extract the actual image
+ data.
+
+ - Library Enhancements
+
+ - Add method ``QPDFPageObjectHelper::externalizeInlineImages`` to
+ convert inline images to regular images.
+
+ - Add method ``QUtil::possible_repaired_encodings()`` to generate
+ a list of strings that represent other ways the given string
+ could have been encoded. This is the method the QPDF CLI uses
+ to generate the strings it tries when recovering incorrectly
+ encoded Unicode passwords.
+
+ - Add new versions of
+ ``QPDFWriter::setR{3,4,5,6}EncryptionParameters`` that allow
+ more granular setting of permissions bits. See
+ @1@filename@1@QPDFWriter.hh@2@filename@2@ for details.
+
+ - Add new versions of the transcoders from UTF-8 to single-byte
+ coding systems in ``QUtil`` that report success or failure
+ rather than just substituting a specified unknown character.
+
+ - Add method ``QUtil::analyze_encoding()`` to determine whether a
+ string has high-bit characters and is appears to be UTF-16 or
+ valid UTF-8 encoding.
+
+ - Add new method ``QPDFPageObjectHelper::shallowCopyPage()`` to
+ copy a new page that is a "shallow copy" of a page. The
+ resulting object is an indirect object ready to be passed to
+ ``QPDFPageDocumentHelper::addPage()`` for either the original
+ ``QPDF`` object or a different one. This is what the
+ @1@command@1@qpdf@2@command@2@ command-line tool uses to copy
+ the same page multiple times from the same file during
+ splitting and merging operations.
+
+ - Add method ``QPDF::getUniqueId()``, which returns a unique
+ identifier for the given QPDF object. The identifier will be
+ unique across the life of the application. The returned value
+ can be safely used as a map key.
+
+ - Add method ``QPDF::setImmediateCopyFrom``. This further
+ enhances qpdf's ability to allow a ``QPDF`` object from which
+ objects are being copied to go out of scope before the
+ destination object is written. If you call this method on a
+ ``QPDF`` instances, objects copied *from* this instance will be
+ copied immediately instead of lazily. This option uses more
+ memory but allows the source object to go out of scope before
+ the destination object is written in all cases. See comments in
+ @1@filename@1@QPDF.hh@2@filename@2@ for details.
+
+ - Add method ``QPDFPageObjectHelper::getAttribute`` for
+ retrieving an attribute from the page dictionary taking
+ inheritance into consideration, and optionally making a copy if
+ your intention is to modify the attribute.
+
+ - Fix long-standing limitation of
+ ``QPDFPageObjectHelper::getPageImages`` so that it now properly
+ reports images from inherited resources dictionaries,
+ eliminating the need to call
+ ``QPDFPageDocumentHelper::pushInheritedAttributesToPage`` in
+ this case.
+
+ - Add method ``QPDFObjectHandle::getUniqueResourceName`` for
+ finding an unused name in a resource dictionary.
+
+ - Add method ``QPDFPageObjectHelper::getFormXObjectForPage`` for
+ generating a form XObject equivalent to a page. The resulting
+ object can be used in the same file or copied to another file
+ with ``copyForeignObject``. This can be useful for implementing
+ underlay, overlay, n-up, thumbnails, or any other functionality
+ requiring replication of pages in other contexts.
+
+ - Add method ``QPDFPageObjectHelper::placeFormXObject`` for
+ generating content stream text that places a given form XObject
+ on a page, centered and fit within a specified rectangle. This
+ method takes care of computing the proper transformation matrix
+ and may optionally compensate for rotation or scaling of the
+ destination page.
+
+ - Build Improvements
+
+ - Add new configure option
+ @1@option@1@--enable-avoid-windows-handle@2@option@2@, which
+ causes the preprocessor symbol ``AVOID_WINDOWS_HANDLE`` to be
+ defined. When defined, qpdf will avoid referencing the Windows
+ ``HANDLE`` type, which is disallowed with certain versions of
+ the Windows SDK.
+
+ - For Windows builds, attempt to determine what options, if any,
+ have to be passed to the compiler and linker to enable use of
+ ``wmain``. This causes the preprocessor symbol
+ ``WINDOWS_WMAIN`` to be defined. If you do your own builds with
+ other compilers, you can define this symbol to cause ``wmain``
+ to be used. This is needed to allow the Windows
+ @1@command@1@qpdf@2@command@2@ command to receive Unicode
+ command-line options.
+
+8.3.0: January 7, 2019
+ - Command-line Enhancements
+
+ - Shell completion: you can now use eval @1@command@1@$(qpdf
+ --completion-bash)@2@command@2@ and eval @1@command@1@$(qpdf
+ --completion-zsh)@2@command@2@ to enable shell completion for
+ bash and zsh.
+
+ - Page numbers (also known as page labels) are now preserved when
+ merging and splitting files with the
+ @1@option@1@--pages@2@option@2@ and
+ @1@option@1@--split-pages@2@option@2@ options.
+
+ - Bookmarks are partially preserved when splitting pages with the
+ @1@option@1@--split-pages@2@option@2@ option. Specifically, the
+ outlines dictionary and some supporting metadata are copied
+ into the split files. The result is that all bookmarks from the
+ original file appear, those that point to pages that are
+ preserved work, and those that point to pages that are not
+ preserved don't do anything. This is an interim step toward
+ proper support for bookmarks in splitting and merging
+ operations.
+
+ - Page collation: add new option
+ @1@option@1@--collate@2@option@2@. When specified, the
+ semantics of @1@option@1@--pages@2@option@2@ change from
+ concatenation to collation. See `Page Selection
+ Options <#ref.page-selection>`__ for examples and discussion.
+
+ - Generation of information in JSON format, primarily to
+ facilitate use of qpdf from languages other than C++. Add new
+ options @1@option@1@--json@2@option@2@,
+ @1@option@1@--json-key@2@option@2@, and
+ @1@option@1@--json-object@2@option@2@ to generate a JSON
+ representation of the PDF file. Run @1@command@1@qpdf
+ --json-help@2@command@2@ to get a description of the JSON
+ format. For more information, see `QPDF JSON <#ref.json>`__.
+
+ - The @1@option@1@--generate-appearances@2@option@2@ flag will
+ cause qpdf to generate appearances for form fields if the PDF
+ file indicates that form field appearances are out of date.
+ This can happen when PDF forms are filled in by a program that
+ doesn't know how to regenerate the appearances of the filled-in
+ fields.
+
+ - The @1@option@1@--flatten-annotations@2@option@2@ flag can be
+ used to *flatten* annotations, including form fields.
+ Ordinarily, annotations are drawn separately from the page.
+ Flattening annotations is the process of combining their
+ appearances into the page's contents. You might want to do this
+ if you are going to rotate or combine pages using a tool that
+ doesn't understand about annotations. You may also want to use
+ @1@option@1@--generate-appearances@2@option@2@ when using this
+ flag since annotations for outdated form fields are not
+ flattened as that would cause loss of information.
+
+ - The @1@option@1@--optimize-images@2@option@2@ flag tells qpdf
+ to recompresses every image using DCT (JPEG) compression as
+ long as the image is not already compressed with lossy
+ compression and recompressing the image reduces its size. The
+ additional options @1@option@1@--oi-min-width@2@option@2@,
+ @1@option@1@--oi-min-height@2@option@2@, and
+ @1@option@1@--oi-min-area@2@option@2@ prevent recompression of
+ images whose width, height, or pixel area (width × height) are
+ below a specified threshold.
+
+ - The @1@option@1@--show-object@2@option@2@ option can now be
+ given as @1@option@1@--show-object=trailer@2@option@2@ to show
+ the trailer dictionary.
+
+ - Bug Fixes and Enhancements
+
+ - QPDF now automatically detects and recovers from dangling
+ references. If a PDF file contained an indirect reference to a
+ non-existent object, which is valid, when adding a new object
+ to the file, it was possible for the new object to take the
+ object ID of the dangling reference, thereby causing the
+ dangling reference to point to the new object. This case is now
+ prevented.
+
+ - Fixes to form field setting code: strings are always written in
+ UTF-16 format, and checkboxes and radio buttons are handled
+ properly with respect to synchronization of values and
+ appearance states.
+
+ - The ``QPDF::checkLinearization()`` no longer causes the program
+ to crash when it detects problems with linearization data.
+ Instead, it issues a normal warning or error.
+
+ - Ordinarily qpdf treats an argument of the form
+ @1@option@1@@file@2@option@2@ to mean that command-line options
+ should be read from @1@filename@1@file@2@filename@2@. Now, if
+ @1@filename@1@file@2@filename@2@ does not exist but
+ @1@filename@1@@file@2@filename@2@ does, qpdf will treat
+ @1@filename@1@@file@2@filename@2@ as a regular option. This
+ makes it possible to work more easily with PDF files whose
+ names happen to start with the ``@`` character.
+
+ - Library Enhancements
+
+ - Remove the restriction in most cases that the source QPDF
+ object used in a ``QPDF::copyForeignObject`` call has to stick
+ around until the destination QPDF is written. The exceptional
+ case is when the source stream gets is data using a
+ QPDFObjectHandle::StreamDataProvider. For a more in-depth
+ discussion, see comments around ``copyForeignObject`` in
+ @1@filename@1@QPDF.hh@2@filename@2@.
+
+ - Add new method ``QPDFWriter::getFinalVersion()``, which returns
+ the PDF version that will ultimately be written to the final
+ file. See comments in @1@filename@1@QPDFWriter.hh@2@filename@2@
+ for some restrictions on its use.
+
+ - Add several methods for transcoding strings to some of the
+ character sets used in PDF files: ``QUtil::utf8_to_ascii``,
+ ``QUtil::utf8_to_win_ansi``, ``QUtil::utf8_to_mac_roman``, and
+ ``QUtil::utf8_to_utf16``. For the single-byte encodings that
+ support only a limited character sets, these methods replace
+ unsupported characters with a specified substitute.
+
+ - Add new methods to ``QPDFAnnotationObjectHelper`` and
+ ``QPDFFormFieldObjectHelper`` for querying flags and
+ interpretation of different field types. Define constants in
+ @1@filename@1@qpdf/Constants.h@2@filename@2@ to help with
+ interpretation of flag values.
+
+ - Add new methods
+ ``QPDFAcroFormDocumentHelper::generateAppearancesIfNeeded`` and
+ ``QPDFFormFieldObjectHelper::generateAppearance`` for
+ generating appearance streams. See discussion in
+ @1@filename@1@QPDFFormFieldObjectHelper.hh@2@filename@2@ for
+ limitations.
+
+ - Add two new helper functions for dealing with resource
+ dictionaries: ``QPDFObjectHandle::getResourceNames()`` returns
+ a list of all second-level keys, which correspond to the names
+ of resources, and ``QPDFObjectHandle::mergeResources()`` merges
+ two resources dictionaries as long as they have non-conflicting
+ keys. These methods are useful for certain types of objects
+ that resolve resources from multiple places, such as form
+ fields.
+
+ - Add methods ``QPDFPageDocumentHelper::flattenAnnotations()``
+ and
+ ``QPDFAnnotationObjectHelper::getPageContentForAppearance()``
+ for handling low-level details of annotation flattening.
+
+ - Add new helper classes: ``QPDFOutlineDocumentHelper``,
+ ``QPDFOutlineObjectHelper``, ``QPDFPageLabelDocumentHelper``,
+ ``QPDFNameTreeObjectHelper``, and
+ ``QPDFNumberTreeObjectHelper``.
+
+ - Add method ``QPDFObjectHandle::getJSON()`` that returns a JSON
+ representation of the object. Call ``serialize()`` on the
+ result to convert it to a string.
+
+ - Add a simple JSON serializer. This is not a complete or
+ general-purpose JSON library. It allows assembly and
+ serialization of JSON structures with some restrictions, which
+ are described in the header file. This is the serializer used
+ by qpdf's new JSON representation.
+
+ - Add new ``QPDFObjectHandle::Matrix`` class along with a few
+ convenience methods for dealing with six-element numerical
+ arrays as matrices.
+
+ - Add new method ``QPDFObjectHandle::wrapInArray``, which returns
+ the object itself if it is an array, or an array containing the
+ object otherwise. This is a common construct in PDF. This
+ method prevents you from having to explicitly test whether
+ something is a single element or an array.
+
+ - Build Improvements
+
+ - It is no longer necessary to run
+ @1@command@1@autogen.sh@2@command@2@ to build from a pristine
+ checkout. Automatically generated files are now committed so
+ that it is possible to build on platforms without autoconf
+ directly from a clean checkout of the repository. The
+ @1@command@1@configure@2@command@2@ script detects if the files
+ are out of date when it also determines that the tools are
+ present to regenerate them.
+
+ - Pull requests and the master branch are now built automatically
+ in `Azure
+ Pipelines <https://dev.azure.com/qpdf/qpdf/_build>`__, which is
+ free for open source projects. The build includes Linux, mac,
+ Windows 32-bit and 64-bit with mingw and MSVC, and an AppImage
+ build. Official qpdf releases are now built with Azure
+ Pipelines.
+
+ - Notes for Packagers
+
+ - A new section has been added to the documentation with notes
+ for packagers. Please see `Notes for
+ Packagers <#ref.packaging>`__.
+
+ - The qpdf detects out-of-date automatically generated files. If
+ your packaging system automatically refreshes libtool or
+ autoconf files, it could cause this check to fail. To avoid
+ this problem, pass
+ @1@option@1@--disable-check-autofiles@2@option@2@ to
+ @1@command@1@configure@2@command@2@.
+
+ - If you would like to have qpdf completion enabled
+ automatically, you can install completion files in the
+ distribution's default location. You can find sample completion
+ files to install in the @1@filename@1@completions@2@filename@2@
+ directory.
+
+8.2.1: August 18, 2018
+ - Command-line Enhancements
+
+ - Add
+ @1@option@1@--keep-files-open=@1@replaceable@1@[yn]@2@replaceable@2@@2@option@2@
+ to override default determination of whether to keep files open
+ when merging. Please see the discussion of
+ @1@option@1@--keep-files-open@2@option@2@ in `Basic
+ Options <#ref.basic-options>`__ for additional details.
+
+8.2.0: August 16, 2018
+ - Command-line Enhancements
+
+ - Add @1@option@1@--no-warn@2@option@2@ option to suppress
+ issuing warning messages. If there are any conditions that
+ would have caused warnings to be issued, the exit status is
+ still 3.
+
+ - Bug Fixes and Optimizations
+
+ - Performance fix: optimize page merging operation to avoid
+ unnecessary open/close calls on files being merged. This solves
+ a dramatic slow-down that was observed when merging certain
+ types of files.
+
+ - Optimize how memory was used for the TIFF predictor,
+ drastically improving performance and memory usage for files
+ containing high-resolution images compressed with Flate using
+ the TIFF predictor.
+
+ - Bug fix: end of line characters were not properly handled
+ inside strings in some cases.
+
+ - Bug fix: using @1@option@1@--progress@2@option@2@ on very small
+ files could cause an infinite loop.
+
+ - API enhancements
+
+ - Add new class ``QPDFSystemError``, derived from
+ ``std::runtime_error``, which is now thrown by
+ ``QUtil::throw_system_error``. This enables the triggering
+ ``errno`` value to be retrieved.
+
+ - Add ``ClosedFileInputSource::stayOpen`` method, enabling a
+ ``ClosedFileInputSource`` to stay open during manually
+ indicated periods of high activity, thus reducing the overhead
+ of frequent open/close operations.
+
+ - Build Changes
+
+ - For the mingw builds, change the name of the DLL import library
+ from @1@filename@1@libqpdf.a@2@filename@2@ to
+ @1@filename@1@libqpdf.dll.a@2@filename@2@ to more accurately
+ reflect that it is an import library rather than a static
+ library. This potentially clears the way for supporting a
+ static library in the future, though presently, the qpdf
+ Windows build only builds the DLL and executables.
+
+8.1.0: June 23, 2018
+ - Usability Improvements
+
+ - When splitting files, qpdf detects fonts and images that the
+ document metadata claims are referenced from a page but are not
+ actually referenced and omits them from the output file. This
+ change can cause a significant reduction in the size of split
+ PDF files for files created by some software packages. In some
+ cases, it can also make page splitting slower. Prior versions
+ of qpdf would believe the document metadata and sometimes
+ include all the images from all the other pages even though the
+ pages were no longer present. In the unlikely event that the
+ old behavior should be desired, or if you have a case where
+ page splitting is very slow, the old behavior (and speed) can
+ be enabled by specifying
+ @1@option@1@--preserve-unreferenced-resources@2@option@2@. For
+ additional details, please see `Advanced Transformation
+ Options <#ref.advanced-transformation>`__.
+
+ - When merging multiple PDF files, qpdf no longer leaves all the
+ files open. This makes it possible to merge numbers of files
+ that may exceed the operating system's limit for the maximum
+ number of open files.
+
+ - The @1@option@1@--rotate@2@option@2@ option's syntax has been
+ extended to make the page range optional. If you specify
+ @1@option@1@--rotate=@1@replaceable@1@angle@2@replaceable@2@@2@option@2@
+ without specifying a page range, the rotation will be applied
+ to all pages. This can be especially useful for adjusting a PDF
+ created from a multi-page document that was scanned upside
+ down.
+
+ - When merging multiple files, the
+ @1@option@1@--verbose@2@option@2@ option now prints information
+ about each file as it operates on that file.
+
+ - When the @1@option@1@--progress@2@option@2@ option is
+ specified, qpdf will print a running indicator of its best
+ guess at how far through the writing process it is. Note that,
+ as with all progress meters, it's an approximation. This option
+ is implemented in a way that makes it useful for software that
+ uses the qpdf library; see API Enhancements below.
+
+ - Bug Fixes
+
+ - Properly decrypt files that use revision 3 of the standard
+ security handler but use 40 bit keys (even though revision 3
+ supports 128-bit keys).
+
+ - Limit depth of nested data structures to prevent crashes from
+ certain types of malformed (malicious) PDFs.
+
+ - In "newline before endstream" mode, insert the required extra
+ newline before the ``endstream`` at the end of object streams.
+ This one case was previously omitted.
+
+ - API Enhancements
+
+ - The first round of higher level "helper" interfaces has been
+ introduced. These are designed to provide a more convenient way
+ of interacting with certain document features than using
+ ``QPDFObjectHandle`` directly. For details on helpers, see
+ `Helper Classes <#ref.helper-classes>`__. Specific additional
+ interfaces are described below.
+
+ - Add two new document helper classes: ``QPDFPageDocumentHelper``
+ for working with pages, and ``QPDFAcroFormDocumentHelper`` for
+ working with interactive forms. No old methods have been
+ removed, but ``QPDFPageDocumentHelper`` is now the preferred
+ way to perform operations on pages rather than calling the old
+ methods in ``QPDFObjectHandle`` and ``QPDF`` directly. Comments
+ in the header files direct you to the new interfaces. Please
+ see the header files and @1@filename@1@ChangeLog@2@filename@2@
+ for additional details.
+
+ - Add three new object helper class: ``QPDFPageObjectHelper`` for
+ pages, ``QPDFFormFieldObjectHelper`` for interactive form
+ fields, and ``QPDFAnnotationObjectHelper`` for annotations. All
+ three classes are fairly sparse at the moment, but they have
+ some useful, basic functionality.
+
+ - A new example program
+ @1@filename@1@examples/pdf-set-form-values.cc@2@filename@2@ has
+ been added that illustrates use of the new document and object
+ helpers.
+
+ - The method ``QPDFWriter::registerProgressReporter`` has been
+ added. This method allows you to register a function that is
+ called by ``QPDFWriter`` to update your idea of the percentage
+ it thinks it is through writing its output. Client programs can
+ use this to implement reasonably accurate progress meters. The
+ @1@command@1@qpdf@2@command@2@ command line tool uses this to
+ implement its @1@option@1@--progress@2@option@2@ option.
+
+ - New methods ``QPDFObjectHandle::newUnicodeString`` and
+ ``QPDFObject::unparseBinary`` have been added to allow for more
+ convenient creation of strings that are explicitly encoded
+ using big-endian UTF-16. This is useful for creating strings
+ that appear outside of content streams, such as labels, form
+ fields, outlines, document metadata, etc.
+
+ - A new class ``QPDFObjectHandle::Rectangle`` has been added to
+ ease working with PDF rectangles, which are just arrays of four
+ numeric values.
+
+8.0.2: March 6, 2018
+ - When a loop is detected while following cross reference streams or
+ tables, treat this as damage instead of silently ignoring the
+ previous table. This prevents loss of otherwise recoverable data
+ in some damaged files.
+
+ - Properly handle pages with no contents.
+
+8.0.1: March 4, 2018
+ - Disregard data check errors when uncompressing ``/FlateDecode``
+ streams. This is consistent with most other PDF readers and allows
+ qpdf to recover data from another class of malformed PDF files.
+
+ - On the command line when specifying page ranges, support preceding
+ a page number by "r" to indicate that it should be counted from
+ the end. For example, the range ``r3-r1`` would indicate the last
+ three pages of a document.
+
+8.0.0: February 25, 2018
+ - Packaging and Distribution Changes
+
+ - QPDF is now distributed as an
+ `AppImage <https://appimage.org/>`__ in addition to all the
+ other ways it is distributed. The AppImage can be found in the
+ download area with the other packages. Thanks to Kurt Pfeifle
+ and Simon Peter for their contributions.
+
+ - Bug Fixes
+
+ - ``QPDFObjectHandle::getUTF8Val`` now properly treats
+ non-Unicode strings as encoded with PDF Doc Encoding.
+
+ - Improvements to handling of objects in PDF files that are not
+ of the expected type. In most cases, qpdf will be able to warn
+ for such cases rather than fail with an exception. Previous
+ versions of qpdf would sometimes fail with errors such as
+ "operation for dictionary object attempted on object of wrong
+ type". This situation should be mostly or entirely eliminated
+ now.
+
+ - Enhancements to the @1@command@1@qpdf@2@command@2@ Command-line
+ Tool. All new options listed here are documented in more detail in
+ `Running QPDF <#ref.using>`__.
+
+ - The option
+ @1@option@1@--linearize-pass1=@1@replaceable@1@file@2@replaceable@2@@2@option@2@
+ has been added for debugging qpdf's linearization code.
+
+ - The option @1@option@1@--coalesce-contents@2@option@2@ can be
+ used to combine content streams of a page whose contents are an
+ array of streams into a single stream.
+
+ - API Enhancements. All new API calls are documented in their
+ respective classes' header files. There are no non-compatible
+ changes to the API.
+
+ - Add function ``qpdf_check_pdf`` to the C API. This function
+ does basic checking that is a subset of what @1@command@1@qpdf
+ --check@2@command@2@ performs.
+
+ - Major enhancements to the lexical layer of qpdf. For a complete
+ list of enhancements, please refer to the
+ @1@filename@1@ChangeLog@2@filename@2@ file. Most of the changes
+ result in improvements to qpdf's ability handle erroneous
+ files. It is also possible for programs to handle whitespace,
+ comments, and inline images as tokens.
+
+ - New API for working with PDF content streams at a lexical
+ level. The new class ``QPDFObjectHandle::TokenFilter`` allows
+ the developer to provide token handlers. Token filters can be
+ used with several different methods in ``QPDFObjectHandle`` as
+ well as with a lower-level interface. See comments in
+ @1@filename@1@QPDFObjectHandle.hh@2@filename@2@ as well as the
+ new examples
+ @1@filename@1@examples/pdf-filter-tokens.cc@2@filename@2@ and
+ @1@filename@1@examples/pdf-count-strings.cc@2@filename@2@ for
+ details.
+
+7.1.1: February 4, 2018
+ - Bug fix: files whose /ID fields were other than 16 bytes long can
+ now be properly linearized
+
+ - A few compile and link issues have been corrected for some
+ platforms.
+
+7.1.0: January 14, 2018
+ - PDF files contain streams that may be compressed with various
+ compression algorithms which, in some cases, may be enhanced by
+ various predictor functions. Previously only the PNG up predictor
+ was supported. In this version, all the PNG predictors as well as
+ the TIFF predictor are supported. This increases the range of
+ files that qpdf is able to handle.
+
+ - QPDF now allows a raw encryption key to be specified in place of a
+ password when opening encrypted files, and will optionally display
+ the encryption key used by a file. This is a non-standard
+ operation, but it can be useful in certain situations. Please see
+ the discussion of @1@option@1@--password-is-hex-key@2@option@2@ in
+ `Basic Options <#ref.basic-options>`__ or the comments around
+ ``QPDF::setPasswordIsHexKey`` in
+ @1@filename@1@QPDF.hh@2@filename@2@ for additional details.
+
+ - Bug fix: numbers ending with a trailing decimal point are now
+ properly recognized as numbers.
+
+ - Bug fix: when building qpdf from source on some platforms
+ (especially MacOS), the build could get confused by older versions
+ of qpdf installed on the system. This has been corrected.
+
+7.0.0: September 15, 2017
+ - Packaging and Distribution Changes
+
+ - QPDF's primary license is now `version 2.0 of the Apache
+ License <http://www.apache.org/licenses/LICENSE-2.0>`__ rather
+ than version 2.0 of the Artistic License. You may still, at
+ your option, consider qpdf to be licensed with version 2.0 of
+ the Artistic license.
+
+ - QPDF no longer has a dependency on the PCRE (Perl-Compatible
+ Regular Expression) library. QPDF now has an added dependency
+ on the JPEG library.
+
+ - Bug Fixes
+
+ - This release contains many bug fixes for various infinite
+ loops, memory leaks, and other memory errors that could be
+ encountered with specially crafted or otherwise erroneous PDF
+ files.
+
+ - New Features
+
+ - QPDF now supports reading and writing streams encoded with JPEG
+ or RunLength encoding. Library API enhancements and
+ command-line options have been added to control this behavior.
+ See command-line options
+ @1@option@1@--compress-streams@2@option@2@ and
+ @1@option@1@--decode-level@2@option@2@ and methods
+ ``QPDFWriter::setCompressStreams`` and
+ ``QPDFWriter::setDecodeLevel``.
+
+ - QPDF is much better at recovering from broken files. In most
+ cases, qpdf will skip invalid objects and will preserve broken
+ stream data by not attempting to filter broken streams. QPDF is
+ now able to recover or at least not crash on dozens of broken
+ test files I have received over the past few years.
+
+ - Page rotation is now supported and accessible from both the
+ library and the command line.
+
+ - ``QPDFWriter`` supports writing files in a way that preserves
+ PCLm compliance in support of driverless printing. This is very
+ specialized and is only useful to applications that already
+ know how to create PCLm files.
+
+ - Enhancements to the @1@command@1@qpdf@2@command@2@ Command-line
+ Tool. All new options listed here are documented in more detail in
+ `Running QPDF <#ref.using>`__.
+
+ - Command-line arguments can now be read from files or standard
+ input using ``@file`` or ``@-`` syntax. Please see `Basic
+ Invocation <#ref.invocation>`__.
+
+ - @1@option@1@--rotate@2@option@2@: request page rotation
+
+ - @1@option@1@--newline-before-endstream@2@option@2@: ensure that
+ a newline appears before every ``endstream`` keyword in the
+ file; used to prevent qpdf from breaking PDF/A compliance on
+ already compliant files.
+
+ - @1@option@1@--preserve-unreferenced@2@option@2@: preserve
+ unreferenced objects in the input PDF
+
+ - @1@option@1@--split-pages@2@option@2@: break output into chunks
+ with fixed numbers of pages
+
+ - @1@option@1@--verbose@2@option@2@: print the name of each
+ output file that is created
+
+ - @1@option@1@--compress-streams@2@option@2@ and
+ @1@option@1@--decode-level@2@option@2@ replace
+ @1@option@1@--stream-data@2@option@2@ for improving granularity
+ of controlling compression and decompression of stream data.
+ The @1@option@1@--stream-data@2@option@2@ option will remain
+ available.
+
+ - When running @1@command@1@qpdf --check@2@command@2@ with other
+ options, checks are always run first. This enables qpdf to
+ perform its full recovery logic before outputting other
+ information. This can be especially useful when manually
+ recovering broken files, looking at qpdf's regenerated cross
+ reference table, or other similar operations.
+
+ - Process @1@command@1@--pages@2@command@2@ earlier so that other
+ options like @1@option@1@--show-pages@2@option@2@ or
+ @1@option@1@--split-pages@2@option@2@ can operate on the file
+ after page splitting/merging has occurred.
+
+ - API Changes. All new API calls are documented in their respective
+ classes' header files.
+
+ - ``QPDFObjectHandle::rotatePage``: apply rotation to a page
+ object
+
+ - ``QPDFWriter::setNewlineBeforeEndstream``: force newline to
+ appear before ``endstream``
+
+ - ``QPDFWriter::setPreserveUnreferencedObjects``: preserve
+ unreferenced objects that appear in the input PDF. The default
+ behavior is to discard them.
+
+ - New ``Pipeline`` types ``Pl_RunLength`` and ``Pl_DCT`` are
+ available for developers who wish to produce or consume
+ RunLength or DCT stream data directly. The
+ @1@filename@1@examples/pdf-create.cc@2@filename@2@ example
+ illustrates their use.
+
+ - ``QPDFWriter::setCompressStreams`` and
+ ``QPDFWriter::setDecodeLevel`` methods control handling of
+ different types of stream compression.
+
+ - Add new C API functions ``qpdf_set_compress_streams``,
+ ``qpdf_set_decode_level``,
+ ``qpdf_set_preserve_unreferenced_objects``, and
+ ``qpdf_set_newline_before_endstream`` corresponding to the new
+ ``QPDFWriter`` methods.
+
+6.0.0: November 10, 2015
+ - Implement @1@option@1@--deterministic-id@2@option@2@ command-line
+ option and ``QPDFWriter::setDeterministicID`` as well as C API
+ function ``qpdf_set_deterministic_ID`` for generating a
+ deterministic ID for non-encrypted files. When this option is
+ selected, the ID of the file depends on the contents of the output
+ file, and not on transient items such as the timestamp or output
+ file name.
+
+ - Make qpdf more tolerant of files whose xref table entries are not
+ the correct length.
+
+5.1.3: May 24, 2015
+ - Bug fix: fix-qdf was not properly handling files that contained
+ object streams with more than 255 objects in them.
+
+ - Bug fix: qpdf was not properly initializing Microsoft's secure
+ crypto provider on fresh Windows installations that had not had
+ any keys created yet.
+
+ - Fix a few errors found by Gynvael Coldwind and Mateusz Jurczyk of
+ the Google Security Team. Please see the ChangeLog for details.
+
+ - Properly handle pages that have no contents at all. There were
+ many cases in which qpdf handled this fine, but a few methods
+ blindly obtained page contents with handling the possibility that
+ there were no contents.
+
+ - Make qpdf more robust for a few more kinds of problems that may
+ occur in invalid PDF files.
+
+5.1.2: June 7, 2014
+ - Bug fix: linearizing files could create a corrupted output file
+ under extremely unlikely file size circumstances. See ChangeLog
+ for details. The odds of getting hit by this are very low, though
+ one person did.
+
+ - Bug fix: qpdf would fail to write files that had streams with
+ decode parameters referencing other streams.
+
+ - New example program: @1@command@1@pdf-split-pages@2@command@2@:
+ efficiently split PDF files into individual pages. The example
+ program does this more efficiently than using @1@command@1@qpdf
+ --pages@2@command@2@ to do it.
+
+ - Packaging fix: Visual C++ binaries did not support Windows XP.
+ This has been rectified by updating the compilers used to generate
+ the release binaries.
+
+5.1.1: January 14, 2014
+ - Performance fix: copying foreign objects could be very slow with
+ certain types of files. This was most likely to be visible during
+ page splitting and was due to traversing the same objects multiple
+ times in some cases.
+
+5.1.0: December 17, 2013
+ - Added runtime option (``QUtil::setRandomDataProvider``) to supply
+ your own random data provider. You can use this if you want to
+ avoid using the OS-provided secure random number generation
+ facility or stdlib's less secure version. See comments in
+ include/qpdf/QUtil.hh for details.
+
+ - Fixed image comparison tests to not create 12-bit-per-pixel images
+ since some versions of tiffcmp have bugs in comparing them in some
+ cases. This increases the disk space required by the image
+ comparison tests, which are off by default anyway.
+
+ - Introduce a number of small fixes for compilation on the latest
+ clang in MacOS and the latest Visual C++ in Windows.
+
+ - Be able to handle broken files that end the xref table header with
+ a space instead of a newline.
+
+5.0.1: October 18, 2013
+ - Thanks to a detailed review by Florian Weimer and the Red Hat
+ Product Security Team, this release includes a number of
+ non-user-visible security hardening changes. Please see the
+ ChangeLog file in the source distribution for the complete list.
+
+ - When available, operating system-specific secure random number
+ generation is used for generating initialization vectors and other
+ random values used during encryption or file creation. For the
+ Windows build, this results in an added dependency on Microsoft's
+ cryptography API. To disable the OS-specific cryptography and use
+ the old version, pass the
+ @1@option@1@--enable-insecure-random@2@option@2@ option to
+ @1@command@1@./configure@2@command@2@.
+
+ - The @1@command@1@qpdf@2@command@2@ command-line tool now issues a
+ warning when @1@option@1@-accessibility=n@2@option@2@ is specified
+ for newer encryption versions stating that the option is ignored.
+ qpdf, per the spec, has always ignored this flag, but it
+ previously did so silently. This warning is issued only by the
+ command-line tool, not by the library. The library's handling of
+ this flag is unchanged.
+
+5.0.0: July 10, 2013
+ - Bug fix: previous versions of qpdf would lose objects with
+ generation != 0 when generating object streams. Fixing this
+ required changes to the public API.
+
+ - Removed methods from public API that were only supposed to be
+ called by QPDFWriter and couldn't realistically be called anywhere
+ else. See ChangeLog for details.
+
+ - New ``QPDFObjGen`` class added to represent an object
+ ID/generation pair. ``QPDFObjectHandle::getObjGen()`` is now
+ preferred over ``QPDFObjectHandle::getObjectID()`` and
+ ``QPDFObjectHandle::getGeneration()`` as it makes it less likely
+ for people to accidentally write code that ignores the generation
+ number. See @1@filename@1@QPDF.hh@2@filename@2@ and
+ @1@filename@1@QPDFObjectHandle.hh@2@filename@2@ for additional
+ notes.
+
+ - Add @1@option@1@--show-npages@2@option@2@ command-line option to
+ the @1@command@1@qpdf@2@command@2@ command to show the number of
+ pages in a file.
+
+ - Allow omission of the page range within
+ @1@option@1@--pages@2@option@2@ for the
+ @1@command@1@qpdf@2@command@2@ command. When omitted, the page
+ range is implicitly taken to be all the pages in the file.
+
+ - Various enhancements were made to support different types of
+ broken files or broken readers. Details can be found in
+ @1@filename@1@ChangeLog@2@filename@2@.
+
+4.1.0: April 14, 2013
+ - Note to people including qpdf in distributions: the
+ @1@filename@1@.la@2@filename@2@ files generated by libtool are now
+ installed by qpdf's @1@command@1@make install@2@command@2@ target.
+ Before, they were not installed. This means that if your
+ distribution does not want to include
+ @1@filename@1@.la@2@filename@2@ files, you must remove them as
+ part of your packaging process.
+
+ - Major enhancement: API enhancements have been made to support
+ parsing of content streams. This enhancement includes the
+ following changes:
+
+ - ``QPDFObjectHandle::parseContentStream`` method parses objects
+ in a content stream and calls handlers in a callback class. The
+ example
+ @1@filename@1@examples/pdf-parse-content.cc@2@filename@2@
+ illustrates how this may be used.
+
+ - ``QPDFObjectHandle`` can now represent operators and inline
+ images, object types that may only appear in content streams.
+
+ - Method ``QPDFObjectHandle::getTypeCode()`` returns an
+ enumerated type value representing the underlying object type.
+ Method ``QPDFObjectHandle::getTypeName()`` returns a text
+ string describing the name of the type of a
+ ``QPDFObjectHandle`` object. These methods can be used for more
+ efficient parsing and debugging/diagnostic messages.
+
+ - @1@command@1@qpdf --check@2@command@2@ now parses all pages'
+ content streams in addition to doing other checks. While there are
+ still many types of errors that cannot be detected, syntactic
+ errors in content streams will now be reported.
+
+ - Minor compilation enhancements have been made to facilitate easier
+ for support for a broader range of compilers and compiler
+ versions.
+
+ - Warning flags have been moved into a separate variable in
+ @1@filename@1@autoconf.mk@2@filename@2@
+
+ - The configure flag @1@option@1@--enable-werror@2@option@2@ work
+ for Microsoft compilers
+
+ - All MSVC CRT security warnings have been resolved.
+
+ - All C-style casts in C++ Code have been replaced by C++ casts,
+ and many casts that had been included to suppress higher
+ warning levels for some compilers have been removed, primarily
+ for clarity. Places where integer type coercion occurs have
+ been scrutinized. A new casting policy has been documented in
+ the manual. This is of concern mainly to people porting qpdf to
+ new platforms or compilers. It is not visible to programmers
+ writing code that uses the library
+
+ - Some internal limits have been removed in code that converts
+ numbers to strings. This is largely invisible to users, but it
+ does trigger a bug in some older versions of mingw-w64's C++
+ library. See @1@filename@1@README-windows.md@2@filename@2@ in
+ the source distribution if you think this may affect you. The
+ copy of the DLL distributed with qpdf's binary distribution is
+ not affected by this problem.
+
+ - The RPM spec file previously included with qpdf has been removed.
+ This is because virtually all Linux distributions include qpdf now
+ that it is a dependency of CUPS filters.
+
+ - A few bug fixes are included:
+
+ - Overridden compressed objects are properly handled. Before,
+ there were certain constructs that could cause qpdf to see old
+ versions of some objects. The most usual manifestation of this
+ was loss of filled in form values for certain files.
+
+ - Installation no longer uses GNU/Linux-specific versions of some
+ commands, so @1@command@1@make install@2@command@2@ works on
+ Solaris with native tools.
+
+ - The 64-bit mingw Windows binary package no longer includes a
+ 32-bit DLL.
+
+4.0.1: January 17, 2013
+ - Fix detection of binary attachments in test suite to avoid false
+ test failures on some platforms.
+
+ - Add clarifying comment in @1@filename@1@QPDF.hh@2@filename@2@ to
+ methods that return the user password explaining that it is no
+ longer possible with newer encryption formats to recover the user
+ password knowing the owner password. In earlier encryption
+ formats, the user password was encrypted in the file using the
+ owner password. In newer encryption formats, a separate encryption
+ key is used on the file, and that key is independently encrypted
+ using both the user password and the owner password.
+
+4.0.0: December 31, 2012
+ - Major enhancement: support has been added for newer encryption
+ schemes supported by version X of Adobe Acrobat. This includes use
+ of 127-character passwords, 256-bit encryption keys, and the
+ encryption scheme specified in ISO 32000-2, the PDF 2.0
+ specification. This scheme can be chosen from the command line by
+ specifying use of 256-bit keys. qpdf also supports the deprecated
+ encryption method used by Acrobat IX. This encryption style has
+ known security weaknesses and should not be used in practice.
+ However, such files exist "in the wild," so support for this
+ scheme is still useful. New methods
+ ``QPDFWriter::setR6EncryptionParameters`` (for the PDF 2.0 scheme)
+ and ``QPDFWriter::setR5EncryptionParameters`` (for the deprecated
+ scheme) have been added to enable these new encryption schemes.
+ Corresponding functions have been added to the C API as well.
+
+ - Full support for Adobe extension levels in PDF version
+ information. Starting with PDF version 1.7, corresponding to ISO
+ 32000, Adobe adds new functionality by increasing the extension
+ level rather than increasing the version. This support includes
+ addition of the ``QPDF::getExtensionLevel`` method for retrieving
+ the document's extension level, addition of versions of
+ ``QPDFWriter::setMinimumPDFVersion`` and
+ ``QPDFWriter::forcePDFVersion`` that accept an extension level,
+ and extended syntax for specifying forced and minimum versions on
+ the command line as described in `Advanced Transformation
+ Options <#ref.advanced-transformation>`__. Corresponding functions
+ have been added to the C API as well.
+
+ - Minor fixes to prevent qpdf from referencing objects in the file
+ that are not referenced in the file's overall structure. Most
+ files don't have any such objects, but some files have contain
+ unreferenced objects with errors, so these fixes prevent qpdf from
+ needlessly rejecting or complaining about such objects.
+
+ - Add new generalized methods for reading and writing files from/to
+ programmer-defined sources. The method
+ ``QPDF::processInputSource`` allows the programmer to use any
+ input source for the input file, and
+ ``QPDFWriter::setOutputPipeline`` allows the programmer to write
+ the output file through any pipeline. These methods would make it
+ possible to perform any number of specialized operations, such as
+ accessing external storage systems, creating bindings for qpdf in
+ other programming languages that have their own I/O systems, etc.
+
+ - Add new method ``QPDF::getEncryptionKey`` for retrieving the
+ underlying encryption key used in the file.
+
+ - This release includes a small handful of non-compatible API
+ changes. While effort is made to avoid such changes, all the
+ non-compatible API changes in this version were to parts of the
+ API that would likely never be used outside the library itself. In
+ all cases, the altered methods or structures were parts of the
+ ``QPDF`` that were public to enable them to be called from either
+ ``QPDFWriter`` or were part of validation code that was
+ over-zealous in reporting problems in parts of the file that would
+ not ordinarily be referenced. In no case did any of the removed
+ methods do anything worse that falsely report error conditions in
+ files that were broken in ways that didn't matter. The following
+ public parts of the ``QPDF`` class were changed in a
+ non-compatible way:
+
+ - Updated nested ``QPDF::EncryptionData`` class to add fields
+ needed by the newer encryption formats, member variables
+ changed to private so that future changes will not require
+ breaking backward compatibility.
+
+ - Added additional parameters to ``compute_data_key``, which is
+ used by ``QPDFWriter`` to compute the encryption key used to
+ encrypt a specific object.
+
+ - Removed the method ``flattenScalarReferences``. This method was
+ previously used prior to writing a new PDF file, but it has the
+ undesired side effect of causing qpdf to read objects in the
+ file that were not referenced. Some otherwise files have
+ unreferenced objects with errors in them, so this could cause
+ qpdf to reject files that would be accepted by virtually all
+ other PDF readers. In fact, qpdf relied on only a very small
+ part of what flattenScalarReferences did, so only this part has
+ been preserved, and it is now done directly inside
+ ``QPDFWriter``.
+
+ - Removed the method ``decodeStreams``. This method was used by
+ the @1@option@1@--check@2@option@2@ option of the
+ @1@command@1@qpdf@2@command@2@ command-line tool to force all
+ streams in the file to be decoded, but it also suffered from
+ the problem of opening otherwise unreferenced streams and thus
+ could report false positive. The
+ @1@option@1@--check@2@option@2@ option now causes qpdf to go
+ through all the motions of writing a new file based on the
+ original one, so it will always reference and check exactly
+ those parts of a file that any ordinary viewer would check.
+
+ - Removed the method ``trimTrailerForWrite``. This method was
+ used by ``QPDFWriter`` to modify the original QPDF object by
+ removing fields from the trailer dictionary that wouldn't apply
+ to the newly written file. This functionality, though generally
+ harmless, was a poor implementation and has been replaced by
+ having QPDFWriter filter these out when copying the trailer
+ rather than modifying the original QPDF object. (Note that qpdf
+ never modifies the original file itself.)
+
+ - Allow the PDF header to appear anywhere in the first 1024 bytes of
+ the file. This is consistent with what other readers do.
+
+ - Fix the @1@command@1@pkg-config@2@command@2@ files to list zlib
+ and pcre in ``Requires.private`` to better support static linking
+ using @1@command@1@pkg-config@2@command@2@.
+
+3.0.2: September 6, 2012
+ - Bug fix: ``QPDFWriter::setOutputMemory`` did not work when not
+ used with ``QPDFWriter::setStaticID``, which made it pretty much
+ useless. This has been fixed.
+
+ - New API call ``QPDFWriter::setExtraHeaderText`` inserts additional
+ text near the header of the PDF file. The intended use case is to
+ insert comments that may be consumed by a downstream application,
+ though other use cases may exist.
+
+3.0.1: August 11, 2012
+ - Version 3.0.0 included addition of files for
+ @1@command@1@pkg-config@2@command@2@, but this was not mentioned
+ in the release notes. The release notes for 3.0.0 were updated to
+ mention this.
+
+ - Bug fix: if an object stream ended with a scalar object not
+ followed by space, qpdf would incorrectly report that it
+ encountered a premature EOF. This bug has been in qpdf since
+ version 2.0.
+
+3.0.0: August 2, 2012
+ - Acknowledgment: I would like to express gratitude for the
+ contributions of Tobias Hoffmann toward the release of qpdf
+ version 3.0. He is responsible for most of the implementation and
+ design of the new API for manipulating pages, and contributed code
+ and ideas for many of the improvements made in version 3.0.
+ Without his work, this release would certainly not have happened
+ as soon as it did, if at all.
+
+ - *Non-compatible API change:* The version of
+ ``QPDFObjectHandle::replaceStreamData`` that uses a
+ ``StreamDataProvider`` no longer requires (or accepts) a
+ ``length`` parameter. See
+ `appendix_title <#ref.upgrading-to-3.0>`__ for an explanation.
+ While care is taken to avoid non-compatible API changes in
+ general, an exception was made this time because the new interface
+ offers an opportunity to significantly simplify calling code.
+
+ - Support has been added for large files. The test suite verifies
+ support for files larger than 4 gigabytes, and manual testing has
+ verified support for files larger than 10 gigabytes. Large file
+ support is available for both 32-bit and 64-bit platforms as long
+ as the compiler and underlying platforms support it.
+
+ - Support for page selection (splitting and merging PDF files) has
+ been added to the @1@command@1@qpdf@2@command@2@ command-line
+ tool. See `Page Selection Options <#ref.page-selection>`__.
+
+ - Options have been added to the @1@command@1@qpdf@2@command@2@
+ command-line tool for copying encryption parameters from another
+ file. See `Basic Options <#ref.basic-options>`__.
+
+ - New methods have been added to the ``QPDF`` object for adding and
+ removing pages. See `Adding and Removing
+ Pages <#ref.adding-and-remove-pages>`__.
+
+ - New methods have been added to the ``QPDF`` object for copying
+ objects from other PDF files. See `Copying Objects From Other PDF
+ Files <#ref.foreign-objects>`__
+
+ - A new method ``QPDFObjectHandle::parse`` has been added for
+ constructing ``QPDFObjectHandle`` objects from a string
+ description.
+
+ - Methods have been added to ``QPDFWriter`` to allow writing to an
+ already open stdio ``FILE*`` addition to writing to standard
+ output or a named file. Methods have been added to ``QPDF`` to be
+ able to process a file from an already open stdio ``FILE*``. This
+ makes it possible to read and write PDF from secure temporary
+ files that have been unlinked prior to being fully read or
+ written.
+
+ - The ``QPDF::emptyPDF`` can be used to allow creation of PDF files
+ from scratch. The example
+ @1@filename@1@examples/pdf-create.cc@2@filename@2@ illustrates how
+ it can be used.
+
+ - Several methods to take ``PointerHolder<Buffer>`` can now also
+ accept ``std::string`` arguments.
+
+ - Many new convenience methods have been added to the library, most
+ in ``QPDFObjectHandle``. See @1@filename@1@ChangeLog@2@filename@2@
+ for a full list.
+
+ - When building on a platform that supports ELF shared libraries
+ (such as Linux), symbol versions are enabled by default. They can
+ be disabled by passing
+ @1@option@1@--disable-ld-version-script@2@option@2@ to
+ @1@command@1@./configure@2@command@2@.
+
+ - The file @1@filename@1@libqpdf.pc@2@filename@2@ is now installed
+ to support @1@command@1@pkg-config@2@command@2@.
+
+ - Image comparison tests are off by default now since they are not
+ needed to verify a correct build or port of qpdf. They are needed
+ only when changing the actual PDF output generated by qpdf. You
+ should enable them if you are making deep changes to qpdf itself.
+ See @1@filename@1@README.md@2@filename@2@ for details.
+
+ - Large file tests are off by default but can be turned on with
+ @1@command@1@./configure@2@command@2@ or by setting an environment
+ variable before running the test suite. See
+ @1@filename@1@README.md@2@filename@2@ for details.
+
+ - When qpdf's test suite fails, failures are not printed to the
+ terminal anymore by default. Instead, find them in
+ @1@filename@1@build/qtest.log@2@filename@2@. For packagers who are
+ building with an autobuilder, you can add the
+ @1@option@1@--enable-show-failed-test-output@2@option@2@ option to
+ @1@command@1@./configure@2@command@2@ to restore the old behavior.
+
+2.3.1: December 28, 2011
+ - Fix thread-safety problem resulting from non-thread-safe use of
+ the PCRE library.
+
+ - Made a few minor documentation fixes.
+
+ - Add workaround for a bug that appears in some versions of
+ ghostscript to the test suite
+
+ - Fix minor build issue for Visual C++ 2010.
+
+2.3.0: August 11, 2011
+ - Bug fix: when preserving existing encryption on encrypted files
+ with cleartext metadata, older qpdf versions would generate
+ password-protected files with no valid password. This operation
+ now works. This bug only affected files created by copying
+ existing encryption parameters; explicit encryption with
+ specification of cleartext metadata worked before and continues to
+ work.
+
+ - Enhance ``QPDFWriter`` with a new constructor that allows you to
+ delay the specification of the output file. When using this
+ constructor, you may now call ``QPDFWriter::setOutputFilename`` to
+ specify the output file, or you may use
+ ``QPDFWriter::setOutputMemory`` to cause ``QPDFWriter`` to write
+ the resulting PDF file to a memory buffer. You may then use
+ ``QPDFWriter::getBuffer`` to retrieve the memory buffer.
+
+ - Add new API call ``QPDF::replaceObject`` for replacing objects by
+ object ID
+
+ - Add new API call ``QPDF::swapObjects`` for swapping two objects by
+ object ID
+
+ - Add ``QPDFObjectHandle::getDictAsMap`` and
+ ``QPDFObjectHandle::getArrayAsVector`` to allow retrieval of
+ dictionary objects as maps and array objects as vectors.
+
+ - Add functions ``qpdf_get_info_key`` and ``qpdf_set_info_key`` to
+ the C API for manipulating string fields of the document's
+ ``/Info`` dictionary.
+
+ - Add functions ``qpdf_init_write_memory``,
+ ``qpdf_get_buffer_length``, and ``qpdf_get_buffer`` to the C API
+ for writing PDF files to a memory buffer instead of a file.
+
+2.2.4: June 25, 2011
+ - Fix installation and compilation issues; no functionality changes.
+
+2.2.3: April 30, 2011
+ - Handle some damaged streams with incorrect characters following
+ the stream keyword.
+
+ - Improve handling of inline images when normalizing content
+ streams.
+
+ - Enhance error recovery to properly handle files that use object 0
+ as a regular object, which is specifically disallowed by the spec.
+
+2.2.2: October 4, 2010
+ - Add new function ``qpdf_read_memory`` to the C API to call
+ ``QPDF::processMemoryFile``. This was an omission in qpdf 2.2.1.
+
+2.2.1: October 1, 2010
+ - Add new method ``QPDF::setOutputStreams`` to replace ``std::cout``
+ and ``std::cerr`` with other streams for generation of diagnostic
+ messages and error messages. This can be useful for GUIs or other
+ applications that want to capture any output generated by the
+ library to present to the user in some other way. Note that QPDF
+ does not write to ``std::cout`` (or the specified output stream)
+ except where explicitly mentioned in
+ @1@filename@1@QPDF.hh@2@filename@2@, and that the only use of the
+ error stream is for warnings. Note also that output of warnings is
+ suppressed when ``setSuppressWarnings(true)`` is called.
+
+ - Add new method ``QPDF::processMemoryFile`` for operating on PDF
+ files that are loaded into memory rather than in a file on disk.
+
+ - Give a warning but otherwise ignore empty PDF objects by treating
+ them as null. Empty object are not permitted by the PDF
+ specification but have been known to appear in some actual PDF
+ files.
+
+ - Handle inline image filter abbreviations when the appear as stream
+ filter abbreviations. The PDF specification does not allow use of
+ stream filter abbreviations in this way, but Adobe Reader and some
+ other PDF readers accept them since they sometimes appear
+ incorrectly in actual PDF files.
+
+ - Implement miscellaneous enhancements to ``PointerHolder`` and
+ ``Buffer`` to support other changes.
+
+2.2.0: August 14, 2010
+ - Add new methods to ``QPDFObjectHandle`` (``newStream`` and
+ ``replaceStreamData`` for creating new streams and replacing
+ stream data. This makes it possible to perform a wide range of
+ operations that were not previously possible.
+
+ - Add new helper method in ``QPDFObjectHandle``
+ (``addPageContents``) for appending or prepending new content
+ streams to a page. This method makes it possible to manipulate
+ content streams without having to be concerned whether a page's
+ contents are a single stream or an array of streams.
+
+ - Add new method in ``QPDFObjectHandle``: ``replaceOrRemoveKey``,
+ which replaces a dictionary key with a given value unless the
+ value is null, in which case it removes the key instead.
+
+ - Add new method in ``QPDFObjectHandle``: ``getRawStreamData``,
+ which returns the raw (unfiltered) stream data into a buffer. This
+ complements the ``getStreamData`` method, which returns the
+ filtered (uncompressed) stream data and can only be used when the
+ stream's data is filterable.
+
+ - Provide two new examples:
+ @1@command@1@pdf-double-page-size@2@command@2@ and
+ @1@command@1@pdf-invert-images@2@command@2@ that illustrate the
+ newly added interfaces.
+
+ - Fix a memory leak that would cause loss of a few bytes for every
+ object involved in a cycle of object references. Thanks to Jian Ma
+ for calling my attention to the leak.
+
+2.1.5: April 25, 2010
+ - Remove restriction of file identifier strings to 16 bytes. This
+ unnecessary restriction was preventing qpdf from being able to
+ encrypt or decrypt files with identifier strings that were not
+ exactly 16 bytes long. The specification imposes no such
+ restriction.
+
+2.1.4: April 18, 2010
+ - Apply the same padding calculation fix from version 2.1.2 to the
+ main cross reference stream as well.
+
+ - Since @1@command@1@qpdf --check@2@command@2@ only performs limited
+ checks, clarify the output to make it clear that there still may
+ be errors that qpdf can't check. This should make it less
+ surprising to people when another PDF reader is unable to read a
+ file that qpdf thinks is okay.
+
+2.1.3: March 27, 2010
+ - Fix bug that could cause a failure when rewriting PDF files that
+ contain object streams with unreferenced objects that in turn
+ reference indirect scalars.
+
+ - Don't complain about (invalid) AES streams that aren't a multiple
+ of 16 bytes. Instead, pad them before decrypting.
+
+2.1.2: January 24, 2010
+ - Fix bug in padding around first half cross reference stream in
+ linearized files. The bug could cause an assertion failure when
+ linearizing certain unlucky files.
+
+2.1.1: December 14, 2009
+ - No changes in functionality; insert missing include in an internal
+ library header file to support gcc 4.4, and update test suite to
+ ignore broken Adobe Reader installations.
+
+2.1: October 30, 2009
+ - This is the first version of qpdf to include Windows support. On
+ Windows, it is possible to build a DLL. Additionally, a partial
+ C-language API has been introduced, which makes it possible to
+ call qpdf functions from non-C++ environments. I am very grateful
+ to Žarko Gajić (http://zarko-gajic.iz.hr/) for tirelessly testing
+ numerous pre-release versions of this DLL and providing many
+ excellent suggestions on improving the interface.
+
+ For programming to the C interface, please see the header file
+ @1@filename@1@qpdf/qpdf-c.h@2@filename@2@ and the example
+ @1@filename@1@examples/pdf-linearize.c@2@filename@2@.
+
+ - Žarko Gajić has written a Delphi wrapper for qpdf, which can be
+ downloaded from qpdf's download side. Žarko's Delphi wrapper is
+ released with the same licensing terms as qpdf itself and comes
+ with this disclaimer: "Delphi wrapper unit
+ @1@filename@1@qpdf.pas@2@filename@2@ created by Žarko Gajić
+ (http://zarko-gajic.iz.hr/). Use at your own risk and for whatever
+ purpose you want. No support is provided. Sample code is
+ provided."
+
+ - Support has been added for AES encryption and crypt filters.
+ Although qpdf does not presently support files that use PKI-based
+ encryption, with the addition of AES and crypt filters, qpdf is
+ now be able to open most encrypted files created with newer
+ versions of Acrobat or other PDF creation software. Note that I
+ have not been able to get very many files encrypted in this way,
+ so it's possible there could still be some cases that qpdf can't
+ handle. Please report them if you find them.
+
+ - Many error messages have been improved to include more information
+ in hopes of making qpdf a more useful tool for PDF experts to use
+ in manually recovering damaged PDF files.
+
+ - Attempt to avoid compressing metadata streams if possible. This is
+ consistent with other PDF creation applications.
+
+ - Provide new command-line options for AES encrypt, cleartext
+ metadata, and setting the minimum and forced PDF versions of
+ output files.
+
+ - Add additional methods to the ``QPDF`` object for querying the
+ document's permissions. Although qpdf does not enforce these
+ permissions, it does make them available so that applications that
+ use qpdf can enforce permissions.
+
+ - The @1@option@1@--check@2@option@2@ option to
+ @1@command@1@qpdf@2@command@2@ has been extended to include some
+ additional information.
+
+ - There have been a handful of non-compatible API changes. For
+ details, see `appendix_title <#ref.upgrading-to-2.1>`__.
+
+2.0.6: May 3, 2009
+ - Do not attempt to uncompress streams that have decode parameters
+ we don't recognize. Earlier versions of qpdf would have rejected
+ files with such streams.
+
+2.0.5: March 10, 2009
+ - Improve error handling in the LZW decoder, and fix a small error
+ introduced in the previous version with regard to handling full
+ tables. The LZW decoder has been more strongly verified in this
+ release.
+
+2.0.4: February 21, 2009
+ - Include proper support for LZW streams encoded without the "early
+ code change" flag. Special thanks to Atom Smasher who reported the
+ problem and provided an input file compressed in this way, which I
+ did not previously have.
+
+ - Implement some improvements to file recovery logic.
+
+2.0.3: February 15, 2009
+ - Compile cleanly with gcc 4.4.
+
+ - Handle strings encoded as UTF-16BE properly.
+
+2.0.2: June 30, 2008
+ - Update test suite to work properly with a
+ non-@1@command@1@bash@2@command@2@
+ @1@filename@1@/bin/sh@2@filename@2@ and with Perl 5.10. No changes
+ were made to the actual qpdf source code itself for this release.
+
+2.0.1: May 6, 2008
+ - No changes in functionality or interface. This release includes
+ fixes to the source code so that qpdf compiles properly and passes
+ its test suite on a broader range of platforms. See
+ @1@filename@1@ChangeLog@2@filename@2@ in the source distribution
+ for details.
+
+2.0: April 29, 2008
+ - First public release.
+
+.. _ref.upgrading-to-2.1:
+
+Upgrading from 2.0 to 2.1
+=========================
+
+Although, as a general rule, we like to avoid introducing source-level
+incompatibilities in qpdf's interface, there were a few non-compatible
+changes made in this version. A considerable amount of source code that
+uses qpdf will probably compile without any changes, but in some cases,
+you may have to update your code. The changes are enumerated here. There
+are also some new interfaces; for those, please refer to the header
+files.
+
+- QPDF's exception handling mechanism now uses ``std::logic_error`` for
+ internal errors and ``std::runtime_error`` for runtime errors in
+ favor of the now removed ``QEXC`` classes used in previous versions.
+ The ``QEXC`` exception classes predated the addition of the
+ @1@filename@1@<stdexcept>@2@filename@2@ header file to the C++
+ standard library. Most of the exceptions thrown by the qpdf library
+ itself are still of type ``QPDFExc`` which is now derived from
+ ``std::runtime_error``. Programs that caught an instance of
+ ``std::exception`` and displayed it by calling the ``what()`` method
+ will not need to be changed.
+
+- The ``QPDFExc`` class now internally represents various fields of the
+ error condition and provides interfaces for querying them. Among the
+ fields is a numeric error code that can help applications act
+ differently on (a small number of) different error conditions. See
+ @1@filename@1@QPDFExc.hh@2@filename@2@ for details.
+
+- Warnings can be retrieved from qpdf as instances of ``QPDFExc``
+ instead of strings.
+
+- The nested ``QPDF::EncryptionData`` class's constructor takes an
+ additional argument. This class is primarily intended to be used by
+ ``QPDFWriter``. There's not really anything useful an end-user
+ application could do with it. It probably shouldn't really be part of
+ the public interface to begin with. Likewise, some of the methods for
+ computing internal encryption dictionary parameters have changed to
+ support ``/R=4`` encryption.
+
+- The method ``QPDF::getUserPassword`` has been removed since it didn't
+ do what people would think it did. There are now two new methods:
+ ``QPDF::getPaddedUserPassword`` and ``QPDF::getTrimmedUserPassword``.
+ The first one does what the old ``QPDF::getUserPassword`` method used
+ to do, which is to return the password with possible binary padding
+ as specified by the PDF specification. The second one returns a
+ human-readable password string.
+
+- The enumerated types that used to be nested in ``QPDFWriter`` have
+ moved to top-level enumerated types and are now defined in the file
+ @1@filename@1@qpdf/Constants.h@2@filename@2@. This enables them to be
+ shared by both the C and C++ interfaces.
+
+.. _ref.upgrading-to-3.0:
+
+Upgrading to 3.0
+================
+
+For the most part, the API for qpdf version 3.0 is backward compatible
+with versions 2.1 and later. There are two exceptions:
+
+- The method ``QPDFObjectHandle::replaceStreamData`` that uses a
+ ``StreamDataProvider`` to provide the stream data no longer takes a
+ ``length`` parameter. While it would have been easy enough to keep
+ the parameter for backward compatibility, in this case, the parameter
+ was removed since this provides the user an opportunity to simplify
+ the calling code. This method was introduced in version 2.2. At the
+ time, the ``length`` parameter was required in order to ensure that
+ calls to the stream data provider returned the same length for a
+ specific stream every time they were invoked. In particular, the
+ linearization code depends on this. Instead, qpdf 3.0 and newer check
+ for that constraint explicitly. The first time the stream data
+ provider is called for a specific stream, the actual length is saved,
+ and subsequent calls are required to return the same number of bytes.
+ This means the calling code no longer has to compute the length in
+ advance, which can be a significant simplification. If your code
+ fails to compile because of the extra argument and you don't want to
+ make other changes to your code, just omit the argument.
+
+- Many methods take ``long long`` instead of other integer types. Most
+ if not all existing code should compile fine with this change since
+ such parameters had always previously been smaller types. This change
+ was required to support files larger than two gigabytes in size.
+
+.. _ref.upgrading-to-4.0:
+
+Upgrading to 4.0
+================
+
+While version 4.0 includes a few non-compatible API changes, it is very
+unlikely that anyone's code would have used any of those parts of the
+API since they generally required information that would only be
+available inside the library. In the unlikely event that you should run
+into trouble, please see the ChangeLog. See also
+`appendix_title <#ref.release-notes>`__ for a complete list of the
+non-compatible API changes made in this version.
+
+
+
Indices and tables
==================