author: Jay Berkenbilt <ejb@ql.org>, 2021-12-18 15:01:52 +0100
committer: Jay Berkenbilt <ejb@ql.org>, 2021-12-18 17:05:51 +0100
commit: 10fb619d3e0618528b7ac6c20cad6262020cf947
tree: c893fedff351e809edead840376e8648f1cc28ff /manual/qdf.rst
parent: f3d1138b8ab64c6a26e1dd5f77a644b19016a30d
Split documentation into multiple pages, change theme
Diffstat (limited to 'manual/qdf.rst')
-rw-r--r-- manual/qdf.rst 96
1 files changed, 96 insertions, 0 deletions
diff --git a/manual/qdf.rst b/manual/qdf.rst
new file mode 100644
index 00000000..b7ee7813
--- /dev/null
+++ b/manual/qdf.rst
@@ -0,0 +1,96 @@
+.. _ref.qdf:
+
+QDF Mode
+========
+
+In QDF mode, qpdf creates PDF files in what we call *QDF
+form*. A PDF file in QDF form, sometimes called a QDF
+file, is a completely valid PDF file that has ``%QDF-1.0`` as its third
+line (after the PDF header and binary characters) and has certain other
+characteristics. The purpose of QDF form is to make it possible to edit
+PDF files, with some restrictions, in an ordinary text editor. This can
+be very useful for experimenting with different PDF constructs or for
+making one-off edits to PDF files (though there are other reasons why
+this may not always work). Note that QDF mode does not support
+linearized files. If you enable linearization, QDF mode is automatically
+disabled.
+
+It is ordinarily very difficult to edit PDF files in a text editor for
+two reasons: most meaningful data in PDF files is compressed, and PDF
+files are full of offset and length information that makes it hard to
+add or remove data. A QDF file is organized in a manner such that, if
+edits are kept within certain constraints, the
+:command:`fix-qdf` program, distributed with qpdf, is
+able to restore edited files to a correct state. The
+:command:`fix-qdf` program takes no command-line
+arguments. It reads a possibly edited QDF file from standard input and
+writes a repaired file to standard output.
+
+The following attributes characterize a QDF file:
+
+- All objects appear in numerical order in the PDF file, including when
+  objects appear in object streams.
+
+- Objects are printed in an easy-to-read format, and all line endings
+  are normalized to UNIX line endings.
+
+- Unless specifically overridden, streams appear uncompressed (when
+  qpdf supports the filters and they are compressed with a non-lossy
+  compression scheme), and most content streams are normalized (line
+  endings are converted to just UNIX-style linefeeds).
+
+- All stream lengths are represented as indirect objects, and the
+  stream length object is always the next object after the stream. If
+  the stream data does not end with a newline, an extra newline is
+  inserted, and a special comment appears after the stream indicating
+  that this has been done.
+
+- If the PDF file contains object streams and object stream *n*
+  contains *k* objects, those objects are numbered from *n+1* through
+  *n+k*, and the object number/offset pairs appear on a separate line
+  for each object. Additionally, each object in the object stream is
+  preceded by a comment indicating its object number and index. This
+  makes it very easy to find objects in object streams.
+
+- All beginnings of objects, ``stream`` tokens, ``endstream`` tokens,
+  and ``endobj`` tokens appear on lines by themselves. A blank line
+  follows every ``endobj`` token.
+
+- If there is a cross-reference stream, it is unfiltered.
+
+- Page dictionaries and page content streams are marked with special
+  comments that make them easy to find.
+
+- Comments precede each object indicating the object number of the
+  corresponding object in the original file.
+
+When editing a QDF file, any edits can be made as long as the above
+constraints are maintained. This means that you can freely edit a page's
+content without worrying about messing up the QDF file. It is also
+possible to add new objects so long as those objects are added after the
+last object in the file or subsequent objects are renumbered. If a QDF
+file has object streams in it, you can always add the new objects before
+the xref stream and then change the number of the xref stream, since
+generally nothing references it by number.
+
+It is not generally practical to remove objects from QDF files without
+messing up object numbering, but if you remove all references to an
+object, you can run qpdf on the file (after running
+:command:`fix-qdf`), and qpdf will omit the now-orphaned
+object.
+
+When :command:`fix-qdf` is run, it goes through the file
+and recomputes the following parts of the file:
+
+- the ``/N``, ``/W``, and ``/First`` keys of all object stream
+  dictionaries
+
+- the pairs of numbers representing object numbers and offsets of
+  objects in object streams
+
+- all stream lengths
+
+- the cross-reference table or cross-reference stream
+
+- the offset to the cross-reference table or cross-reference stream
+  following the ``startxref`` token
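
The recomputation listed above depends on the QDF layout rules. Because every object header sits on a line by itself, byte offsets can be recovered with a line-anchored scan. The Python sketch below is not how :command:`fix-qdf` is implemented. It is a minimal illustration, and the names ``object_offsets``, ``xref_table_offset``, and ``sample`` are hypothetical. It assumes that no stream data contains a line that looks like an object header, and it handles only a traditional cross-reference table, not a cross-reference stream.

```python
import re

# In QDF form, each "N G obj" header appears on a line by itself,
# so a multiline-anchored pattern is enough to locate it.
OBJ_LINE = re.compile(rb"^(\d+) (\d+) obj$", re.MULTILINE)
XREF_LINE = re.compile(rb"^xref$", re.MULTILINE)


def object_offsets(data: bytes) -> dict[int, int]:
    """Map each object number to the byte offset of its header line."""
    return {int(m.group(1)): m.start() for m in OBJ_LINE.finditer(data)}


def xref_table_offset(data: bytes) -> int:
    """Return the byte offset that would follow the startxref token."""
    m = XREF_LINE.search(data)
    if m is None:
        raise ValueError("no cross-reference table found")
    return m.start()


# Hypothetical, heavily truncated QDF-style fragment (not a complete PDF).
sample = (
    b"%PDF-1.3\n"
    b"%\xbf\xf7\xa2\xfe\n"
    b"%QDF-1.0\n"
    b"\n"
    b"1 0 obj\n"
    b"<< /Type /Catalog >>\n"
    b"endobj\n"
    b"\n"
    b"xref\n"
)

offsets = object_offsets(sample)
xref_at = xref_table_offset(sample)
```

Here ``offsets`` holds the values a cross-reference table would need for each object, and ``xref_at`` is the number that would be written after ``startxref``. A real repair tool must also rewrite the stream length objects and the object stream dictionaries, which this sketch leaves out.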