author Jay Berkenbilt <ejb@ql.org> 2018-02-21 03:12:55 +0100
committer Jay Berkenbilt <ejb@ql.org> 2018-02-21 03:13:08 +0100
commit e429a2e17053d16efc5b9bcb61c22221e5075765
tree e7bed8d2cb0438dad333b653e4f12a0c561a86e4 /manual
parent 30380b64e37b275854553668a4fa32be7fc4a11d
Describe content normalization edge cases in manual
Diffstat (limited to 'manual')
-rw-r--r-- manual/qpdf-manual.xml 35
1 file changed, 34 insertions, 1 deletion
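
The options named in the manual paragraph added by this commit can be combined as in the following minimal sketch. The file names and the `run` helper are hypothetical and are not part of qpdf; the option spellings are taken from the manual text in the diff. With `DRY_RUN` left at its default, the script only prints the commands rather than executing them.

```shell
#!/bin/sh
# Sketch only: infile.pdf and the output names are hypothetical placeholders,
# and run() is an illustrative helper, not part of qpdf.
in=infile.pdf

# Print the command when DRY_RUN is 1 (the default); execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# QDF mode (content normalization on by default) with each page's content
# streams combined into one, so tokens split across streams are rejoined.
run qpdf --qdf --coalesce-contents "$in" coalesced-qdf.pdf

# QDF mode without content normalization: streams are uncompressed but the
# original layout of the content streams is left alone.
run qpdf --qdf --normalize-content=n "$in" raw-qdf.pdf

# Regular (non-QDF) output with uncompressed streams and no normalization.
run qpdf --stream-data=uncompress "$in" uncompressed.pdf
```

Set `DRY_RUN=0` to execute the commands against a real input file. The first form trades the original stream layout for safety against split tokens; the other two keep the layout for inspection.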
diff --git a/manual/qpdf-manual.xml b/manual/qpdf-manual.xml
index 3595058b..1a29229b 100644
--- a/manual/qpdf-manual.xml
+++ b/manual/qpdf-manual.xml
@@ -1050,7 +1050,10 @@ outfile.pdf</option>
<term><option>--normalize-content=[yn]</option></term>
<listitem>
<para>
- Enables or disables normalization of content streams.
+ Enables or disables normalization of content streams. Content
+ normalization is enabled by default in QDF mode. Please see
+ <xref linkend="ref.qdf"/> for additional discussion of QDF
+ mode.
</para>
</listitem>
</varlistentry>
@@ -1206,6 +1209,36 @@ outfile.pdf</option>
You should not use this for &ldquo;production&rdquo; PDF files.
</para>
<para>
+ This paragraph discusses edge cases of content normalization that
+ are not of concern to most users and are not relevant when content
+ normalization is not enabled. When normalizing content, if qpdf
+ runs into any lexical errors, it will print a warning indicating
+ that content may be damaged. The only situation in which qpdf is
+ known to cause damage during content normalization is when a
+ page's contents are split across multiple streams and streams are
+ split in the middle of a lexical token such as a string, name, or
+ inline image. There may be some pathological cases in which qpdf
+ could damage content without noticing this, such as when the partial
+ tokens at the end of one stream and the beginning of the next
+ stream are both valid, but usually qpdf will be able to detect
+ this case. For slightly increased safety, you can specify
+ <option>--coalesce-contents</option> in addition to
+ <option>--normalize-content</option> or <option>--qdf</option>.
+ This will cause qpdf to combine all the content streams into one,
+ thus recombining any split tokens. However, doing this will prevent
+ you from being able to see the original layout of the content
+ streams. If you must inspect the original content streams in an
+ uncompressed format, you can always run with <option>--qdf
+ --normalize-content=n</option> for a QDF file without content
+ normalization, or alternatively
+ <option>--stream-data=uncompress</option> for a regular non-QDF
+ mode file with uncompressed streams. Both commands uncompress
+ all the streams but do not attempt to normalize content. Please
+ note that if you are using content normalization or QDF mode for
+ the purpose of manually inspecting files, you don't have to care
+ about these edge cases.
+ </para>
+ <para>
Object streams, also known as compressed objects, were introduced
into the PDF specification at version 1.5, corresponding to
Acrobat 6. Some older PDF viewers may not support files with