author Jay Berkenbilt <ejb@ql.org> 2018-01-30 02:57:04 +0100
committer Jay Berkenbilt <ejb@ql.org> 2018-02-19 03:05:46 +0100
commit fefe25030eaffdaf06a9e957b3255304682c71cf
tree fa404200db521e085a711fa13f6952469665ae8e /ChangeLog
parent 2699ecf13e8559b136ded1986bf18e1a0a51011f
Inline image token type
Diffstat (limited to 'ChangeLog')
-rw-r--r-- ChangeLog 71
1 files changed, 43 insertions, 28 deletions
diff --git a/ChangeLog b/ChangeLog
index e95e2370..e9dea347 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,34 +1,49 @@
2018-02-04 Jay Berkenbilt <ejb@ql.org>
* Significant lexer (tokenizer) enhancements. These are changes to
- the QPDFTokenizer class. These changes are of concern only to
- people who are operating with PDF files at the lexical layer
- using qpdf. They have little or no impact on most high-level
- interfaces or the command-line tool.
- * New token types tt_space and tt_comment to recognize
- whitespace and comments. this makes it possible to tokenize a
- PDF file or stream and preserve everything about it.
- * For backward compatibility, space and comment tokens are not
- returned by the tokenizer unless
- QPDFTokenizer.includeIgnorable() is called.
- * Better handling of null bytes. These are now included in space
- tokens rather than being their own "tt_word" tokens. This
- should have no impact on any correct PDF file and has no
- impact on output, but it may change offsets in some error
- messages when trying to parse contents of bad files. Under
- default operation, qpdf does not attempt to parse content
- streams, so this change is mostly invisible.
- * Bug fix to handling of bad tokens at ends of streams. Now,
- when allowEOF() has been called, these are treated as bad tokens
- (tt_bad or an exception, depending on invocation), and a
- separate tt_eof token is returned. Before the bad token
- contents were returned as the value of a tt_eof token. tt_eof
- tokens are always empty now.
- * Fix a bug that would, on rare occasions, report the offset in an
- error message in the wrong space because of spaces or comments
- adjacent to a bad token.
- * Clarify in comments exactly where the input source is
- positioned surrounding calls to readToken and getToken.
+ the QPDFTokenizer class. These changes are of concern only to
+ people who are operating with PDF files at the lexical layer using
+ qpdf. They have little or no impact on most high-level interfaces
+ or the command-line tool.
+
+ New token types tt_space and tt_comment to recognize whitespace
+ and comments. This makes it possible to tokenize a PDF file or
+ stream and preserve everything about it.
+
+ For backward compatibility, space and comment tokens are not
+ returned by the tokenizer unless QPDFTokenizer.includeIgnorable()
+ is called.
+
+ Better handling of null bytes. These are now included in space
+ tokens rather than being their own "tt_word" tokens. This should
+ have no impact on any correct PDF file and has no impact on
+ output, but it may change offsets in some error messages when
+ trying to parse contents of bad files. Under default operation,
+ qpdf does not attempt to parse content streams, so this change is
+ mostly invisible.
+
+ Bug fix to handling of bad tokens at ends of streams. Now, when
+ allowEOF() has been called, these are treated as bad tokens
+ (tt_bad or an exception, depending on invocation), and a
+ separate tt_eof token is returned. Before, the bad token
+ contents were returned as the value of a tt_eof token. tt_eof
+ tokens are always empty now.
+
+ Fix a bug that would, on rare occasions, report the offset in an
+ error message in the wrong space because of spaces or comments
+ adjacent to a bad token.
+
+ Clarify in comments exactly where the input source is positioned
+ surrounding calls to readToken and getToken.
+
+ * Add a new token type for inline images. This token type is only
+ returned by QPDFTokenizer immediately following a call to
+ expectInlineImage(). This change includes internal refactoring of
+ a handful of places that all separately handled inline images. The
+ logic of detecting inline images in content streams is now handled
+ in one place in the code. Also we are more flexible about what
+ characters may surround the EI operator that marks the end of an
+ inline image.
2018-02-04 Jay Berkenbilt <ejb@ql.org>
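
To illustrate the opt-in behavior the entry above describes for space and comment tokens, here is a minimal, self-contained sketch of a toy lexer. It does not call qpdf and is not qpdf's implementation: ToyTokenizer, TokenType, Token, tokenize and concatenate are hypothetical names chosen for the example, and only the includeIgnorable() name is borrowed from the ChangeLog. The sketch assumes the usual PDF whitespace set (NUL, tab, LF, FF, CR, space), so null bytes land in space tokens, and it always ends with an empty EOF token.

```cpp
// Toy lexer for illustration only; NOT qpdf's QPDFTokenizer.
// ToyTokenizer, TokenType, Token and concatenate are hypothetical names.
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

enum class TokenType { Word, Space, Comment, Eof };

struct Token {
    TokenType type;
    std::string raw;  // exact bytes consumed for this token
};

class ToyTokenizer {
public:
    // Opt in to space and comment tokens. Off by default, so callers
    // that never ask for them keep seeing the same token stream.
    void includeIgnorable() { include_ignorable_ = true; }

    std::vector<Token> tokenize(const std::string& in) const {
        std::vector<Token> out;
        std::size_t i = 0;
        while (i < in.size()) {
            const std::size_t start = i;
            TokenType type;
            if (isSpace(in[i])) {
                // A run of whitespace (including NUL) is one space token.
                while (i < in.size() && isSpace(in[i])) ++i;
                type = TokenType::Space;
            } else if (in[i] == '%') {
                // A comment runs up to, but not including, the line end.
                while (i < in.size() && in[i] != '\n' && in[i] != '\r') ++i;
                type = TokenType::Comment;
            } else {
                while (i < in.size() && !isSpace(in[i]) && in[i] != '%') ++i;
                type = TokenType::Word;
            }
            const bool ignorable = (type != TokenType::Word);
            if (!ignorable || include_ignorable_) {
                out.push_back({type, in.substr(start, i - start)});
            }
        }
        // The end-of-input token never carries content.
        out.push_back({TokenType::Eof, ""});
        return out;
    }

private:
    static bool isSpace(char c) {
        return c == '\0' || c == '\t' || c == '\n' ||
               c == '\f' || c == '\r' || c == ' ';
    }
    bool include_ignorable_ = false;
};

// Joins the raw bytes of every token. With ignorable tokens included,
// the result reproduces the input exactly.
std::string concatenate(const std::vector<Token>& tokens) {
    std::string result;
    for (const Token& t : tokens) {
        result += t.raw;
    }
    return result;
}
```

With includeIgnorable() called, concatenating the raw bytes of the tokens gives back the input byte for byte, which is the "preserve everything about it" property the entry mentions. Without the call, only the word tokens and the final EOF token are returned.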