author Jay Berkenbilt <ejb@ql.org> 2018-02-03 00:21:34 +0100
committer Jay Berkenbilt <ejb@ql.org> 2018-02-19 03:05:46 +0100
commit 99101044429c3c91bd11bdd1b26e5b6c2ceb140b
tree 5ab366eab31ddf76e80f99bd1d34c421291f1c4e /ChangeLog
parent b8723e97f4b94fe03e631aab0309382ead3137ed
Implement TokenFilter and refactor Pl_QPDFTokenizer
Implement a TokenFilter class and refactor Pl_QPDFTokenizer to use a TokenFilter class called ContentNormalizer. Pl_QPDFTokenizer is now a general filter that passes data through a TokenFilter.
Diffstat (limited to 'ChangeLog')
-rw-r--r-- ChangeLog 43
1 files changed, 43 insertions, 0 deletions
diff --git a/ChangeLog b/ChangeLog
index 256d83ea..20cb0e80 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -107,6 +107,49 @@
applications that use page-level APIs in QPDFObjectHandle to be
more tolerant of certain types of damaged files.
+ * Add QPDFObjectHandle::TokenFilter class and methods to use it to
+ perform lexical filtering on content streams. You can call
+ QPDFObjectHandle::addTokenFilter on a stream object, or you can call
+ the higher level QPDFObjectHandle::addContentTokenFilter on a page
+ object to cause the stream's contents to be passed through a token
+ filter while being retrieved by QPDFWriter or any other consumer.
+ For details on using TokenFilter, please see comments in
+ QPDFObjectHandle.hh.
+
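The lexical filtering described in this entry can be pictured with a minimal, self-contained sketch. Everything in it (`Tok`, `TokType`, `SketchFilter`, `DropNames`, `run_filter`) is a hypothetical stand-in invented for illustration; it is not qpdf's TokenFilter interface, which is documented in QPDFObjectHandle.hh.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration only; these are NOT qpdf classes.
enum class TokType { Word, Integer, Name, String };

struct Tok {
    TokType type;
    std::string raw;  // token text as it appears in the content stream
};

// A filter sees tokens one at a time and decides what to write out.
class SketchFilter {
public:
    virtual ~SketchFilter() = default;
    virtual void handleToken(Tok const& t) = 0;
    std::string const& output() const { return out_; }

protected:
    void write(std::string const& s) { out_ += s; }

private:
    std::string out_;
};

// Example filter: drop every name token and pass everything else through,
// followed by a single space.
class DropNames : public SketchFilter {
public:
    void handleToken(Tok const& t) override {
        if (t.type != TokType::Name) {
            write(t.raw);
            write(" ");
        }
    }
};

// Feed a token sequence through a filter and return what it wrote.
inline std::string run_filter(SketchFilter& f, std::vector<Tok> const& toks) {
    for (auto const& t : toks) {
        f.handleToken(t);
    }
    return f.output();
}
```

In this model the consumer only ever sees the filter's output, which is the property the entry describes for QPDFWriter and other consumers of the stream.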
+ * Enhance the QPDFTokenizer::Token constructor that takes a type and
+ a string to initialize a raw value in addition to a value. Tokens have a
+ value, which is a canonical representation, and a raw value. For
+ all tokens except strings and names, the raw value and the value
+ are the same. For strings, the value excludes the outer delimiters
+ and has non-printing characters normalized. For names, the value
+ resolves non-printing characters. In order to better facilitate
+ token filters that mostly preserve contents and to enable
+ developers to be mostly unconcerned about the nuances of token
+ values and raw values, creating string and name tokens now
+ properly handles this subtlety of values and raw values. When
+ constructing string tokens, take care to avoid passing in the
+ outer delimiters. This has always been the case, but it is now
+ clarified in the TokenFilter comments in QPDFObjectHandle.hh. This
+ has no impact on any existing code unless there's some code
+ somewhere that was relying on Token::getRawValue() returning an
+ empty string for a manually constructed token. The token class's
+ operator== method still only looks at type and value, not raw
+ value. For example, string tokens for <41> and (A) would still be
+ equal because both are representations of the string "A".
+
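A small model may help with the value versus raw value distinction and the equality rule above. `SketchToken`, `SketchType`, and `make_string_token` are hypothetical names, not qpdf's; deriving the raw value by wrapping the value in parentheses is a simplifying assumption that ignores escaping.

```cpp
#include <string>

// Hypothetical model of a token; NOT qpdf's QPDFTokenizer::Token.
enum class SketchType { String, Name, Word };

struct SketchToken {
    SketchType type;
    std::string value;      // canonical representation
    std::string raw_value;  // text as written in the content stream
};

// Equality looks only at type and value, as the entry describes;
// raw_value is deliberately ignored.
inline bool operator==(SketchToken const& a, SketchToken const& b) {
    return a.type == b.type && a.value == b.value;
}

// Build a string token from a value passed WITHOUT outer delimiters.
// Assumption: the raw value is the value wrapped in parentheses, which
// is only correct when the value needs no escaping.
inline SketchToken make_string_token(std::string const& value) {
    return SketchToken{SketchType::String, value, "(" + value + ")"};
}
```

Under this model a token with raw value `<41>` and one with raw value `(A)` compare equal, because both carry the value `A`.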
+ * Add QPDFObjectHandle::isDataModified method. This method just
+ returns true if addTokenFilter has been called on the stream. It
+ enables a caller to determine whether it is safe to optimize away
+ piping of stream data in cases where the input and output are
+ expected to be the same. QPDFWriter uses this internally to skip
+ the optimization of not re-compressing already compressed streams
+ if addTokenFilter has been called. Most developers will not have
+ to worry about this as it is used internally in the library in the
+ places that need it. If you are manually retrieving stream data
+ with QPDFObjectHandle::getStreamData or
+ QPDFObjectHandle::pipeStreamData, you don't need to worry about
+ this at all.
+
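The decision that isDataModified supports can be sketched as follows. `SketchStream` and `can_copy_raw` are hypothetical names, and the logic is only an assumed reading of the entry above, not QPDFWriter's actual code.

```cpp
// Hypothetical model of the facts a writer would consult; NOT qpdf code.
struct SketchStream {
    bool already_compressed;  // stream data is stored compressed
    bool data_modified;       // stands in for what isDataModified reports
};

// True when the stored bytes may be copied as-is instead of being piped
// through filters. A token filter changes the output, so a stream with
// modified data must always be piped.
inline bool can_copy_raw(SketchStream const& s) {
    return s.already_compressed && !s.data_modified;
}
```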
2018-02-04 Jay Berkenbilt <ejb@ql.org>
* Add QPDFWriter::setLinearizationPass1Filename method and