author: Jay Berkenbilt <ejb@ql.org> 2022-09-26 14:05:28 +0200
committer: Jay Berkenbilt <ejb@ql.org> 2022-09-26 14:06:47 +0200
commit: f4ca04cec1a0c4a3c8341ff15f68c06bed89c0d7
tree: 4699cc60ca8e4779db4635a7342f4ff9dfffceb1 /qpdf
parent: 4fb7d1335a4660bb8748773294f2dea979fcdbb7
Fix edge case in character encoding (fixes #778)
Avoid representing as PDF Doc encoding any string whose PDF Doc encoding representation starts with a UTF-16 or UTF-8 marker.
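
The rule described above can be sketched roughly as follows. This is an illustration, not qpdf's actual C++ implementation: `encode_text_string` and `BOM_PREFIXES` are hypothetical names, and Latin-1 is used as a stand-in for PDFDocEncoding. The two encodings agree on the bytes used by the new test cases (0xEF, 0xBB, 0xBF, 0xFE, 0xFF) but differ elsewhere; the euro sign, for instance, is 0xA0 in PDFDocEncoding.

```python
# Sketch only: Latin-1 stands in for PDFDocEncoding (an assumption that
# holds for the bytes exercised here, not for the whole encoding).
BOM_PREFIXES = (
    b"\xfe\xff",      # would be read as a UTF-16-BE marker
    b"\xff\xfe",      # would be read as a UTF-16-LE marker
    b"\xef\xbb\xbf",  # would be read as a UTF-8 marker
)


def encode_text_string(text: str) -> bytes:
    """Hypothetical helper: pick a single-byte or UTF-16-BE representation."""
    try:
        single_byte = text.encode("latin-1")
    except UnicodeEncodeError:
        single_byte = None  # not representable in the single-byte encoding

    # Use the single-byte form only if it cannot be mistaken for Unicode.
    if single_byte is not None and not single_byte.startswith(BOM_PREFIXES):
        return single_byte

    # Otherwise fall back to UTF-16-BE with an explicit byte order mark.
    return b"\xfe\xff" + text.encode("utf-16-be")


if __name__ == "__main__":
    for sample in ("\u00fe\u00ff -- x", "\u00fe\u00fe -- PDFDoc okay"):
        print(encode_text_string(sample).hex())
```

Under these assumptions, a string beginning with `þÿ` is forced into UTF-16-BE, while one beginning with `þþ` keeps its single-byte form, which matches the expected values in the test output of this commit.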
Diffstat (limited to 'qpdf')
-rw-r--r-- qpdf/qtest/qpdf/unicode.in | 5
-rw-r--r-- qpdf/qtest/qpdf/unicode.out | 5
2 files changed, 10 insertions, 0 deletions
diff --git a/qpdf/qtest/qpdf/unicode.in b/qpdf/qtest/qpdf/unicode.in
index 2984b5f3..1ddf1178 100644
--- a/qpdf/qtest/qpdf/unicode.in
+++ b/qpdf/qtest/qpdf/unicode.in
@@ -5,3 +5,8 @@ If you think wwwwww is good, you should try ʬʬʬʬʬʬ.
𝄞 𝄢 𝄪 𝅂
This can be encoded in ASCII.
This can be encoded in PDFDocEncoding (€).
+þÿ -- PDFDoc would look like UTF-16-BE
+ÿþ -- PDFDoc would look like UTF-16-LE
+ï»¿ -- PDFDoc would look like UTF-8
+ï»» -- PDFDoc okay
+þþ -- PDFDoc okay
diff --git a/qpdf/qtest/qpdf/unicode.out b/qpdf/qtest/qpdf/unicode.out
index c1901585..4f8ee322 100644
--- a/qpdf/qtest/qpdf/unicode.out
+++ b/qpdf/qtest/qpdf/unicode.out
@@ -5,3 +5,8 @@ If you think wwwwww is good, you should try ʬʬʬʬʬʬ. // <feff00490066002000
𝄞 𝄢 𝄪 𝅂 // <feffd834dd1e0020d834dd220020d834dd2a0020d834dd42>
This can be encoded in ASCII. // <546869732063616e20626520656e636f64656420696e2041534349492e>
This can be encoded in PDFDocEncoding (€). // <546869732063616e20626520656e636f64656420696e20504446446f63456e636f64696e672028a0292e>
+þÿ -- PDFDoc would look like UTF-16-BE // <feff00fe00ff0020002d002d00200050004400460044006f006300200077006f0075006c00640020006c006f006f006b0020006c0069006b00650020005500540046002d00310036002d00420045>
+ÿþ -- PDFDoc would look like UTF-16-LE // <feff00ff00fe0020002d002d00200050004400460044006f006300200077006f0075006c00640020006c006f006f006b0020006c0069006b00650020005500540046002d00310036002d004c0045>
+ï»¿ -- PDFDoc would look like UTF-8 // <feff00ef00bb00bf0020002d002d00200050004400460044006f006300200077006f0075006c00640020006c006f006f006b0020006c0069006b00650020005500540046002d0038>
+ï»» -- PDFDoc okay // <efbbbb202d2d20504446446f63206f6b6179>
+þþ -- PDFDoc okay // <fefe202d2d20504446446f63206f6b6179>