2019-01-05 Jay Berkenbilt <ejb@ql.org>
* When generating appearances, if the font uses one of the
standard, built-in encodings, restrict the character set to that
rather than just to ASCII. This will allow most appearances to
contain characters from the ISO-Latin-1 range plus a few
additional characters.
* Add methods QUtil::utf8_to_win_ansi and
QUtil::utf8_to_mac_roman.
* Add method QUtil::utf8_to_utf16.
2019-01-04 Jay Berkenbilt <ejb@ql.org>
* Add new option --optimize-images, which recompresses every image
using DCT (JPEG) compression as long as the image is not already
compressed with lossy compression and recompressing the image
reduces its size. The additional options --oi-min-width,
--oi-min-height, and --oi-min-area prevent recompression of images
whose width, height, or pixel area (width * height) are below a
specified threshold.
* Add new option --collate. When specified, the semantics of
--pages change from concatenation to collation. See the manual for
a more detailed discussion. Fixes #259.
* Add new method QPDFWriter::getFinalVersion, which returns the
PDF version that will ultimately be written to the final file. See
comments in QPDFWriter.hh for some restrictions on its use. Fixes
#266.
* When unexpected errors are found while checking linearization
data, print an error message instead of calling assert, which
causes the program to crash. Fixes #209, #231.
* Detect and recover from dangling references. If a PDF file
contained an indirect reference to a non-existent object (which is
valid), when adding a new object to the file, it was possible for
the new object to take the object ID of the dangling reference,
thereby causing the dangling reference to point to the new object.
This case is now prevented. Fixes #240.
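The --optimize-images rule described earlier in this entry can be sketched as a single decision function. This is only an illustration under stated assumptions, not qpdf's code: the ImageInfo and Thresholds structs and the should_recompress name are hypothetical, and treating "below" as a strict less-than comparison is an assumption.

```cpp
#include <cstddef>

// Hypothetical description of one image; field names are illustrative only.
struct ImageInfo
{
    int width;
    int height;
    bool lossy;            // already stored with lossy compression
    std::size_t orig_size; // size of the image data as currently stored
    std::size_t jpeg_size; // size after a trial DCT (JPEG) recompression
};

// Hypothetical holder for the values given to --oi-min-width,
// --oi-min-height, and --oi-min-area.
struct Thresholds
{
    int min_width;
    int min_height;
    long min_area;
};

// Recompress only if the image is not already lossy, is not below any
// threshold, and actually gets smaller. "Below" is assumed to mean "<".
bool should_recompress(ImageInfo const& img, Thresholds const& t)
{
    if (img.lossy) {
        return false;
    }
    if (img.width < t.min_width || img.height < t.min_height) {
        return false;
    }
    if (static_cast<long>(img.width) * img.height < t.min_area) {
        return false;
    }
    return img.jpeg_size < img.orig_size;
}
```

The threshold values a caller passes are arbitrary here; the sketch only shows how the conditions combine.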
2019-01-03 Jay Berkenbilt <ejb@ql.org>
* Fix behavior of form field value setting to handle the following
cases:
- Strings are always written as UTF-16
- Check boxes and radio buttons are handled properly with
synchronization of values and appearance states
* Define constants in qpdf/Constants.h for interpretation of
annotation and form field flags
* Add QPDFAnnotationObjectHelper::getFlags
* Add many new methods to QPDFFormFieldObjectHelper for querying
flags and field types
* Add new methods for appearance stream generation. See comments
in QPDFFormFieldObjectHelper.hh for generateAppearance() for a
description of limitations.
- QPDFAcroFormDocumentHelper::generateAppearancesIfNeeded
- QPDFFormFieldObjectHelper::generateAppearance
* Bug fix: when writing form field values, always write string
values encoded as UTF-16.
* Add method QUtil::utf8_to_ascii, which returns an ASCII string
for a UTF-8 string, replacing out-of-range characters with a
specified substitute.
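As a rough sketch of the substitution idea behind QUtil::utf8_to_ascii, the function below replaces each non-ASCII character with a caller-supplied substitute. It is not the QUtil implementation: the name to_ascii_sketch is hypothetical, and it assumes the input is well-formed UTF-8.

```cpp
#include <string>

// Sketch only: each non-ASCII character (one UTF-8 multi-byte sequence)
// becomes a single substitute character. Input is assumed to be valid UTF-8.
std::string to_ascii_sketch(std::string const& utf8, char substitute)
{
    std::string result;
    for (std::string::size_type i = 0; i < utf8.size(); ++i) {
        unsigned char ch = static_cast<unsigned char>(utf8[i]);
        if (ch < 0x80) {
            // Plain ASCII byte: copy through unchanged.
            result += static_cast<char>(ch);
        } else if ((ch & 0xC0) != 0x80) {
            // Lead byte of a multi-byte sequence: emit one substitute.
            result += substitute;
        }
        // Continuation bytes (10xxxxxx) are skipped.
    }
    return result;
}
```

Emitting one substitute per character rather than per byte keeps the output length equal to the number of characters in the input.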
2019-01-02 Jay Berkenbilt <ejb@ql.org>
* Add method QPDFObjectHandle::getResourceNames that returns a set
of strings representing all second-level keys in a dictionary
(i.e. all keys of all direct dictionary members).
2018-12-31 Jay Berkenbilt <ejb@ql.org>
* Add --flatten-annotations flag to the qpdf command-line tool for
annotation flattening.
* Add methods for flattening form fields and annotations:
- QPDFPageDocumentHelper::flattenAnnotations - integrate
annotation appearance streams into page contents with special
handling for form fields: if appearance streams are up to date
(/NeedAppearances is false in /AcroForm), the /AcroForm key of
the document catalog is removed. Otherwise, a warning is
issued, and form fields are ignored. Non-form-field
annotations are always flattened if an appearance stream can
be found.
- QPDFAnnotationObjectHelper::getPageContentForAppearance -
generate the content stream fragment to render an appearance
stream in a page's content stream as a form xobject. Called by
flattenAnnotations.
* Add method QPDFObjectHandle::mergeResources(), which merges
resource dictionaries. See detailed description in
QPDFObjectHandle.hh.
* Add QPDFObjectHandle::Matrix, similar to
QPDFObjectHandle::Rectangle, as a convenience class for
six-element arrays that are used as matrices.
2018-12-23 Jay Berkenbilt <ejb@ql.org>
* When specifying @arg on the command line, if the file "arg" does
not exist, just treat this as a normal argument. This makes it
easier to deal with files whose names start with the @ character.
Fixes #265.
* Tweak completion so it works with zsh as well using
bashcompinit.
2018-12-22 Jay Berkenbilt <ejb@ql.org>
* Add new options --json, --json-key, and --json-object to
generate a json representation of the PDF file. This is described
in more depth in the manual. You can also run qpdf --json-help to
get a description of the json format.
2018-12-21 Jay Berkenbilt <ejb@ql.org>
* Allow --show-object=trailer for showing the document trailer.
* You can now use eval $(qpdf --completion-bash) to enable bash
completion for qpdf. It's not perfect, but it works pretty well.
2018-12-19 Jay Berkenbilt <ejb@ql.org>
* When splitting pages using --split-pages, the outlines
dictionary and some supporting metadata are copied into the split
files. The result is that all bookmarks from the original file
appear, and those that point to pages that are preserved work
while those that point to pages that are not preserved don't do
anything. This is an interim step toward proper support for
bookmark preservation in split files.
* Add QPDFOutlineDocumentHelper and QPDFOutlineObjectHelper for
handling outlines (bookmarks) including bidirectionally mapping
between bookmarks and pages. Initially there is no support for
modifying the outlines hierarchy.
2018-12-18 Jay Berkenbilt <ejb@ql.org>
* New method QPDFObjectHandle::getJSON() returns a JSON object
with a partial representation of the object. See
QPDFObjectHandle.hh for a detailed description.
* Add a simple JSON serializer. This is not a complete or
general-purpose JSON library. It allows assembly and serialization
of JSON structures with some restrictions, which are described in
the header file.
* Add QPDFNameTreeObjectHelper class. This class provides useful
methods for dealing with name trees, which are discussed in
section 7.9.6 of the PDF spec (ISO-32000).
* Preserve page labels when merging and splitting files. Prior
versions of qpdf simply preserved the page label information from
the first file, which usually wouldn't make any sense in the
merged file. Now any page that had a page number in any original
file will have the same page number after merging or splitting.
* Add QPDFPageLabelDocumentHelper class. This is a document helper
class that provides useful methods for dealing with page labels.
It abstracts the fact that they are stored as number trees and
deals with interpolating intermediate values that are not in the
tree. It also has helper functions used by the qpdf command line
tool to preserve page labels when merging and splitting files.
* Add QPDFNumberTreeObjectHelper class. This class provides useful
methods for dealing with number trees, which are discussed in
section 7.9.7 of the PDF spec (ISO-32000). Page label dictionaries
are represented as number trees.
* New method QPDFObjectHandle::wrapInArray returns the object
itself if it is an array. Otherwise, it returns an array
containing the object. This is useful for dealing with PDF data
that is sometimes expressed as a single element and sometimes
expressed as an array, which is a somewhat common PDF idiom.
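The single-element-or-array idiom that wrapInArray addresses can be illustrated without the library. The sketch below uses a hypothetical Value struct as a stand-in for a PDF object; it is not the QPDFObjectHandle API. A familiar instance of the idiom is a stream's /Filter key, which may hold one name or an array of names.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a PDF object: a scalar value or an array of
// scalar values. This is not the QPDFObjectHandle API.
struct Value
{
    bool is_array;
    std::vector<std::string> items; // exactly one item when !is_array
};

Value make_scalar(std::string const& s)
{
    Value v;
    v.is_array = false;
    v.items.push_back(s);
    return v;
}

Value make_array(std::vector<std::string> const& items)
{
    Value v;
    v.is_array = true;
    v.items = items;
    return v;
}

// Arrays are returned unchanged; anything else becomes a one-element array.
Value wrap_in_array(Value const& v)
{
    if (v.is_array) {
        return v;
    }
    return make_array(v.items);
}
```

After wrapping, calling code can always loop over items and no longer needs a separate branch for the single-value case.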
2018-10-11 Jay Berkenbilt <ejb@ql.org>
* Files generated by autogen.sh are now committed so that it is
possible to build on platforms without autoconf directly from a
clean checkout of the repository. The configure script detects if
the files are out of date when it also determines that the tools
are present to regenerate them.
* Add build in Azure Pipelines, now that it is free for open
source projects.
2018-08-18 Jay Berkenbilt <ejb@ql.org>
* 8.2.1: release
* Add new option --keep-files-open=[yn] to control whether qpdf
keeps files open when merging. Prior to version 8.1.0, qpdf always
kept all files open, but this meant that the number of files that
could be merged was limited by the operating system's open file
limit. Version 8.1.0 opened files as they were referenced, but
this caused a major performance impact. Version 8.2.0 optimized
the performance but did so in a way that, for local file systems,
there was a small but unavoidable performance hit, but for
networked file systems, the performance impact could be very high.
Starting with version 8.2.1, the default behavior is that files
are kept open if no more than 200 files are specified, but that
the behavior can be explicitly overridden with the
--keep-files-open flag. If you are merging more than 200 files but
fewer than the operating system's max open files limit, you may
want to use --keep-files-open=y. If you are using a local file
system where the overhead is low and you might sometimes merge
more files than the OS limit allows, you may want to specify
--keep-files-open=n. Fixes #237.
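The default described in this entry amounts to a three-way choice, sketched below. Only the 200-file threshold comes from the entry; the KeepOpen enum and the keep_files_open function are hypothetical names used for illustration, not qpdf's internals.

```cpp
#include <cstddef>

// Hypothetical tri-state for the --keep-files-open flag:
// not given, =y, or =n.
enum KeepOpen { ko_default, ko_yes, ko_no };

// Threshold taken from the entry above; everything else is illustrative.
static std::size_t const keep_open_threshold = 200;

// An explicit flag always wins; otherwise files stay open only when
// no more than 200 files are specified.
bool keep_files_open(KeepOpen flag, std::size_t nfiles)
{
    switch (flag) {
      case ko_yes:
        return true;
      case ko_no:
        return false;
      default:
        return nfiles <= keep_open_threshold;
    }
}
```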
2018-08-16 Jay Berkenbilt <ejb@ql.org>
* 8.2.0: release
2018-08-14 Jay Berkenbilt <ejb@ql.org>
* For the mingw builds, change the name of the DLL import library
from libqpdf.a to libqpdf.dll.a to avoid confusing it with a
static library. This potentially clears the way for supporting a
static library in the future, though presently, the qpdf Windows
build only builds the DLL and executables. Fixes #225.
2018-08-13 Jay Berkenbilt <ejb@ql.org>
* Add new class QPDFSystemError, derived from std::runtime_error,
which is now thrown by QUtil::throw_system_error. This enables the
triggering errno value to be retrieved. Fixes #221.
2018-08-12 Jay Berkenbilt <ejb@ql.org>
* qpdf command line: add --no-warn option to suppress issuing
warning messages. If there are any conditions that would have
caused warnings to be issued, the exit status is still 3.
* Rewrite the internals of Pl_Buffer to be much more efficient in
use of memory at a very slight performance cost. The old
implementation could cause memory usage to go out of control for
files with large images compressed using the TIFF predictor.
Fixes #228.
2018-08-05 Jay Berkenbilt <ejb@ql.org>
* Bug fix: end of line characters were not properly handled inside
strings in some cases. Fixes #226.
* Bug fix: infinite loop on progress reporting for very small
files. Fixes #230.
2018-08-04 Jay Berkenbilt <ejb@ql.org>
* Performance fix: optimize page merging operation to avoid
unnecessary open/close calls on files being merged. Fixes #217.
* Add ClosedFileInputSource::stayOpen method, enabling a
ClosedFileInputSource to stay open during manually indicated
periods of high activity, thus reducing the overhead of frequent
open/close operations.
2018-06-23 Jay Berkenbilt <ejb@ql.org>
* 8.1.0: release
2018-06-22 Jay Berkenbilt <ejb@ql.org>
* Bug fix: properly decrypt files with 40-bit keys that use
revision 3 of the security handler. Prior to this, qpdf was
reporting "invalid password" in this case. Fixes #212.
* With --verbose, print information about each input file when
merging files.
* Add progress reporting to QPDFWriter. Programmatically, you can
register a progress reporter with registerProgressReporter(). From
the command line, passing --progress will give progress indicators
in increments of no less than 1% as output files are written.
Fixes #200.
* Add new method QPDF::getObjectCount(). This gives an approximate
(upper bound) count of objects in the QPDF object.
* Don't leave files open when merging. This makes it possible to
merge more files at once than the operating system's open file
limit. Fixes #154.
* Add ClosedFileInputSource class, an input source that keeps its
input file closed when not reading it. At the expense of some
performance, this allows you to operate on many files without
opening too many files at the operating system level.
* Add new option --preserve-unreferenced-resources, which
suppresses removal of unreferenced objects from page resource
dictionaries during page splitting operations.
2018-06-21 Jay Berkenbilt <ejb@ql.org>
* Add method QPDFPageObjectHelper::removeUnreferencedResources and
also QPDFPageDocumentHelper::removeUnreferencedResources that
calls the former on every page. This method removes any XObject or
Font references from the page's resource dictionary if they are
not referenced anywhere in any of the content streams. This
significantly reduces the size of split files whose pages
internally share resource dictionaries. Fixes #203.
* The --rotate option to qpdf no longer requires an explicit page
range. You can now rotate all pages of a document with
qpdf --rotate=angle in.pdf out.pdf. Fixes #211.
* Create examples/pdf-set-form-values.cc to illustrate use of
interactive form helpers.
* Added methods QPDFAcroFormDocumentHelper::setNeedAppearances and
added methods to QPDFFormFieldObjectHelper to set a field's value,
optionally updating the document to indicate that appearance
streams need to be regenerated.
* Added QPDFObjectHandle::newUnicodeString and
QPDFObjectHandle::unparseBinary to allow for more convenient
creation of strings that are
explicitly encoded in UTF-16 BE. This is useful for creating
Unicode strings that appear outside of content streams, such as in
page labels, outlines, form field values, etc.
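For readers unfamiliar with the encoding, the sketch below shows what "explicitly encoded in UTF-16 BE" means at the byte level: a leading FE FF byte order mark followed by big-endian 16-bit code units, with surrogate pairs for code points above U+FFFF. The function name encode_utf16be is hypothetical and this is not the library's code; input validation is deliberately omitted.

```cpp
#include <string>
#include <vector>

// Sketch: encode Unicode code points as UTF-16 big-endian, prefixed with
// the FE FF byte order mark. Code points are assumed to be valid (no lone
// surrogates, nothing above U+10FFFF); no checking is done.
std::string encode_utf16be(std::vector<unsigned long> const& code_points)
{
    std::string out("\xfe\xff");
    for (std::vector<unsigned long>::size_type i = 0;
         i < code_points.size(); ++i) {
        unsigned long cp = code_points[i];
        if (cp >= 0x10000UL) {
            // Outside the Basic Multilingual Plane: use a surrogate pair.
            cp -= 0x10000UL;
            unsigned long high = 0xD800UL + (cp >> 10);
            unsigned long low = 0xDC00UL + (cp & 0x3FFUL);
            out += static_cast<char>(high >> 8);
            out += static_cast<char>(high & 0xFF);
            out += static_cast<char>(low >> 8);
            out += static_cast<char>(low & 0xFF);
        } else {
            // One 16-bit code unit, high byte first.
            out += static_cast<char>(cp >> 8);
            out += static_cast<char>(cp & 0xFF);
        }
    }
    return out;
}
```

Because the result usually contains zero bytes, it has to be handled as a length-counted binary string rather than a C string.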
2018-06-20 Jay Berkenbilt <ejb@ql.org>
* Added new classes QPDFAcroFormDocumentHelper,
QPDFFormFieldObjectHelper, and QPDFAnnotationObjectHelper to
assist with working with interactive forms in PDF files. At
present, API methods for reading forms, form fields, and widget
annotations have been added. It is likely that some additional
methods for modifying forms will be added in the future. Note that
qpdf remains a library whose function is primarily focused around
document structure and metadata rather than content. As such, it
is not expected that qpdf will have higher level APIs for
generating form contents, but qpdf will hopefully gain the
capability to deal with the bookkeeping aspects of wiring up all
the objects, which could make it a useful library for other
software that works with PDF interactive forms. PDF forms are
complex, and the terminology around them is confusing. Please see
comments at the top of QPDFAcroFormDocumentHelper.hh for
additional discussion.
* Added new classes QPDFPageDocumentHelper and QPDFPageObjectHelper
for page-level API functions. These classes introduce a new API
pattern of document helpers and object helpers in qpdf. The helper
classes provide a higher level API for working with certain types
of structural features of PDF while still staying true to qpdf's
philosophy of not isolating the user from the underlying
structure. Please see the chapter in the documentation entitled
"Design and Library Notes" for additional discussion. The examples
have also been updated to use QPDFPageDocumentHelper and
QPDFPageObjectHelper when performing page-level operations.
2018-06-19 Jay Berkenbilt <ejb@ql.org>
* New QPDFObjectHandle::Rectangle class will convert to and from arrays
of four numerical values. Rectangles are used in various places
within the PDF file format and are called out as a specific data
type in the PDF specification.
2018-05-12 Jay Berkenbilt <ejb@ql.org>
* In newline before endstream mode, an extra newline was not
inserted prior to the endstream that ends object streams.
Fixes #205.
2018-04-15 Jay Berkenbilt <ejb@ql.org>
* Arbitrarily limit the depth of data structures represented by
direct objects. This is CVE-2018-9918. Fixes #202.
2018-03-06 Jay Berkenbilt <ejb@ql.org>
* 8.0.2: release
* Properly handle pages with no contents. Fixes #194.
2018-03-05 Jay Berkenbilt <ejb@ql.org>
* Improve handling of loops while following cross reference
tables. Fixes #192.
2018-03-04 Jay Berkenbilt <ejb@ql.org>
* 8.0.1: release
* On the command line when specifying page ranges, support
preceding a page number by "r" to indicate that it should be
counted from the end. For example, the range r3-r1 would indicate
the last three pages of a document.
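The arithmetic behind the "r" prefix is small enough to sketch. The helper names resolve_page and resolve_range below are hypothetical, and the sketch handles only a single "first-last" range with no validation; it is not qpdf's range parser.

```cpp
#include <cstdlib>
#include <string>
#include <utility>

// Resolve one page spec ("7" or "r2") to a 1-based page number.
// "rN" counts from the end, so "r1" is the last page.
int resolve_page(std::string const& spec, int npages)
{
    if (!spec.empty() && spec[0] == 'r') {
        return npages + 1 - std::atoi(spec.c_str() + 1);
    }
    return std::atoi(spec.c_str());
}

// Resolve a simple "first-last" range; no validation is attempted.
std::pair<int, int> resolve_range(std::string const& range, int npages)
{
    std::string::size_type dash = range.find('-');
    return std::make_pair(
        resolve_page(range.substr(0, dash), npages),
        resolve_page(range.substr(dash + 1), npages));
}
```

For a 10-page document, resolve_range("r3-r1", 10) yields pages 8 through 10, which are the last three pages.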
2018-03-03 Jay Berkenbilt <ejb@ql.org>
* Ignore zlib data check errors while uncompressing streams. This
is consistent with behaviors of other readers and enables handling
of some incorrectly written zlib streams. Fixes #191.
2018-02-25 Jay Berkenbilt <ejb@ql.org>
* 8.0.0: release
2018-02-17 Jay Berkenbilt <ejb@ql.org>
* Fix QPDFObjectHandle::getUTF8Val() to properly handle strings
that are encoded with PDF Doc Encoding. Fixes #179.
* Add qpdf_check_pdf to the "C" API. This method just attempts to
read the entire file and produce no output, making it possible to
assess whether the file has any errors that qpdf can detect.
* Major enhancements to handling of type errors within the qpdf
library. This fix is intended to eliminate those annoying cases
where qpdf would exit with a message like "operation for
dictionary object attempted on object of wrong type" without
providing any context. Now qpdf keeps enough context to be able to
issue a proper warning and to handle such conditions in a sensible
way. This should greatly increase the number of bad files that
qpdf can recover, and it should make it much easier to figure out
what's broken when a file contains errors.
* Error message fix: replace "file position" with "offset" in
error messages that report lexical or parsing errors. Sometimes
it's an offset in an object stream or a content stream rather than
a file position, so this makes the error message less confusing in
those cases. It still requires some knowledge to find the exact
position of the error, since when it's not a file offset, it's
probably an offset into a stream after uncompressing it.
* Error message fix: correct some cases in which the object that
contained a lexical error was omitted from the error message.
* Error message fix: improve file name in the error message when
there is a parser error inside an object stream.
2018-02-11 Jay Berkenbilt <ejb@ql.org>
* Add QPDFObjectHandle::filterPageContents method to provide a
different interface for applying token filters to page contents
without modifying the ultimate output.
2018-02-04 Jay Berkenbilt <ejb@ql.org>
* Changes listed on today's date are numerous and reflect
significant enhancements to qpdf's lexical layer. While many
nuances are discussed and a handful of small bugs were fixed, it
should be emphasized that none of these issues have any impact on
any output or behavior of qpdf under "normal" operation. There are
some changes that have an effect on content stream normalization
as with qdf mode or on code that interacts with PDF files
lexically using QPDFTokenizer. There are no incompatible changes
for normal operation. There are a few changes that will affect the
exact error messages issued on certain bad files, and there is a
small non-compatible enhancement regarding the behavior of
manually constructed QPDFTokenizer::Token objects. Users of the
qpdf command line tool will see no changes other than the addition
of a new command-line flag and possibly some improved error
messages.
* Significant lexer (tokenizer) enhancements. These are changes to
the QPDFTokenizer class. These changes are of concern only to
people who are operating with PDF files at the lexical layer using
qpdf. They have little or no impact on most high-level interfaces
or the command-line tool.
New token types tt_space and tt_comment to recognize whitespace
and comments. This makes it possible to tokenize a PDF file or
stream and preserve everything about it.
For backward compatibility, space and comment tokens are not
returned by the tokenizer unless QPDFTokenizer::includeIgnorable()
is called.
Better handling of null bytes. These are now included in space
tokens rather than being their own "tt_word" tokens. This should
have no impact on any correct PDF file and has no impact on
output, but it may change offsets in some error messages when
trying to parse contents of bad files. Under default operation,
qpdf does not attempt to parse content streams, so this change is
mostly invisible.
Bug fix to handling of bad tokens at ends of streams. Now, when
allowEOF() has been called, these are treated as bad tokens
(tt_bad or an exception, depending on invocation), and a
separate tt_eof token is returned. Previously, the bad token
contents were returned as the value of a tt_eof token. tt_eof
tokens are always empty now.
Fix a bug that would, on rare occasions, report the offset in an
error message in the wrong space because of spaces or comments
adjacent to a bad token.
Clarify in comments exactly where the input source is positioned
surrounding calls to readToken and getToken.
* Add a new token type for inline images. This token type is only
returned by QPDFTokenizer immediately following a call to
expectInlineImage(). This change includes internal refactoring of
a handful of places that all separately handled inline images. The
logic of detecting inline images in content streams is now handled
in one place in the code. Also we are more flexible about what
characters may surround the EI operator that marks the end of an
inline image.
* New method QPDFObjectHandle::parsePageContents() to improve upon
QPDFObjectHandle::parseContentStream(). The parseContentStream
method used to operate on a single content stream, but was fixed
to properly handle pages with contents split across multiple
streams in an earlier release. The new method parsePageContents()
can be called on the page object rather than the value of the
page dictionary's /Contents key. This removes a few lines of
boiler-plate code from any code that uses parseContentStream, and
it also enables creation of more helpful error messages if
problems are encountered as the error messages can include
information about which page the streams come from.
* Update content stream parsing example
(examples/pdf-parse-content.cc) to use new
QPDFObjectHandle::parsePageContents() method in favor of the older
QPDFObjectHandle::parseContentStream() method.
* Bug fix: change where the trailing newline is added to a stream
in QDF mode when content normalization is enabled (the default for
QDF mode). Before, the content normalizer ensured that the output
ended with a trailing newline, but this had the undesired side
effect of including the newline in the stream data for purposes of
length computation. QPDFWriter already appends a newline without
counting in length for better readability. Ordinarily this makes
no difference, but in the rare case of a page's contents being
split in the middle of a token, the old behavior could cause the
extra newline to be interpreted as part of the token. This bug
could only be triggered in qdf mode, which is a mode intended for
manual inspection of PDF files' contents, so it is very unlikely
to have caused any actual problems for people using qpdf for
production use. Even if it did, it would be very unusual for a PDF
file to actually be adversely affected by this issue.
* Add support for coalescing a page's contents into a single
stream if they are represented as an array of streams. This can be
performed from the command line using the --coalesce-contents
option. Coalescing content streams can simplify things for
software that wants to operate on a page's content streams without
having to handle weird edge cases like content streams split in
the middle of tokens. Note that
QPDFObjectHandle::parsePageContents and
QPDFObjectHandle::parseContentStream already handled split content
streams. This is mainly to set the stage for new methods of
operating on page contents. The new method
QPDFObjectHandle::pipeContentStreams will pipe all of a page's
content streams through a single pipeline. The new method
QPDFObjectHandle::coalesceContentStreams, when called on a page
object, will do nothing if the page's contents are a single
stream, but if they are an array of streams, it will replace the
page's contents with a single stream whose contents are the
concatenation of the original streams.
* A few library routines throw exceptions if called on non-page
objects. These constraints have been relaxed somewhat to make qpdf
more tolerant of files whose page dictionaries are not properly
marked as such. For the most part, exceptions about page operations
being called on non-page objects will be thrown only in cases where
the operation had no chance of succeeding anyway. This change has no
impact on any default mode operations, but it could allow
applications that use page-level APIs in QPDFObjectHandle to be
more tolerant of certain types of damaged files.
* Add QPDFObjectHandle::TokenFilter class and methods to use it to
perform lexical filtering on content streams. You can call
QPDFObjectHandle::addTokenFilter on a stream object, or you can call
the higher level QPDFObjectHandle::addContentTokenFilter on a page
object to cause the stream's contents to be passed through a token
filter while being retrieved by QPDFWriter or any other consumer.
For details on using TokenFilter, please see comments in
QPDFObjectHandle.hh.
* Enhance the string, type QPDFTokenizer::Token constructor to
initialize a raw value in addition to a value. Tokens have a
value, which is a canonical representation, and a raw value. For
all tokens except strings and names, the raw value and the value
are the same. For strings, the value excludes the outer delimiters
and has non-printing characters normalized. For names, the value
resolves non-printing characters. In order to better facilitate
token filters that mostly preserve contents and to enable
developers to be mostly unconcerned about the nuances of token
values and raw values, creating string and name tokens now
properly handles this subtlety of values and raw values. When
constructing string tokens, take care to avoid passing in the
outer delimiters. This has always been the case, but it is now
clarified in comments in QPDFObjectHandle.hh::TokenFilter. This
has no impact on any existing code unless there's some code
somewhere that was relying on Token::getRawValue() returning an
empty string for a manually constructed token. The token class's
operator== method still only looks at type and value, not raw
value. For example, string tokens for <41> and (A) would still be
equal because both are representations of the string "A".
* Add QPDFObjectHandle::isDataModified method. This method just
returns true if addTokenFilter has been called on the stream. It
enables a caller to determine whether it is safe to optimize away
piping of stream data in cases where the input and output are
expected to be the same. QPDFWriter uses this internally to skip
the optimization of not re-compressing already compressed streams
if addTokenFilter has been called. Most developers will not have
to worry about this as it is used internally in the library in the
places that need it. If you are manually retrieving stream data
with QPDFObjectHandle::getStreamData or
QPDFObjectHandle::pipeStreamData, you don't need to worry about
this at all.
* Provide heavily annotated examples/pdf-filter-tokens.cc example
that illustrates use of some simple token filters.
* When normalizing content streams, as in qdf mode, issue warning
about bad tokens. Content streams are only normalized when this is
explicitly requested, so this has no impact on normal operation.
However, in qdf mode, if qpdf detects a bad token, it means that
either there's a bug in qpdf's lexer, that the file is damaged, or
that the page's contents are split in a weird way. In any of those
cases, qpdf could potentially damage the stream's contents by
replacing carriage returns with newlines or otherwise messing with
spaces. The most likely case of this would be an inline image's
compressed data being divided across two streams and having the
compressed data in the second stream contain a carriage return as
part of its binary data. If you are using qdf mode just to look at
PDF files in text editors, this usually doesn't matter. In cases
of contents split across multiple streams, coalescing streams
would eliminate the problem, so the warning mentions this. Prior
to this enhancement, the chances of qdf mode writing incorrect
data were already very low. This change should make it nearly
impossible for qdf mode to unknowingly write invalid data.
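The comparison rule described in the token constructor item of this
entry (equality looks at type and value, not raw value) can be
illustrated with a minimal sketch. The Token struct and TokenType enum
below are hypothetical stand-ins written for this illustration; they
are not the QPDFTokenizer::Token class itself.

```cpp
#include <string>

// Hypothetical stand-in for a tokenizer token; not qpdf's actual class.
enum class TokenType { String, Name, Word };

struct Token {
    TokenType type;
    std::string value;      // canonical representation, delimiters removed
    std::string raw_value;  // bytes as they appeared in the input

    // Equality deliberately ignores raw_value, so two spellings of the
    // same string compare equal.
    bool operator==(const Token& rhs) const {
        return type == rhs.type && value == rhs.value;
    }
};
```

Under this model, a string token whose raw value is <41> and one whose
raw value is (A) compare equal because both carry the value "A".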
2018-02-04 Jay Berkenbilt <ejb@ql.org>
* Add QPDFWriter::setLinearizationPass1Filename method and
--linearize-pass1 command line option to allow specification of a
file into which QPDFWriter will write its intermediate
linearization pass 1 file. This is useful only for debugging qpdf.
qpdf creates linearized files by computing the output in two
passes. Ordinarily the first pass is discarded and not written
anywhere. This option allows it to be inspected.
2018-02-04 Jay Berkenbilt <ejb@ql.org>
* 7.1.1: release
* Bug fix: properly linearize files whose /ID has a length of
other than 16 bytes.
* Rename some test files to avoid files with three dots in their
names. Fixes #173.
* Fix various build and compilation issues on some platforms and
compilers. Fixes #176, #172, #177.
* Fix a few typos and clarify a few comments in header files.
2018-01-14 Jay Berkenbilt <ejb@ql.org>
* 7.1.0: release
* Allow raw encryption key to be specified in library and command
line with the QPDF::setPasswordIsHexKey method and
--password-is-hex-key option. Allow encryption key to be displayed
with --show-encryption-key option. Thanks to Didier Stevens
<didier.stevens@gmail.com> for the idea and contribution of one
implementation of this idea. See his blog post at
https://blog.didierstevens.com/2017/12/28/cracking-encrypted-pdfs-part-3/
for a discussion of using this for cracking encrypted PDFs. I hope
that a future release of qpdf will include some additional
recovery options that may also make use of this capability.
2018-01-13 Jay Berkenbilt <ejb@ql.org>
* Fix lexical error: the PDF specification allows floating point
numbers to end with ".". Fixes #165.
* Fix link order in the build to avoid conflicts when building
from source while an older version of qpdf is installed. Fixes #158.
* Add support for TIFF predictor for LZW and Flate streams. Now
all predictor functions are supported. Fixes #171.
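For readers unfamiliar with the TIFF predictor mentioned above
(Predictor 2 in a stream's decode parameters), the following is a
minimal sketch of reversing horizontal differencing for one row of
8-bit samples. The function name and the restriction to 8-bit samples
are assumptions made for this illustration; this is not the code qpdf
uses.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Undo TIFF predictor 2 (horizontal differencing) for one row of 8-bit
// samples. Each stored sample is the difference from the sample of the
// same color component in the pixel to its left; arithmetic wraps
// modulo 256.
std::vector<std::uint8_t> undo_tiff_predictor_row(
    std::vector<std::uint8_t> row, std::size_t samples_per_pixel)
{
    if (samples_per_pixel == 0) {
        return row;  // nothing sensible to do; leave the row untouched
    }
    for (std::size_t i = samples_per_pixel; i < row.size(); ++i) {
        row[i] = static_cast<std::uint8_t>(row[i] + row[i - samples_per_pixel]);
    }
    return row;
}
```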
2017-12-25 Jay Berkenbilt <ejb@ql.org>
* Clarify documentation around options that control parsing but
not output creation. Two options: --suppress-recovery and
--ignore-xref-streams, were documented in the "Advanced
Transformation Options" section of the manual and --help output
even though they are not related to output. These are now
described in a separate section called "Advanced Parsing Options."
* Implement remaining PNG filters for decode. Prior versions could
decode only the "up" filter. Now all PNG filters (sub, up,
average, Paeth, optimal) are supported for decoding. Thanks to
Tobias Hoffmann for providing a test PDF file that has images with
all PNG filters along with different numbers of bits per sample
and samples per pixel, and thanks to Casey Rojas for providing
implementations of the remaining PNG filters.
The implementation of the remaining PNG filters changed the
interface to the private Pl_PNGFilter class, but this class's
header file is not in the installation, and there is no public
interface to the class. Within the library, the class is never
allocated on the stack; it is only ever dynamically allocated. As
such, this does not actually break binary compatibility of the
library.
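As an illustration of one of the PNG filters named in this entry, here
is a minimal sketch of reversing the Paeth filter (PNG filter type 4)
for a single row. The function names and the byte-vector interface are
assumptions made for this example; this is not the Pl_PNGFilter
implementation.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Paeth predictor from the PNG specification.
// a = byte to the left, b = byte above, c = byte above and to the left.
std::uint8_t paeth_predictor(std::uint8_t a, std::uint8_t b, std::uint8_t c)
{
    int p = int(a) + int(b) - int(c);
    int pa = std::abs(p - int(a));
    int pb = std::abs(p - int(b));
    int pc = std::abs(p - int(c));
    if (pa <= pb && pa <= pc) {
        return a;
    }
    return (pb <= pc) ? b : c;
}

// Reverse the Paeth filter for one row. "prev" is the already decoded
// previous row (treated as all zeros where missing), and
// bytes_per_pixel is the distance to the corresponding byte on the left.
std::vector<std::uint8_t> undo_paeth_row(
    const std::vector<std::uint8_t>& filtered,
    const std::vector<std::uint8_t>& prev,
    std::size_t bytes_per_pixel)
{
    std::vector<std::uint8_t> out(filtered.size());
    for (std::size_t i = 0; i < filtered.size(); ++i) {
        bool has_left = bytes_per_pixel > 0 && i >= bytes_per_pixel;
        std::uint8_t a = has_left ? out[i - bytes_per_pixel] : 0;
        std::uint8_t b = (i < prev.size()) ? prev[i] : 0;
        std::uint8_t c = (has_left && i - bytes_per_pixel < prev.size())
            ? prev[i - bytes_per_pixel] : 0;
        out[i] = static_cast<std::uint8_t>(filtered[i] + paeth_predictor(a, b, c));
    }
    return out;
}
```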
2017-09-15 Jay Berkenbilt <ejb@ql.org>
* 7.0.0: release
2017-09-12 Jay Berkenbilt <ejb@ql.org>
* Relicense qpdf under version 2.0 of the Apache License rather
than version 2.0 of the Artistic License. Both are fine, but the
Apache License is in more widespread use, and I like it a little
better than Artistic-2.0. It is my intention that there be no
change in what you can or can't do with qpdf. Versions of qpdf
prior to version 7 were released under the terms of version 2.0 of
the Artistic License. At your option, you may continue to consider
qpdf to be licensed under those terms. Please see the manual for
additional information.
* Improve the error message that is issued when QPDFWriter
encounters a stream that can't be decoded. In particular, mention
that the stream will be copied without filtering to avoid data
loss.
* Add new methods to the C API to correspond to new additions to
QPDFWriter:
- qpdf_set_compress_streams
- qpdf_set_decode_level
- qpdf_set_preserve_unreferenced_objects
- qpdf_set_newline_before_endstream
2017-08-25 Jay Berkenbilt <ejb@ql.org>
* Re-implement parser iteratively to avoid stack overflow on very
deeply nested arrays and dictionaries. Fixes #146.
* Detect infinite loop while finding additional xref tables. Fixes
#149.
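The idea behind replacing recursion with iteration can be sketched as
follows: open containers are tracked on an explicit, heap-allocated
stack, so nesting depth is limited by available memory rather than by
the call stack. This is a simplified illustration that only recognizes
the [ ], << and >> delimiters and ignores everything else; the function
name is hypothetical, and this is not qpdf's parser.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Return the maximum nesting depth of arrays and dictionaries in the
// input, throwing std::runtime_error on mismatched delimiters.
std::size_t max_nesting_depth(const std::string& input)
{
    std::vector<char> open;  // 'a' for an open array, 'd' for a dictionary
    std::size_t deepest = 0;
    for (std::size_t i = 0; i < input.size(); ++i) {
        char ch = input[i];
        bool doubled = (i + 1 < input.size()) && input[i + 1] == ch;
        if (ch == '[') {
            open.push_back('a');
        } else if (ch == '<' && doubled) {
            open.push_back('d');
            ++i;  // consume the second '<'
        } else if (ch == ']' || (ch == '>' && doubled)) {
            char expected = (ch == ']') ? 'a' : 'd';
            if (open.empty() || open.back() != expected) {
                throw std::runtime_error("unbalanced delimiter");
            }
            open.pop_back();
            if (ch == '>') {
                ++i;  // consume the second '>'
            }
        }
        if (open.size() > deepest) {
            deepest = open.size();
        }
    }
    if (!open.empty()) {
        throw std::runtime_error("unterminated container");
    }
    return deepest;
}
```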
2017-08-22 Jay Berkenbilt <ejb@ql.org>
* 7.0.b1: release
* Convert all README files to markdown. Names changed as follows:
- README --> README.md
- README.hardening --> README-hardening.md
- README.maintainer --> README-maintainer.md
- README-what-to-download.txt --> README-what-to-download.md
- README-windows.txt --> README-windows.md
The file README-windows-install.txt remains a text file.
2017-08-21 Jay Berkenbilt <ejb@ql.org>
* Add support for writing PCLm files. Most of the work was done by
Sahil Arora <sahilarora.535@gmail.com> as part of a Google Summer
of Code project in 2017. PCLm support is useful only for clients
that specifically know how to create PCLm files. Support in qpdf
is just for ensuring that objects are written in the correct order
and for including some additional material in the output that is
required by the PCLm standard.
2017-08-19 Jay Berkenbilt <ejb@ql.org>
* Remove --precheck-streams. This is enabled by default now
without any efficiency cost. This feature was never released.
* Update pdf-create example to illustrate use of additional image
compression filters.
* Add support for /RunLengthDecode and /DCTDecode:
- New pipeline types Pl_RunLength and Pl_DCT
- New command-line flags --compress-streams and --decode-level
to replace/enhance --stream-data
- New QPDFWriter::setCompressStreams and
QPDFWriter::setDecodeLevel methods
Please see documentation, header files, and help messages for
details on these new features.
2017-08-12 Jay Berkenbilt <ejb@ql.org>
* Add QPDFObjectHandle::rotatePage to apply rotation to a page
object. Add --rotate option to qpdf to specify page rotation from
the command line.
* Provide --verbose option that causes qpdf to print an indication
of what files it is writing.
* Change --single-pages to --split-pages and make it take an
optional argument specifying the number of pages per file.
2017-08-11 Jay Berkenbilt <ejb@ql.org>
* Fix --newline-before-endstream to always add a newline before
endstream even if the last character was already a newline. This
is actually what's required by PDF/A. Fixes #133.
* Handle encrypted files whose encryption parameters are too
short. Fixes #96.
2017-08-10 Jay Berkenbilt <ejb@ql.org>
* Remove dependency on libpcre.
* Be more forgiving of certain types of errors in the xref table
that don't interfere with interpreting the table.
* Remove unused "tracing" parameter from PointerHolder's
(T*, bool) constructor. This change breaks source code
compatibility, but since this argument to PointerHolder has not
been used for a long time and the presence of a boolean parameter in
the primary constructor makes it too easy to use that by mistake
when trying to use PointerHolder for arrays, it seems like it's
finally time to take it out. If you have a compile error because
of this change, please check to see whether you intended to use
the (bool, T*) version of the constructor instead. If not, just
remove the second parameter.
2017-08-09 Jay Berkenbilt <ejb@ql.org>
* When recovering stream length, find endobj without endstream as
well as just looking for endstream. Be a little more lax about
where we allow it to be found.
2017-08-05 Jay Berkenbilt <ejb@ql.org>
* Add --single-pages option to cause output to be written to a
separate file for each page rather than one big file.
* Process --pages options earlier so that certain inspection
options, like --show-pages, can show the state after the merging
operations.
2017-08-02 Jay Berkenbilt <ejb@ql.org>
* Fix off-by-one error in parsing pages options. Fixes #129.
2017-07-29 Jay Berkenbilt <ejb@ql.org>
* Support @filename and @- in the qpdf command-line tool to read
command-line arguments, one per line, from the named file. @-
reads from standard input. Fixes #16.
* Detect when input file and output file are the same and exit to
avoid overwriting and losing input file. Fixes #29.
* When passing multiple inspection arguments, run --check first,
and defer exit until after all the checks have been run. This
makes it possible to force operations such as --show-xref to be
delayed until after recovery attempts have been made. For example,
if you have a file with a syntactically valid xref table that has
some offsets that are incorrect, running qpdf --check --show-xref
on that file will first recover the xref and then dump the
recovered xref, while just running qpdf --show-xref will show the
xref table as present in the file. Fixes #42.
* When recovering stream length, indicate the recovered length.
Fixes #44.
* Add --newline-before-endstream command-line option and
setNewlineBeforeEndstream method to QPDFWriter. This forces qpdf
to always add a newline before the endstream keyword. It is a
necessary but not sufficient condition for PDF/A compliance. Fixes
#103.
* Handle zlib data errors when decoding streams. Fixes #106.
* Improve handling of files where the "stream" keyword is not
followed by proper line terminators. Fixes #104.
* Fix content stream parsing to handle cases of structures within
the stream split across stream boundaries. Fixes #73.
2017-07-28 Jay Berkenbilt <ejb@ql.org>
* Add --preserve-unreferenced command-line option and
setPreserveUnreferencedObjects method to QPDFWriter. This option
causes QPDFWriter to write all objects from the input file to the
output file regardless of whether the objects are referenced.
Objects are written to the output file in numerical order from the
input file. This option has no effect for linearized files.
2017-07-27 Jay Berkenbilt <ejb@ql.org>
* Add --precheck-streams command-line option and setStreamPrecheck
method to QPDFWriter to tell QPDFWriter to attempt decoding a
stream fully before deciding whether to filter it or not.
* Recover gracefully from streams that aren't filterable because
the filter parameters are invalid in the stream dictionary or the
dictionary itself is invalid.
* Significantly improve recoverability from invalid qpdf objects.
Most conditions in basic object parsing that used to cause qpdf to
exit are now warnings. There are still many more opportunities for
improvements of this sort beyond just object parsing.
2017-07-26 Jay Berkenbilt <ejb@ql.org>
* Fixes to infinite loops below also fix problems reported in
other issues and cover CVE-2017-11624, CVE-2017-11625,
CVE-2017-11626, and CVE-2017-11627.
* Don't attempt to interpret syntactic keywords (like R and
endobj) found while parsing content streams.
* Detect infinite loops while resolving objects. This could happen
if something inside an object that had to be resolved during
parsing, such as a stream length, recursively referenced the
object being resolved.
* CVE-2017-9208: Handle references to and appearance of object 0
as a special case. Object 0 is not allowed, and qpdf was using it
internally to represent direct objects.
* CVE-2017-9209: Fix infinite loop caused by attempting to
reconstruct the xref table while already in the process of
reconstructing the xref table.
* CVE-2017-9210: Fix infinite loop caused by attempting to unparse
an object for inclusion in the text of an exception.
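One common way to detect the kind of resolution loop described in this
entry is to keep a set of objects currently being resolved and to fail
when an object is requested while it is already in that set. The sketch
below models objects as integer IDs that each depend on at most one
other object; the Resolver class and its interface are hypothetical and
are not taken from the qpdf sources.

```cpp
#include <map>
#include <set>
#include <stdexcept>
#include <utility>

// Hypothetical model: depends_on maps an object ID to the ID it must
// resolve first (for example, a stream whose length is indirect).
class Resolver
{
  public:
    explicit Resolver(std::map<int, int> depends_on) :
        depends_on_(std::move(depends_on))
    {
    }

    // Return the length of the dependency chain starting at id, or
    // throw if resolving id would require resolving id again.
    int resolve(int id)
    {
        if (!in_progress_.insert(id).second) {
            throw std::runtime_error("loop detected while resolving object");
        }
        Guard guard{in_progress_, id};  // removes id again on any exit path
        auto it = depends_on_.find(id);
        if (it == depends_on_.end()) {
            return 1;
        }
        return 1 + resolve(it->second);
    }

  private:
    struct Guard
    {
        std::set<int>& ids;
        int id;
        ~Guard() { ids.erase(id); }
    };

    std::map<int, int> depends_on_;
    std::set<int> in_progress_;
};
```

The guard object clears the in-progress marker even when an exception
propagates, so a failed resolution does not poison later calls.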
2015-11-10 Jay Berkenbilt <ejb@ql.org>
* 6.0.0: release
* No changes from 5.2.0. The 5.2.0 release broke binary
compatibility and was withdrawn.
2015-10-31 Jay Berkenbilt <ejb@ql.org>
* 5.2.0: release
* libqpdf/QPDF.cc (read_xrefTable): Be tolerant of some malformed
xref tables that don't have the required trailing space after each
line.
2015-10-29 Jay Berkenbilt <ejb@ql.org>
* Implement QPDFWriter::setDeterministicID and --deterministic-id
command-line flag to qpdf to request generation of a deterministic
/ID for non-encrypted files.
2015-05-24 Jay Berkenbilt <ejb@ql.org>
* 5.1.3: release
* Bug fix: fix-qdf was not handling object streams with more than
255 objects in them.
* Handle Microsoft crypt provider initialization properly for case
where no keys have been previously created, such as in a fresh
Windows installation.
* Include time.h in QUtil.hh for time_t
2015-02-21 Jay Berkenbilt <ejb@ql.org>
* Detect loops in Pages structure. Thanks to Gynvael Coldwind and
Mateusz Jurczyk of the Google Security Team for providing a sample
file with this problem.
* Prevent buffer overrun when converting a password to an
encryption key. Thanks to Gynvael Coldwind and Mateusz Jurczyk of
the Google Security Team for providing a sample file with this
problem.
* Ensure that arguments to "R" when parsing the file are direct
objects before trying to resolve them. This prevents specially
crafted files from causing qpdf to crash with a stack overflow.
Thanks to Gynvael Coldwind and Mateusz Jurczyk of the Google
Security Team for providing a sample file with this problem.
2014-12-01 Jay Berkenbilt <ejb@ql.org>
* Some broken PDF files lack the required /Type key for /Page and
/Pages nodes in the page dictionary. QPDF now uses other methods
to figure out what kind of node it is looking at so that it can
handle those files. Originally reported at
https://bugs.launchpad.net/ubuntu/+source/qpdf/+bug/1397413
2014-11-14 Jay Berkenbilt <ejb@ql.org>
* Bug fix: QPDFObjectHandle::getPageContents() no longer throws an
exception when called on a page that has no /Contents key in its
dictionary. This is allowed by the spec, and some software
packages generate files like this for pages that are blank in the
original.
2014-06-07 Jay Berkenbilt <ejb@ql.org>
* 5.1.2: release
* MS Visual C++ build: explicitly target Windows 5.0.1 (XP)
* New example program: pdf-split-pages: efficiently split PDF
files into individual pages.
* Bug fix: don't fail on files that contain streams where /Filter
or /DecodeParms references a stream. Before, qpdf would try to
convert these to direct objects, which would fail because of the
stream.
2014-02-22 Jay Berkenbilt <ejb@ql.org>
* Bug fix: if the last object in the first part of a linearized
file had an offset that was below 65536 by less than the size of
the hint stream, the xref stream was invalid and the resulting file
was not usable. This is now fixed.
2014-01-14 Jay Berkenbilt <ejb@ql.org>
* 5.1.1: release
2013-12-26 Jay Berkenbilt <ejb@ql.org>
* Bug fix: when copying foreign objects (which occurs during page
splitting among other cases), avoid traversing the same object
more than once if it appears more than once in the same direct
object. This bug is performance-only and does not affect the
actual output.
2013-12-17 Jay Berkenbilt <ejb@ql.org>
* 5.1.0: release
2013-12-16 Jay Berkenbilt <ejb@ql.org>
* Document and make explicit that passing null to
QUtil::setRandomDataProvider() resets the random data provider.
* Provide QUtil::getRandomDataProvider().
2013-12-14 Jay Berkenbilt <ejb@ql.org>
* Allow any space rather than just newline to follow xref header.
This allows qpdf to read a wider range of damaged files.
2013-11-30 Jay Berkenbilt <ejb@ql.org>
* Allow user-supplied random data provider to be used in place of
OS-provided or insecure random number generation. See
documentation for 5.1.0 for details.
* Add configure option --enable-os-secure-random (enabled by
default). Pass --disable-os-secure-random or define
SKIP_OS_SECURE_RANDOM to avoid attempts to use the operating
system-provided secure random number generation. This can be
especially useful on Windows if you wish to avoid any dependency
on Microsoft's cryptography system.
2013-11-29 Jay Berkenbilt <ejb@ql.org>
* If NO_GET_ENVIRONMENT is #defined, for Windows only,
QUtil::get_env will always return false. This was added to
support a user who needs to avoid calling GetEnvironmentVariable
from the Windows API. QUtil::get_env is not used for any
functionality in qpdf and exists only to support the test suite
including test coverage support with QTC (part of qtest).
* Add /FS to msvc builds to allow parallel builds to work with
Visual C++ 2013.
* Add missing #include <algorithm> in some files that use std::min
and std::max.
2013-11-21 Jay Berkenbilt <ejb@ql.org>
* Change image comparison tests, which are disabled by default, to
use tiff files with 8 bits per sample rather than 4. This works
around a bug in tiffcmp but also increases time and disk space for
image comparison tests.
2013-10-28 Jay Berkenbilt <ejb@ql.org>
* Fix MacOS compilation errors by adding a missing #include
<string> in a header file.
2013-10-18 Jay Berkenbilt <ejb@ql.org>
* 5.0.1: release
* Warn when -accessibility=n is specified with a modern encryption
format (R > 3). Also, accept this flag (and ignore with warning)
with 256-bit encryption. qpdf has always ignored the
accessibility setting with R > 3, but it previously did so
silently.
2013-10-05 Jay Berkenbilt <ejb@ql.org>
* Replace operator[] in std::string and std::vector with "at" in
order to get bounds checking. This reduces the chances that
incorrect code will result in data exposure or buffer overruns.
See README.hardening for additional notes.
* Use cryptographically secure random number generation when
available. See additional notes in README.
* Replace some assert() calls with std::logic_error exceptions.
Ideally there shouldn't be assert() calls outside of testing.
This change may make a few more potential code errors in handling
invalid data recoverable.
* Security fix: In places where std::vector<T>(size_t) was used,
either validate that the size parameter is sane or refactor code
to avoid the need to pre-allocate the vector. This reduces the
likelihood of allocating a lot of memory in response to invalid
data in linearization hint streams.
* Security fix: sanitize /W array in cross reference stream to
avoid a potential integer overflow in a multiplication. It is
unlikely that any exploits were possible from this bug as
additional checks were also performed.
* Security fix: avoid buffer overrun that could be caused by bogus
data in linearization hint streams. The incorrect code could only
be triggered when checking linearization data, which must be
invoked explicitly. qpdf does not check linearization data when
reading or writing linearized files, but the qpdf --check command
does check linearization data.
* Security fix: properly handle empty strings in
QPDF_Name::normalizeName. The empty string is not a valid name
and would never be parsed as a name, so there were no known
conditions where this method could be called with an empty string.
* Security fix: perform additional argument sanity checks when
reading bit streams.
* Security fix: in QUtil::toUTF8, change bounds checking to avoid
having a pointer point temporarily outside the bounds of an
array. Some compiler optimizations could have made the original
code unsafe.
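The bounds-checking change in the first item of this entry relies on
standard library behavior: at() throws std::out_of_range for an invalid
index, whereas operator[] has undefined behavior. A minimal sketch,
using a hypothetical helper name:

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Return true if reading data at the given index through at() is
// rejected with std::out_of_range instead of touching memory outside
// the vector.
bool read_is_rejected(const std::vector<int>& data, std::size_t index)
{
    try {
        (void)data.at(index);
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}
```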
2013-07-10 Jay Berkenbilt <ejb@ql.org>
* 5.0.0: release
* 4.2.0 turned out to be binary incompatible on some platforms
even though there were no changes to the public API. Therefore
the 4.2.0 release has been withdrawn, and is being replaced with a
5.0.0 release that acknowledges the ABI change and also removes
some problematic methods from the public API.
* Remove methods from public API that were only intended to be
used by QPDFWriter and really didn't make sense to call from
anywhere else as they required internal knowledge that only
QPDFWriter had:
- QPDF::getLinearizedParts
- QPDF::generateHintStream
- QPDF::getObjectStreamData
- QPDF::getCompressibleObjGens
- QPDF::getCompressibleObjects
2013-07-07 Jay Berkenbilt <ejb@ql.org>
* 4.2.0: release [withdrawn]
* Ignore error case of a stream's decode parameters having invalid
length when there are no stream filters.
* qpdf: add --show-npages command-line option, which causes the
number of pages in the input file to be printed on a line by
itself.
* qpdf: allow omission of range in --pages. If range is omitted
such that an argument that is supposed to be a range is an invalid
range and a valid file name, the range of 1-z is assumed. This
makes it possible to merge a bunch of files with something like
qpdf --empty out.pdf --pages *.pdf --
2013-06-15 Jay Berkenbilt <ejb@ql.org>
* Handle some additional broken files with missing /ID in trailer
for encrypted files and with space rather than newline after xref.
2013-06-14 Jay Berkenbilt <ejb@ql.org>
* Detect and correct /Outlines dictionary being a direct object
when linearizing files. This is not allowed by the spec but has
been seen in the wild. Prior to this change, such a file would
cause an internal error in the linearization code, which assumed
/Outlines was indirect.
* Add /Length key to crypt filter dictionary for encrypted files.
This key is optional, but some versions of MacOS reportedly fail to
open encrypted PDF files without this key.
* Bug fix: properly handle object stream generation when the
original file has some compressible objects with generation != 0.
* Add QPDF::getCompressibleObjGens() and deprecate
QPDF::getCompressibleObjects(), which had a flaw in its logic.
* Add new QPDFObjectHandle::getObjGen() method and indicate in
comments that its use is favored over getObjectID() and
getGeneration() for most cases.
* Add new QPDFObjGen object to represent an object ID/generation
pair.
2013-04-14 Jay Berkenbilt <ejb@ql.org>
* 4.1.0: release
2013-03-25 Jay Berkenbilt <ejb@ql.org>
* manual/qpdf-manual.xml: Document the casting policy that is
followed in qpdf's implementation.
2013-03-11 Jay Berkenbilt <ejb@ql.org>
* When creating Windows binary distributions, make sure to only
copy DLLs of the correct type. This ensures that the 32-bit
distributions contain 32-bit DLLs and the 64-bit distributions
contain 64-bit DLLs.
2013-03-07 Jay Berkenbilt <ejb@ql.org>
* Use ./install-sh (already present) instead of "install -c" to
install executables to fix portability problems against different
UNIX variants.
2013-03-03 Jay Berkenbilt <ejb@ql.org>
* Add protected terminateParsing method to
QPDFObjectHandle::ParserCallbacks that an implementor can call to
terminate parsing of a content stream.
2013-02-28 Jay Berkenbilt <ejb@ql.org>
* Favor fopen_s and strerror_s on MSVC to avoid CRT security
warnings. This is useful for people who may want to use qpdf in
an application that is Windows 8 certified.
* New method QUtil::safe_fopen to wrap calls to fopen. This is
less cumbersome than calling QUtil::fopen_wrapper.
* Remove all calls to sprintf
* New method QUtil::int_to_string_base to convert to octal or
hexadecimal (or decimal) strings without using sprintf
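Converting an integer to an octal, decimal, or hexadecimal string
without sprintf can be done by repeated division. The following is a
sketch under that assumption; the name to_string_base and its exact
behavior (lowercase digits, leading minus sign for negative values) are
choices made for this example and may differ from
QUtil::int_to_string_base.

```cpp
#include <algorithm>
#include <stdexcept>
#include <string>

// Convert num to a string in base 8, 10, or 16 without using sprintf.
std::string to_string_base(long long num, int base)
{
    if (base != 8 && base != 10 && base != 16) {
        throw std::logic_error("base must be 8, 10, or 16");
    }
    if (num == 0) {
        return "0";
    }
    static const char digits[] = "0123456789abcdef";
    bool negative = num < 0;
    // Work on the magnitude as an unsigned value so that the most
    // negative long long does not overflow when negated.
    unsigned long long magnitude = negative
        ? 0ULL - static_cast<unsigned long long>(num)
        : static_cast<unsigned long long>(num);
    std::string result;
    while (magnitude > 0) {
        result.push_back(digits[magnitude % static_cast<unsigned long long>(base)]);
        magnitude /= static_cast<unsigned long long>(base);
    }
    if (negative) {
        result.push_back('-');
    }
    std::reverse(result.begin(), result.end());
    return result;
}
```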
2013-02-26 Jay Berkenbilt <ejb@ql.org>
* Rewrite QUtil::int_to_string and QUtil::double_to_string to
remove internal length limits but to remain backward compatible
with the old versions for valid inputs.
2013-02-23 Jay Berkenbilt <ejb@ql.org>
* Bug fix: properly handle overridden compressed objects. When
caching objects from an object stream, only cache objects that,
based on the xref table, would actually be resolved into this
stream. Prior to this fix, if an object stream A contained an
object B that was overridden by an appended section of the file,
qpdf would cache the old value of B if any non-overridden member
of A was accessed before B. This commit fixes that bug.
2013-01-31 Jay Berkenbilt <ejb@ql.org>
* Do not remove libtool's .la file during the make install step.
Note to packagers: if your distribution wants you to remove the
.la file, you will have to do that yourself now.
2013-01-25 Jay Berkenbilt <ejb@ql.org>
* New method QUtil::hex_encode to encode binary data as a
hexadecimal string
* qpdf --check was exiting with status 0 in some rare cases even
when errors were found. It now always exits with one of the
document error codes (0 for success, 2 for errors, 3 for warnings).
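Encoding binary data as a hexadecimal string, as the hex_encode item in
this entry describes, amounts to emitting two hex digits per input
byte. The sketch below uses a hypothetical function name and lowercase
digits; it is an illustration of the technique rather than the
QUtil::hex_encode source.

```cpp
#include <string>

// Encode each byte of input as two lowercase hexadecimal digits.
std::string hex_encode_sketch(const std::string& input)
{
    static const char digits[] = "0123456789abcdef";
    std::string result;
    result.reserve(input.size() * 2);
    for (unsigned char ch : input) {
        result.push_back(digits[ch >> 4]);    // high nibble
        result.push_back(digits[ch & 0x0f]);  // low nibble
    }
    return result;
}
```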
2013-01-24 Jay Berkenbilt <ejb@ql.org>
* Make --enable-werror work for MSVC, and generally handle warning
options better for that compiler. Warning flags for that compiler
were previously hard-coded into the build with /WX enabled
unconditionally.
* Split warning flags into WFLAGS in autoconf.mk to make them
easier to override. Before they were repeated in CFLAGS and
CXXFLAGS and were commingled with other compiler flags.
* qpdf --check now does syntactic checks on all pages' content
streams as well as checking overall document structure. Semantic
errors are still not checked, and there are no plans to add
semantic checks.
2013-01-22 Jay Berkenbilt <ejb@ql.org>
* Add QPDFObjectHandle::getTypeCode(). This method returns a
unique integer (enumerated type) value corresponding to the object
type of the QPDFObjectHandle. It can be used as an alternative to
the QPDFObjectHandle::is* methods for type testing, particularly
where there is a desire to use a switch statement or optimize for
performance when testing object types.
* Add QPDFObjectHandle::getTypeName(). This method returns a
string literal describing the object type. It is useful for
testing and debugging.
2013-01-20 Jay Berkenbilt <ejb@ql.org>
* Add QPDFObjectHandle::parseContentStream, which parses the
objects in a content stream and calls handlers in a callback
class. The example pdf-parse-content illustrates its use.
* Add QPDF_Operator and QPDF_InlineImage types along with
appropriate wrapper methods in QPDFObjectHandle. These new object
types are to facilitate content stream parsing.
2013-01-17 Jay Berkenbilt <ejb@ql.org>
* 4.0.1: release
* Add clarifying comment in QPDF.hh for methods that return the
user password to state that it is no longer possible with newer
encryption formats to recover the user password knowing the owner
password.
* Fix detection of binary attachments in the test suite. This
resolves false test failures on some platforms. No changes to the
actual QPDF code were made.
2012-12-31 Jay Berkenbilt <ejb@ql.org>
* 4.0.0: release
* Add new methods qpdf_get_pdf_extension_level,
qpdf_set_r5_encryption_parameters,
qpdf_set_r6_encryption_parameters,
qpdf_set_minimum_pdf_version_and_extension, and
qpdf_force_pdf_version_and_extension to support new functionality
from the C API.
2012-12-30 Jay Berkenbilt <ejb@ql.org>
* Fix long-standing bug that could theoretically have resulted in
possible misinterpretation of decode parameters in streams. As
far as I can tell, it is extremely unlikely that files with the
characteristics that would have triggered the bug actually exist
in cases that qpdf versions prior to 4.0.0 could have read.
Unencrypted files with encrypted attachments would have triggered
this bug, but qpdf versions prior to 4.0.0 already refused to open
such files.
* Fix long-standing bug in which a stream that used a crypt
filter and was otherwise not filterable by qpdf would be decrypted
properly but would retain the crypt filter indication in the
file. There are no known ways to create files like this, so it is
unlikely that anyone ever hit this bug.
2012-12-29 Jay Berkenbilt <ejb@ql.org>
* Add read/write support for both the deprecated Acrobat IX
encryption format and the Acrobat X/PDF 2.0 encryption format
using 256-bit AES keys. Using the Acrobat IX format (R=5) forces
the version of the file to 1.7 with extension level 3. Using the
PDF 2.0 format (R=6) forces it to 1.7 extension level 8.
* Add new method QPDF::getEncryptionKey to return the actual
encryption key used for encryption of data in the file. The key
is returned as a std::string.
* Non-compatible API change: change signature of
QPDF::compute_data_key to take the R and V values from the
encryption dictionary. There is no reason for any application
code to call this method since handling of encryption is done
automatically by the qpdf library. It is used internally by
QPDFWriter.
* Support reading and decryption of files whose main text is not
encrypted but whose attachments are. More generally, support the
case of files and streams encrypted differently with some
limitations, described in the documentation. This was not
previously supported due to lack of test files, but I created test
files using a trial version of Acrobat XI to fully implement this
case.
* Incorporate sha2 code from sphlib 3.0. See README for
licensing. Create private pipeline class for computing hashes
with sha256, sha384, and sha512.
* Allow specification of initialization vector when using AES
filtering. This is required to compute the hash used in /R=6 (PDF
2.0) encryption.
2012-12-28 Jay Berkenbilt <ejb@ql.org>
* Add random number generation functions to QUtil.
* Fix old bug that could cause an infinite loop if user password
recovery methods were called and a password contained the "("
character (which happens to be the first byte of padding used by
older PDF encryption formats). This bug was noticed while reading
code and would not happen under ordinary usage patterns even if
the password contained that character.
2012-12-27 Jay Berkenbilt <ejb@ql.org>
* Add awareness of extension level to PDF Version methods for both
reading and writing. This includes adding method
QPDF::getExtensionLevel and new versions of
QPDFWriter::setMinimumPDFVersion and QPDFWriter::forcePDFVersion
that support extension levels. The qpdf command-line tool
interprets version numbers of the form x.y.z as version x.y at
extension level z.
* Update AES classes to support use of 256-bit keys.
* Non-compatible API change: Removed public method
QPDF::flattenScalarReferences. Instead, just flatten the scalar
references we actually need to flatten. Flattening scalar
references was a wrong decision years ago and has occasionally
caused other problems, among which were that it caused qpdf to
visit otherwise unreferenced and possibly erroneous objects in the
file when it didn't have to. There's no reason that any
non-internal code would have had to call this.
* Non-compatible API change: Removed public method
QPDF::decodeStreams which was previously used by qpdf --check but
is no longer used. The decodeStreams method could generate false
positives since it would attempt to access all objects in the file
including those that were not referenced. There's no reason that
any non-internal code would have had to call this.
* Non-compatible API change: Removed public method
QPDF::trimTrailerForWrite, which was only intended for use by
QPDFWriter and which is no longer used.
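The x.y.z version form mentioned in the extension-level entry above can be illustrated with a minimal sketch. The names VersionSpec and parse_version_spec are hypothetical and are not part of qpdf; the sketch assumes that a plain x.y string means extension level 0.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper, not part of the qpdf API: holds a PDF version
// ("x.y") together with an extension level.
struct VersionSpec
{
    std::string version;  // e.g. "1.7"
    int extension_level;  // e.g. 3; assumed to be 0 when no ".z" is given
};

// Split "x.y.z" into version "x.y" and extension level z.
// A string of the form "x.y" is returned unchanged with level 0.
VersionSpec parse_version_spec(std::string const& spec)
{
    std::string::size_type first = spec.find('.');
    if (first == std::string::npos || first == 0)
    {
        throw std::invalid_argument("expected x.y or x.y.z: " + spec);
    }
    std::string::size_type second = spec.find('.', first + 1);
    VersionSpec result;
    if (second == std::string::npos)
    {
        result.version = spec;
        result.extension_level = 0;
    }
    else
    {
        result.version = spec.substr(0, second);
        // std::stoi throws if the extension level is empty or not numeric
        result.extension_level = std::stoi(spec.substr(second + 1));
    }
    return result;
}
```

Under these assumptions, "1.7.3" yields version "1.7" at extension level 3, which matches the R=5 case described in the 2012-12-29 entry.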
2012-12-26 Jay Berkenbilt <ejb@ql.org>
* Add new fields to QPDF::EncryptionData to support newer
encryption formats (V=5, R=5 and R=6)
* Non-compatible API change: Change public nested class
QPDF::EncryptionData to make all member fields private and to add
method calls. This is a non-compatible API change, but changing
EncryptionData is necessary to support newer encryption formats,
and making this change will prevent the need to make a
non-compatible change in the future if new fields are added. A
public nested class should never have had public members to begin
with.
2012-12-25 Jay Berkenbilt <ejb@ql.org>
* Allow PDF header to appear anywhere in the first 1024 bytes of
the file as recommended in the implementation notes of the Adobe
version of the PDF spec.
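A minimal sketch of the header search described above, not qpdf's actual implementation. The function name find_pdf_header is hypothetical, and the sketch assumes that the whole "%PDF-" marker must lie within the first 1024 bytes.

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper, not part of qpdf: returns the offset of the
// "%PDF-" marker if it lies entirely within the first 1024 bytes of
// the data, or std::string::npos if it does not.
std::string::size_type find_pdf_header(std::string const& data)
{
    static std::string const marker = "%PDF-";
    std::string::size_type const window = 1024;
    std::string::size_type limit = std::min(data.size(), window);
    // Search only the leading window so that a marker appearing later
    // in the file is not mistaken for the header.
    return data.substr(0, limit).find(marker);
}
```

A caller could treat the returned offset as the start of the file for the purpose of reading the version, and report a missing header when the result is npos.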
2012-11-20 Jay Berkenbilt <ejb@ql.org>
* Add zlib and libpcre to Requires.private in the pkg-config file
to support static linking. Thanks Tobias Hoffmann for pointing
out the omission.
* Ignore (with warning) non-freed objects in the xref table whose
offset is 0. Some PDF producers (incorrectly) do this. See
https://bugs.linuxfoundation.org/show_bug.cgi?id=1081.
2012-09-23 Jay Berkenbilt <ejb@ql.org>
* Add public methods QPDF::processInputSource and
QPDFWriter::setOutputPipeline to allow users to read from custom
input sources and to write to custom pipelines. This allows the
maximum flexibility in sources for reading and writing PDF files.
2012-09-06 Jay Berkenbilt <ejb@ql.org>
* 3.0.2: release
* Add new method QPDFWriter::setExtraHeaderText to add extra text,
such as application-specific comments, to near the beginning of a
PDF file. For linearized files, this appears after the
linearization parameter dictionary. For non-linearized files, it
appears right after the PDF header and non-ASCII comment.
* Make it possible to write the same QPDF object with two
different QPDFWriter objects that have both called
setLinearization(true) by making private method
QPDF::calculateLinearizationData() properly initialize its state.
* Bug fix: Writing after calling QPDFWriter::setOutputMemory()
would cause a segmentation fault because of an internal field not
being initialized, rendering that method useless. This has been
corrected.
2012-08-11 Jay Berkenbilt <ejb@ql.org>
* 3.0.1: release
* Bug fix: let EOF terminate a literal token as well as
whitespace or comments.
2012-07-31 Jay Berkenbilt <ejb@ql.org>
* 3.0.0: release
2012-07-29 Jay Berkenbilt <ejb@ql.org>
* 3.0.rc1: release
2012-07-25 Jay Berkenbilt <ejb@ql.org>
* From Tobias: add QPDFObjectHandle::replaceStreamData that takes
a std::string analogous to the QPDFObjectHandle::newStream that
takes a string that was added earlier.
2012-07-21 Jay Berkenbilt <ejb@ql.org>
* Change configure to have image comparison tests disabled by
default. Update README and README.maintainer with information
about running them.
* Add --pages command-line option to qpdf to enable page-based
merging and splitting.
* Add new method QPDFObjectHandle::replaceDict to replace a
stream's dictionary. Use with caution; see comments in
QPDFObjectHandle.hh.
* Add new method QPDFObjectHandle::parse for creation of
QPDFObjectHandle objects from string representations of the
objects. Thanks to Tobias Hoffmann for the idea.
2012-07-15 Jay Berkenbilt <ejb@ql.org>
* add new QPDF::isEncrypted method that returns some additional
information beyond other versions.
* libqpdf/QPDFWriter.cc: fix copyEncryptionParameters to fix the
minimum PDF version based on other file's encryption needs. This
is a fix to code added on 2012-07-14 and did not impact previously
released code.
* libqpdf/QPDFWriter.cc (copyEncryptionParameters): Bug fix: qpdf
was not preserving whether or not AES encryption was being used
when copying encryption parameters. The file would still have
been properly encrypted, but a file that started off encrypted
with AES could have become encrypted with RC4.
2012-07-14 Jay Berkenbilt <ejb@ql.org>
* QPDFWriter: add public copyEncryptionParameters to allow copying
encryption parameters from another file.
* QPDFWriter: detect if the user has inserted an indirect object
from another QPDF object and throw an exception directing the user
to copyForeignObject.
2012-07-11 Jay Berkenbilt <ejb@ql.org>
* Added new APIs to copy objects from one QPDF to another. This
includes letting QPDF::addPage() (and QPDF::addPageAt()) accept a
page object from another QPDF and adding
QPDF::copyForeignObject(). See QPDF.hh for details.
* Add method QPDFObjectHandle::getOwningQPDF() to return the QPDF
object associated with an indirect QPDFObjectHandle.
* Add convenience methods to QPDFObjectHandle: assertIndirect(),
isPageObject(), isPagesObject()
* Cache when QPDF::pushInheritedAttributesToPage() has been called
to avoid traversing the pages trees multiple times. This state is
cleared by QPDF::updateAllPagesCache() and ignored by
QPDF::flattenPagesTree().
2012-07-08 Jay Berkenbilt <ejb@ql.org>
* Add QPDFObjectHandle::newReserved to create a reserved object
and QPDF::replaceReserved to replace it with a real object.
QPDFObjectHandle::newReserved reserves an object ID in a QPDF
object and ensures that any references to it remain unresolved.
When QPDF::replaceReserved is later called, previous references to
the reserved object will properly resolve to the replaced object.
2012-07-07 Jay Berkenbilt <ejb@ql.org>
* NOTE: BREAKING API CHANGE. Remove previously required length
parameter from the version of QPDFObjectHandle::replaceStreamData
that uses a stream data provider. Prior to qpdf 3.0.0, you had to
compute the stream length in advance so that qpdf could internally
verify that the stream data had the same length every time the
provider was invoked. Now this requirement is enforced a
different way, and the length parameter is no longer required.
Note that I take API-breaking changes very seriously and only did
it in this case since the lack of need to know length in advance
could significantly simplify people's code. If you were
previously going to a lot of trouble to compute the length of the
new stream data in advance, you now no longer have to do that.
You can just drop the length parameter and remove any code that
was previously computing the length. Thanks to Tobias Hoffmann
for pointing out how annoying the original interface was.
2012-07-05 Jay Berkenbilt <ejb@ql.org>
* Add QPDFWriter methods to write to an already open stdio FILE*.
Implementation and idea are based on contributions from Tobias
Hoffmann.
2012-07-04 Jay Berkenbilt <ejb@ql.org>
* Accept changes from Tobias Hoffmann: add public method
QPDF::pushInheritedAttributesToPage including warnings for
non-inherited keys that may be discarded from /Pages by
non-conformant PDF files when the /Pages tree is flattened.
2012-06-27 Jay Berkenbilt <ejb@ql.org>
* Add Pl_Concatenate pipeline for stream concatenation also
implemented by Tobias Hoffmann. Also added test code
(libtests/concatenate.cc).
* Add new methods implemented by Tobias Hoffmann:
QPDFObjectHandle::newReal(double) and
QPDFObjectHandle::newStream(QPDF*, std::string const&).
2012-06-26 Jay Berkenbilt <ejb@ql.org>
* Minor changes so that support for PDF files larger than 4GB
works well with 32-bit and 64-bit Linux and also with 32-bit and
64-bit Windows with both MSVC and mingw.
* Rework internal methods for doing recovery of the cross
reference tables for much greater efficiency both in terms of time
and memory usage.
2012-06-24 Jay Berkenbilt <ejb@ql.org>
* Support PDF files larger than 4 GB. This involved many changes
to the ABI to increase the size of integer types used in various
places as well as increasing the amount of padding used when
creating linearized files. Automated tests for large files are
disabled by default. Run ./configure --help for information on
enabling them. Running the tests requires 11 GB of free disk
space and takes several minutes.
2012-06-22 Jay Berkenbilt <ejb@ql.org>
* examples/pdf-create.cc: Provide an example of creating a PDF
from scratch. This simple PDF has a single page with some text
and an image.
* Add empty QPDFObjectHandle factories for array and dictionary.
With PDF-from-scratch capability, it is useful to be able to
create empty arrays and dictionaries and add keys to them.
Updated pdf_from_scratch.cc to use these interfaces.
2012-06-21 Jay Berkenbilt <ejb@ql.org>
* Add QPDF::emptyPDF() to create an empty QPDF object suitable for
adding pages and other objects to. pdf_from_scratch.cc is test
code that exercises it.
* make/libtool.mk: Place user-specified CPPFLAGS and LDFLAGS later
in the compilation so that if a user installs things in a
non-standard place that they have to tell the build about, earlier
versions of qpdf installed there won't break the build. Thanks to
Macports for reporting this. (Fixes bug 3468860.)
* Instead of using off_t in the public APIs, use qpdf_offset_t.
This is defined as long long in qpdf/Types.h. If your
system doesn't support long long, you can redefine it.
* Add pkg-config files
* QPDFObjectHandle: add shallowCopy() method
* QPDF: add new APIs for adding and removing pages. This includes
addPage(), addPageAt(), and removePage(). Also a method
updateAllPagesCache() is now available to force update of the
internal pages cache if you should modify the pages structure
manually.
* QPDF: new processFile method that takes an open FILE*
instead of a filename.
2012-06-20 Jay Berkenbilt <ejb@ql.org>
* Add new array mutation routines to QPDFObjectHandle.
Implemented by Tobias Hoffmann.
* Rework APIs that use size_t, off_t, and primitive integer types
so that size_t is used for sizes of memory and off_t is used for
file offsets. Also set _FILE_OFFSET_BITS so that large files can
be supported on 32-bit UNIX/Linux platforms. The code assumes in
places that sizeof(off_t) >= sizeof(size_t). This resulted in
non-compatible ABI changes and hopefully clears the way for QPDF
to work with files that are larger than 4 GiB in size.
* Add support for versioned symbols on ELF platforms.
* Various fixes for gcc 4.7
2011-04-06 Jay Berkenbilt <ejb@ql.org>
* Fix PCRE to stop using deprecated (and now dropped) interfaces.
2011-12-28 Jay Berkenbilt <ejb@ql.org>
* 2.3.1: release
* include <stdint.h> if available to support MSVC 2010
* Since PCRE is not necessarily thread safe, don't declare any
PCRE objects to be static.
* Disregard stderr output from ghostscript when using it to
compare images in the test suite; see comments in qpdf.test for
details.
* Fixed a few documentation errors.
2011-08-11 Jay Berkenbilt <ejb@ql.org>
* 2.3.0: release
* include/qpdf/qpdf-c.h ("C"): add new methods
qpdf_init_write_memory, qpdf_get_buffer_length, and
qpdf_get_buffer to support writing to memory from the C API.
* include/qpdf/qpdf-c.h ("C"): add new methods qpdf_get_info_key
and qpdf_set_info_key for manipulating text fields of the /Info
dictionary.
2011-08-10 Jay Berkenbilt <ejb@ql.org>
* libqpdf/QPDFWriter.cc (copyEncryptionParameters): preserve
whether metadata is encrypted. This fixes part of bug 3173659:
the password becomes invalid if qpdf copies an encrypted file with
cleartext-metadata.
* include/qpdf/QPDFWriter.hh: add a new constructor that takes
only a QPDF reference and leaves specification of output for
later. Add methods setOutputFilename() to set the output to a
filename or stdout, and setOutputMemory() to indicate that output
should go to a memory buffer. Add method getBuffer() to retrieve
the buffer used if output was saved to a memory buffer.
* include/qpdf/QPDF.hh: add methods replaceObject() and
swapObjects() to allow replacement of an object and swapping of
two objects by object ID.
* include/qpdf/QPDFObjectHandle.hh: add new methods getDictAsMap()
and getArrayAsVector() for returning the elements of a dictionary
or an array as a map or vector.
2011-06-25 Jay Berkenbilt <ejb@ql.org>
* 2.2.4: release
2011-06-23 Jay Berkenbilt <ejb@ql.org>
* make/libtool.mk (install): Do not strip executables and shared
libraries during installation. Leave that up to the packager.
* configure.ac: disable -Werror by default.
2011-05-07 Jay Berkenbilt <ejb@ql.org>
* libqpdf/QPDF_linearization.cc (isLinearized): remove unused
offset variable, found by a gcc 4.6 warning.
2011-04-30 Jay Berkenbilt <ejb@ql.org>
* 2.2.3: release
* libqpdf/QPDF.cc (readObjectInternal): Accept the case of the
stream keyword being followed by carriage return by itself. While
this is not permitted by the specification, there are PDF files
that do this, and other readers can read them.
* libqpdf/Pl_QPDFTokenizer.cc (processChar): When an inline image
is detected, suspend normalization only up to the end of the
inline image rather than for the remainder of the content stream.
(Fixes qpdf-Bugs 3152169.)
2011-01-31 Jay Berkenbilt <ejb@ql.org>
* libqpdf/QPDF.cc (readObjectAtOffset): use -1 rather than 0 when
reading an object at a given offset to indicate that no object number is
expected. This allows xref recovery to proceed even if a file
uses the invalid object number 0 as a regular object.
* libqpdf/QPDF_linearization.cc (isLinearized): use -1 rather than
0 as a sentinel for not having found the first object in the
file. Since -1 can never match the regular expression, this
prevents an infinite loop when checking a file that starts with
(erroneous) 0 0 obj. (Fixes qpdf-Bugs-3159950.)
2010-10-04 Jay Berkenbilt <ejb@ql.org>
* 2.2.2: release
* include/qpdf/qpdf-c.h: Add qpdf_read_memory to C API to call
QPDF::processMemoryFile.
2010-10-01 Jay Berkenbilt <ejb@ql.org>
* 2.2.1: release
* include/qpdf/QPDF.hh: Add setOutputStreams method to allow
redirection of library-generated output/error to alternative
streams.
* include/qpdf/QPDF.hh: Add processMemoryFile method for
processing a PDF file from a memory buffer instead of a file.
2010-09-24 Jay Berkenbilt <ejb@ql.org>
* libqpdf/QPDF.cc: change private "file" member to be a
PointerHolder<InputSource> to prepare qpdf for being able to work
with PDF files loaded into memory in addition to working with
files on disk.
* include/qpdf/PointerHolder.hh: add operator* and operator->
methods so that PointerHolder objects can be used like pointers.
This is consistent with the smart pointer objects in the next
revision of C++.
2010-09-05 Jay Berkenbilt <ejb@ql.org>
* libqpdf/QPDF.cc (readObjectInternal): Recognize empty objects
and treat them as null.
* libqpdf/QPDF_Stream.cc (filterable): Handle inline image filter
abbreviations as stream filter abbreviations. Although this is
not technically allowed by the PDF specification, table H.1 in the
pre-ISO spec indicates that Adobe's readers accept them. Thanks
to Jian Ma <stronghorse@tom.com> for bringing this to my
attention.
2010-08-14 Jay Berkenbilt <ejb@ql.org>
* 2.2.0: release
* Rename README.windows to README-windows.txt and convert its line
endings to Windows-style line endings. Also mention Jian Ma's VC6
port in the manual and README-windows.txt.
2010-08-09 Jay Berkenbilt <ejb@ql.org>
* Add QPDFObjectHandle::getRawStreamData to return raw
(unfiltered) stream data.
2010-08-08 Jay Berkenbilt <ejb@ql.org>
* 2.2.rc1: release
2010-08-05 Jay Berkenbilt <ejb@ql.org>
* Add QPDFObjectHandle::addPageContents, a convenience routine for
appending or prepending new streams to a page's content streams.
The "pdf-double-page-size" example illustrates its use.
* Add new methods to QPDFObjectHandle: replaceStreamData and
newStream. These methods allow users of the qpdf library to add
new streams and to replace data of existing streams. The
"pdf-double-page-size" and "pdf-invert-images" examples illustrate
their use.
2010-06-06 Jay Berkenbilt <ejb@ql.org>
* Fix memory leak for QPDF objects whose underlying PDF objects
contain circular references. Thanks to Jian Ma
<stronghorse@tom.com> for calling my attention to the memory leak.
2010-04-25 Jay Berkenbilt <ejb@ql.org>
* 2.1.5: release
* libqpdf/QPDF_encryption.cc (compute_encryption_key): remove
restrictions on length of file identifier string. (Fixes
qpdf-Bugs-2991412.)
2010-04-18 Jay Berkenbilt <ejb@ql.org>
* 2.1.4: release
* libqpdf/QPDFWriter.cc (writeLinearized): the padding calculation
fix in 2.1.2 was applied in only one place but it was needed in
two places since there are actually two cross reference streams in
a linearized file. The new padding calculation is now used for
both streams. Hopefully this should put an end to linearization
padding problems. (Fixes qpdf-Bugs-2979219.)
2010-04-10 Jay Berkenbilt <ejb@ql.org>
* qpdf/qpdf.cc (main): Since qpdf --check only checks syntax and
stream encoding without doing any semantic checks, make the output
clearer when no errors are found. This is inspired by
qpdf-Bugs-2983225.
2010-03-27 Jay Berkenbilt <ejb@ql.org>
* 2.1.3: release
* libqpdf/QPDF_optimization.cc (flattenScalarReferences): Flatten
scalar references for unreferenced objects as well as those seen
during traversal of the file. This matters when preserving object
streams that contain unreferenced objects with indirect scalars.
(Fixes qpdf-Bugs-2974522.) Updated TODO with a description of a
possibly better fix involving removal of flattenScalarReferences.
* libqpdf/Pl_AES_PDF.cc (finish): Don't complain if an AES input
buffer is not a multiple of 16 bytes. Instead, just pad with
nulls and hope for the best. PDF files have been encountered "in
the wild" that contain AES buffers that aren't a multiple of 16
bytes.
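The null-padding behavior described in the Pl_AES_PDF entry above can be sketched as follows. This is an illustration only, not the code in Pl_AES_PDF.cc; the function name pad_to_aes_block is hypothetical.

```cpp
#include <string>

// Hypothetical helper, not part of qpdf: extend a buffer with null
// bytes so that its length is a multiple of the 16-byte AES block
// size. A buffer that is already aligned is returned unchanged.
std::string pad_to_aes_block(std::string const& buf)
{
    std::string::size_type const block_size = 16;
    std::string padded = buf;
    std::string::size_type remainder = padded.size() % block_size;
    if (remainder != 0)
    {
        padded.append(block_size - remainder, '\0');
    }
    return padded;
}
```

Padding this way lets decryption proceed on malformed input, at the cost of possibly producing trailing garbage in the decrypted data.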
2010-01-24 Jay Berkenbilt <ejb@ql.org>
* 2.1.2: release
* libqpdf/QPDFWriter.cc: fix logic error in padding calculation.
When writing linearized files with cross reference streams, the
padding calculation failed to take differences in sizes of
compressed data between pass 1 and pass 2 into consideration.
2009-12-14 Jay Berkenbilt <ejb@ql.org>
* 2.1.1: release
* qpdf/qtest/qpdf.test: improve test for acroread to make sure it
actually works and is not just present in the path.
2009-12-13 Jay Berkenbilt <ejb@ql.org>
* libqpdf/qpdf/Pl_AES_PDF.hh: include <stdint.h>, if available, so
we have valid definitions of uint32_t.
2009-10-30 Jay Berkenbilt <ejb@ql.org>
* 2.1: release
* libqpdf/QPDF.cc: be more forgiving of extraneous whitespace in
the xref table and while recovering from error conditions.
2009-10-26 Jay Berkenbilt <ejb@ql.org>
* Work around failure of PCRE test case; this test case exercises
an aspect of PCRE that qpdf does not use, and the test fails with
the version of PCRE on Red Hat Enterprise Linux 5, so we ignore
failure on this particular test case.
* Fix RPM .spec file to include "C" examples
2009-10-24 Jay Berkenbilt <ejb@ql.org>
* 2.1.rc1: release
* Provide interfaces for getting qpdf's own version number
2009-10-19 Jay Berkenbilt <ejb@ql.org>
* include/qpdf/QPDF.hh (QPDF): getWarnings now returns a list of
QPDFExc rather than a list of strings. This way, warnings may be
inspected in more detail.
* Include information about the last object read in most error
messages. Most of the time, this will provide a good hint as to
which object contains the error, but it's possible that the last
object read may not necessarily be the one that has the error if
the erroneous object was previously read and cached.
2009-10-18 Jay Berkenbilt <ejb@ql.org>
* If forcing version, disable object stream creation and/or
encryption if previous specifications are incompatible with new
version. It is still possible that PDF content, compression
schemes, etc., may be incompatible with the new version, but
this way, older viewers will at least have a chance.
* libqpdf/QPDFWriter.cc (unparseObject): avoid compressing
Metadata streams if possible.
2009-10-13 Jay Berkenbilt <ejb@ql.org>
* Upgrade embedded qtest to version 1.4, which allows the test
suite to be run in Windows with MSYS and ActiveState Perl rather
than requiring Cygwin perl.
2009-10-04 Jay Berkenbilt <ejb@ql.org>
* Implement support for AES encryption and crypt filters. Implementation
is not fully tested due to lack of test data but has been tested
for several cases.
2009-10-04 Jay Berkenbilt <ejb@ql.org>
* Add methods to QPDFWriter and corresponding command line
arguments to qpdf to set the minimum output PDF version and also
to force the version to a particular value.
* libqpdf/QPDF.cc (processXRefStream): warn and ignore extra xref
stream entries when stream is larger than reported size. This
used to be a fatal error. (Fixes qpdf-Bugs-2872265.)
2009-09-27 Jay Berkenbilt <ejb@ql.org>
* Add several methods to query permissions controlled by the
encryption dictionary. Note that qpdf does not enforce these
permissions even though it allows the user to query them.
* The function QPDF::getUserPassword returned the user password
with the required padding as specified by the PDF specification.
This is seldom useful to users. This function has been replaced
by QPDF::getPaddedUserPassword. Call the new
QPDF::getTrimmedUserPassword to retrieve the user password in a
human-readable format.
* qpdf/qpdf.cc (main): qpdf --check now prints the PDF version
number in addition to its other output.
2009-09-26 Jay Berkenbilt <ejb@ql.org>
* Removed all references to QEXC; now using std::runtime_error and
std::logic_error and their subclasses for all exceptions.
2009-05-03 Jay Berkenbilt <ejb@ql.org>
* 2.0.6: release
* libqpdf/QPDF_Stream.cc (filterable): ignore /DecodeParms if it's
not a type we recognize. (Fixes qpdf-Bugs-2779746.)
2009-03-10 Jay Berkenbilt <ejb@ql.org>
* 2.0.5: release
2009-03-09 Jay Berkenbilt <ejb@ql.org>
* libqpdf/Pl_LZWDecoder.cc: adjust LZWDecoder full table
detection, now having been able to adequately test boundary
conditions both with and without early code change. Also
compared implementation with other LZW decoders.
2009-03-08 Jay Berkenbilt <ejb@ql.org>
* qpdf/fix-qdf (write_ostream): Adjust offsets while writing
object streams to account for changes in the length of the
dictionary and offset tables.
* qpdf/qpdf.cc (main): In check mode, in addition to checking
structure of file, attempt to decode all stream data.
* libqpdf/QPDFWriter.cc (QPDFWriter::writeObject): In QDF mode,
write a comment to the QDF file before each object that indicates
the object ID of the corresponding object from the original file.
Add --no-original-object-ids flag to qpdf and
setSuppressOriginalObjectIDs() method to QPDFWriter to turn this
behavior off.
* libqpdf/QPDF.cc (QPDF::pipeStreamData): Issue a warning instead
of failing if there is a problem found while decoding stream.
* qpdf/qpdf.cc: Exit with a status of 3 if warnings were found
regardless of what mode we're in.
2009-02-21 Jay Berkenbilt <ejb@ql.org>
* 2.0.4: release
2009-02-20 Jay Berkenbilt <ejb@ql.org>
* Fix many typos in comments and strings.
* qpdf/qpdf.cc: in --check mode, if there are warnings but no
errors, exit with a status of 3.
* libqpdf/QPDF.cc (QPDF::insertXrefEntry): when recovering the
cross-reference table, have objects we encounter later in the file
supersede those we found earlier. This improves the chances of
being able to recover appended files with damaged cross-reference
tables.
2009-02-19 Jay Berkenbilt <ejb@ql.org>
* libqpdf/Pl_LZWDecoder.cc: correct logic error for previously
untested case of running the LZW decoder without the "early code
change" flag. Thanks to a bug report from "Atom Smasher", I
finally was able to obtain an input stream compressed in this way.
2009-02-15 Jay Berkenbilt <ejb@ql.org>
* 2.0.3: release
2008-12-11 Jay Berkenbilt <ejb@ql.org>
* qpdf/qpdf.cc (main): Accept -help and -version as well as --help
and --version
2008-11-23 Jay Berkenbilt <ejb@ql.org>
* Include stdio.h in a few files for proper compilation with (yet
to be released) gcc 4.4
* updated embedded qtest to version 1.3
* libqpdf/QPDF_String.cc (QPDF_String::getUTF8Val): handle
UTF-16BE properly rather than just treating the string as a string
of 16-bit characters.
2008-06-30 Jay Berkenbilt <ejb@ql.org>
* 2.0.2: release
* updated embedded qtest to version 1.2 (includes previous
changes)
2008-06-07 Jay Berkenbilt <ejb@ql.org>
* qpdf/qtest/qpdf/diff-encrypted: change == to = so that the test
suite passes when /bin/sh is not bash
2008-05-07 Jay Berkenbilt <ejb@ql.org>
* qtest/bin/qtest-driver (run_test): increase timeout for qtest to
be more tolerant of slow machines
2008-05-06 Jay Berkenbilt <ejb@ql.org>
* 2.0.1: release
* make/rules.mk: fix logic with .dep generation for .lo files so
that dependencies work properly with libtool
2008-05-05 Jay Berkenbilt <ejb@ql.org>
* libqpdf/qpdf/MD5.hh: fix header to be 64-bit clean
* configure.ac: add tests for sized integer types
2008-05-04 Jay Berkenbilt <ejb@ql.org>
* libqpdf/QPDF_encryption.cc: do not assume size_t is unsigned int
* qpdf/qtest/qpdf.test: removed locale-specific tests. These were
really to check bugs in perl 5.8.0 and are obsolete now. They
also make the test suite fail in some environments that don't have
all the locales fully configured.
* various: updated several files for gcc 4.3 by adding missing
includes (string.h, stdlib.h)
2008-04-26 Jay Berkenbilt <ejb@ql.org>
* 2.0: initial public release