aboutsummaryrefslogtreecommitdiffstats
path: root/manual/cli.rst
blob: e8a07f5f5289c4eda8e4c2d6b34a21c668619d96 (plain) (blame)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
923
924
925
926
927
928
929
930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
955
956
957
958
959
960
961
962
963
964
965
966
967
968
969
970
971
972
973
974
975
976
977
978
979
980
981
982
983
984
985
986
987
988
989
990
991
992
993
994
995
996
997
998
999
1000
1001
1002
1003
1004
1005
1006
1007
1008
1009
1010
1011
1012
1013
1014
1015
1016
1017
1018
1019
1020
1021
1022
1023
1024
1025
1026
1027
1028
1029
1030
1031
1032
1033
1034
1035
1036
1037
1038
1039
1040
1041
1042
1043
1044
1045
1046
1047
1048
1049
1050
1051
1052
1053
1054
1055
1056
1057
1058
1059
1060
1061
1062
1063
1064
1065
1066
1067
1068
1069
1070
1071
1072
1073
1074
1075
1076
1077
1078
1079
1080
1081
1082
1083
1084
1085
1086
1087
1088
1089
1090
1091
1092
1093
1094
1095
1096
1097
1098
1099
1100
1101
1102
1103
1104
1105
1106
1107
1108
1109
1110
1111
1112
1113
1114
1115
1116
1117
1118
1119
1120
1121
1122
1123
1124
1125
1126
1127
1128
1129
1130
1131
1132
1133
1134
1135
1136
1137
1138
1139
1140
1141
1142
1143
1144
1145
1146
1147
1148
1149
1150
1151
1152
1153
1154
1155
1156
1157
1158
1159
1160
1161
1162
1163
1164
1165
1166
1167
1168
1169
1170
1171
1172
1173
1174
1175
1176
1177
1178
1179
1180
1181
1182
1183
1184
1185
1186
1187
1188
1189
1190
1191
1192
1193
1194
1195
1196
1197
1198
1199
1200
1201
1202
1203
1204
1205
1206
1207
1208
1209
1210
1211
1212
1213
1214
1215
1216
1217
1218
1219
1220
1221
1222
1223
1224
1225
1226
1227
1228
1229
1230
1231
1232
1233
1234
1235
1236
1237
1238
1239
1240
1241
1242
1243
1244
1245
1246
1247
1248
1249
1250
1251
1252
1253
1254
1255
1256
1257
1258
1259
1260
1261
1262
1263
1264
1265
1266
1267
1268
1269
1270
1271
1272
1273
1274
1275
1276
1277
1278
1279
1280
1281
1282
1283
1284
1285
1286
1287
1288
1289
1290
1291
1292
1293
1294
1295
1296
1297
1298
1299
1300
1301
1302
1303
1304
1305
1306
1307
1308
1309
1310
1311
1312
1313
1314
1315
1316
1317
1318
1319
1320
1321
1322
1323
1324
1325
1326
1327
1328
1329
1330
1331
1332
1333
1334
1335
1336
1337
1338
1339
1340
1341
1342
1343
1344
1345
1346
1347
1348
1349
1350
1351
1352
1353
1354
1355
1356
1357
1358
1359
1360
1361
1362
1363
1364
1365
1366
1367
1368
1369
1370
1371
1372
1373
1374
1375
1376
1377
1378
1379
1380
1381
1382
1383
1384
1385
1386
1387
1388
1389
1390
1391
1392
1393
1394
1395
1396
1397
1398
1399
1400
1401
1402
1403
1404
1405
1406
1407
1408
1409
1410
1411
1412
1413
1414
1415
1416
1417
1418
1419
1420
1421
1422
1423
1424
1425
1426
1427
1428
1429
1430
1431
1432
1433
1434
1435
1436
1437
1438
1439
1440
1441
1442
1443
1444
1445
1446
1447
1448
1449
1450
1451
1452
1453
1454
1455
1456
1457
1458
1459
1460
1461
1462
1463
1464
1465
1466
1467
1468
1469
1470
1471
1472
1473
1474
1475
1476
1477
1478
1479
1480
1481
1482
1483
1484
1485
1486
1487
1488
1489
1490
1491
1492
1493
1494
1495
1496
1497
1498
1499
1500
1501
1502
1503
1504
1505
1506
1507
1508
1509
1510
1511
1512
1513
1514
1515
1516
1517
1518
1519
1520
1521
1522
1523
1524
1525
1526
1527
1528
1529
1530
1531
1532
1533
1534
1535
1536
1537
1538
1539
1540
1541
1542
1543
1544
1545
1546
1547
1548
1549
1550
1551
1552
1553
1554
1555
1556
1557
1558
1559
1560
1561
1562
1563
1564
1565
1566
1567
1568
1569
1570
1571
1572
1573
1574
1575
1576
1577
1578
1579
1580
1581
1582
1583
1584
1585
1586
1587
1588
1589
1590
1591
1592
1593
1594
1595
1596
1597
1598
1599
1600
1601
1602
1603
1604
1605
1606
1607
1608
1609
1610
1611
1612
1613
1614
1615
1616
1617
1618
1619
1620
1621
1622
1623
1624
1625
1626
1627
1628
1629
1630
1631
1632
1633
1634
1635
1636
1637
1638
1639
1640
1641
1642
1643
1644
1645
1646
1647
1648
1649
1650
1651
1652
1653
1654
1655
1656
1657
1658
1659
1660
1661
1662
1663
1664
1665
1666
1667
1668
1669
1670
1671
1672
1673
1674
1675
.. _ref.using:

Running QPDF
============

This chapter describes how to run the qpdf program from the command
line.

.. _ref.invocation:

Basic Invocation
----------------

When running qpdf, the basic invocation is as follows:

::

   qpdf [ options ] { infilename | --empty } outfilename

This converts PDF file :samp:`infilename` to PDF file
:samp:`outfilename`. The output file is functionally
identical to the input file but may have been structurally reorganized.
Also, orphaned objects will be removed from the file. Many
transformations are available as controlled by the options below. In
place of :samp:`infilename`, the parameter
:samp:`--empty` may be specified. This causes qpdf to
use a dummy input file that contains zero pages. The only normal use
case for using :samp:`--empty` would be if you were
going to add pages from another source, as discussed in :ref:`ref.page-selection`.

If :samp:`@filename` appears as a word anywhere in the
command-line, it will be read line by line, and each line will be
treated as a command-line argument. Leading and trailing whitespace is
intentionally not removed from lines, which makes it possible to handle
arguments that start or end with spaces. The :samp:`@-`
option allows arguments to be read from standard input. This allows qpdf
to be invoked with an arbitrary number of arbitrarily long arguments. It
is also very useful for avoiding having to pass passwords on the command
line. Note that the :samp:`@filename` can't appear in
the middle of an argument, so constructs such as
:samp:`--arg=@option` will not work. You would have to
include the argument and its options together in the arguments file.

:samp:`outfilename` does not have to be seekable, even
when generating linearized files. Specifying ":samp:`-`"
as :samp:`outfilename` means to write to standard
output. If you want to overwrite the input file with the output, use the
option :samp:`--replace-input` and omit the output file
name. You can't specify the same file as both the input and the output.
If you do this, qpdf will tell you about the
:samp:`--replace-input` option.

Most options require an output file, but some testing or inspection
commands do not. These are specifically noted.

.. _ref.exit-status:

Exit Status
~~~~~~~~~~~

The exit status of :command:`qpdf` may be interpreted as
follows:

- ``0``: no errors or warnings were found. The file may still have
  problems qpdf can't detect. If
  :samp:`--warning-exit-0` was specified, exit status 0
  is used even if there are warnings.

- ``2``: errors were found. qpdf was not able to fully process the
  file.

- ``3``: qpdf encountered problems that it was able to recover from. In
  some cases, the resulting file may still be damaged. Note that qpdf
  still exits with status ``3`` if it finds warnings even when
  :samp:`--no-warn` is specified. With
  :samp:`--warning-exit-0`, warnings without errors
  exit with status 0 instead of 3.

Note that :command:`qpdf` never exists with status ``1``.
If you get an exit status of ``1``, it was something else, like the
shell not being able to find or execute :command:`qpdf`.

.. _ref.shell-completion:

Shell Completion
----------------

Starting in qpdf version 8.3.0, qpdf provides its own completion support
for zsh and bash. You can enable bash completion with :command:`eval
$(qpdf --completion-bash)` and zsh completion with
:command:`eval $(qpdf --completion-zsh)`. If
:command:`qpdf` is not in your path, you should invoke it
above with an absolute path. If you invoke it with a relative path, it
will warn you, and the completion won't work if you're in a different
directory.

qpdf will use ``argv[0]`` to figure out where its executable is. This
may produce unwanted results in some cases, especially if you are trying
to use completion with copy of qpdf that is built from source. You can
specify a full path to the qpdf you want to use for completion in the
``QPDF_EXECUTABLE`` environment variable.

.. _ref.basic-options:

Basic Options
-------------

The following options are the most common ones and perform commonly
needed transformations.

:samp:`--help`
   Display command-line invocation help.

:samp:`--version`
   Display the current version of qpdf.

:samp:`--copyright`
   Show detailed copyright information.

:samp:`--show-crypto`
   Show a list of available crypto providers, each on a line by itself.
   The default provider is always listed first. See :ref:`ref.crypto` for more information about crypto
   providers.

:samp:`--completion-bash`
   Output a completion command you can eval to enable shell completion
   from bash.

:samp:`--completion-zsh`
   Output a completion command you can eval to enable shell completion
   from zsh.

:samp:`--password={password}`
   Specifies a password for accessing encrypted files. To read the
   password from a file or standard input, you can use
   :samp:`--password-file`, added in qpdf 10.2. Note
   that you can also use :samp:`@filename` or
   :samp:`@-` as described above to put the password in
   a file or pass it via standard input, but you would do so by
   specifying the entire
   :samp:`--password={password}`
   option in the file. Syntax such as
   :samp:`--password=@filename` won't work since
   :samp:`@filename` is not recognized in the middle of
   an argument.

:samp:`--password-file={filename}`
   Reads the first line from the specified file and uses it as the
   password for accessing encrypted files.
   :samp:`{filename}`
   may be ``-`` to read the password from standard input. Note that, in
   this case, the password is echoed and there is no prompt, so use with
   caution.

:samp:`--is-encrypted`
   Silently exit with status 0 if the file is encrypted or status 2 if
   the file is not encrypted. This is useful for shell scripts. Other
   options are ignored if this is given. This option is mutually
   exclusive with :samp:`--requires-password`. Both this
   option and :samp:`--requires-password` exit with
   status 2 for non-encrypted files.

:samp:`--requires-password`
   Silently exit with status 0 if a password (other than as supplied) is
   required. Exit with status 2 if the file is not encrypted. Exit with
   status 3 if the file is encrypted but requires no password or the
   correct password has been supplied. This is useful for shell scripts.
   Note that any supplied password is used when opening the file. When
   used with a :samp:`--password` option, this option
   can be used to check the correctness of the password. In that case,
   an exit status of 3 means the file works with the supplied password.
   This option is mutually exclusive with
   :samp:`--is-encrypted`. Both this option and
   :samp:`--is-encrypted` exit with status 2 for
   non-encrypted files.

:samp:`--verbose`
   Increase verbosity of output. For now, this just prints some
   indication of any file that it creates.

:samp:`--progress`
   Indicate progress while writing files.

:samp:`--no-warn`
   Suppress writing of warnings to stderr. If warnings were detected and
   suppressed, :command:`qpdf` will still exit with exit
   code 3. See also :samp:`--warning-exit-0`.

:samp:`--warning-exit-0`
   If warnings are found but no errors, exit with exit code 0 instead 3.
   When combined with :samp:`--no-warn`, the effect is
   for :command:`qpdf` to completely ignore warnings.

:samp:`--linearize`
   Causes generation of a linearized (web-optimized) output file.

:samp:`--replace-input`
   If specified, the output file name should be omitted. This option
   tells qpdf to replace the input file with the output. It does this by
   writing to
   :file:`{infilename}.~qpdf-temp#`
   and, when done, overwriting the input file with the temporary file.
   If there were any warnings, the original input is saved as
   :file:`{infilename}.~qpdf-orig`.

:samp:`--copy-encryption=file`
   Encrypt the file using the same encryption parameters, including user
   and owner password, as the specified file. Use
   :samp:`--encryption-file-password` to specify a
   password if one is needed to open this file. Note that copying the
   encryption parameters from a file also copies the first half of
   ``/ID`` from the file since this is part of the encryption
   parameters.

:samp:`--encryption-file-password=password`
   If the file specified with :samp:`--copy-encryption`
   requires a password, specify the password using this option. Note
   that only one of the user or owner password is required. Both
   passwords will be preserved since QPDF does not distinguish between
   the two passwords. It is possible to preserve encryption parameters,
   including the owner password, from a file even if you don't know the
   file's owner password.

:samp:`--allow-weak-crypto`
   Starting with version 10.4, qpdf issues warnings when requested to
   create files using RC4 encryption. This option suppresses those
   warnings. In future versions of qpdf, qpdf will refuse to create
   files with weak cryptography when this flag is not given. See :ref:`ref.weak-crypto` for additional details.

:samp:`--encrypt options --`
   Causes generation an encrypted output file. Please see :ref:`ref.encryption-options` for details on how to specify
   encryption parameters.

:samp:`--decrypt`
   Removes any encryption on the file. A password must be supplied if
   the file is password protected.

:samp:`--password-is-hex-key`
   Overrides the usual computation/retrieval of the PDF file's
   encryption key from user/owner password with an explicit
   specification of the encryption key. When this option is specified,
   the argument to the :samp:`--password` option is
   interpreted as a hexadecimal-encoded key value. This only applies to
   the password used to open the main input file. It does not apply to
   other files opened by :samp:`--pages` or other
   options or to files being written.

   Most users will never have a need for this option, and no standard
   viewers support this mode of operation, but it can be useful for
   forensic or investigatory purposes. For example, if a PDF file is
   encrypted with an unknown password, a brute-force attack using the
   key directly is sometimes more efficient than one using the password.
   Also, if a file is heavily damaged, it may be possible to derive the
   encryption key and recover parts of the file using it directly. To
   expose the encryption key used by an encrypted file that you can open
   normally, use the :samp:`--show-encryption-key`
   option.

:samp:`--suppress-password-recovery`
   Ordinarily, qpdf attempts to automatically compensate for passwords
   specified in the wrong character encoding. This option suppresses
   that behavior. Under normal conditions, there are no reasons to use
   this option. See :ref:`ref.unicode-passwords` for a
   discussion

:samp:`--password-mode={mode}`
   This option can be used to fine-tune how qpdf interprets Unicode
   (non-ASCII) password strings passed on the command line. With the
   exception of the :samp:`hex-bytes` mode, these only
   apply to passwords provided when encrypting files. The
   :samp:`hex-bytes` mode also applies to passwords
   specified for reading files. For additional discussion of the
   supported password modes and when you might want to use them, see
   :ref:`ref.unicode-passwords`. The following modes
   are supported:

   - :samp:`auto`: Automatically determine whether the
     specified password is a properly encoded Unicode (UTF-8) string,
     and transcode it as required by the PDF spec based on the type
     encryption being applied. On Windows starting with version 8.4.0,
     and on almost all other modern platforms, incoming passwords will
     be properly encoded in UTF-8, so this is almost always what you
     want.

   - :samp:`unicode`: Tells qpdf that the incoming
     password is UTF-8, overriding whatever its automatic detection
     determines. The only difference between this mode and
     :samp:`auto` is that qpdf will fail with an error
     message if the password is not valid UTF-8 instead of falling back
     to :samp:`bytes` mode with a warning.

   - :samp:`bytes`: Interpret the password as a literal
     byte string. For non-Windows platforms, this is what versions of
     qpdf prior to 8.4.0 did. For Windows platforms, there is no way to
     specify strings of binary data on the command line directly, but
     you can use the :samp:`@filename` option to do it,
     in which case this option forces qpdf to respect the string of
     bytes as provided. This option will allow you to encrypt PDF files
     with passwords that will not be usable by other readers.

   - :samp:`hex-bytes`: Interpret the password as a
     hex-encoded string. This provides a way to pass binary data as a
     password on all platforms including Windows. As with
     :samp:`bytes`, this option may allow creation of
     files that can't be opened by other readers. This mode affects
     qpdf's interpretation of passwords specified for decrypting files
     as well as for encrypting them. It makes it possible to specify
     strings that are encoded in some manner other than the system's
     default encoding.

:samp:`--rotate=[+|-]angle[:page-range]`
   Apply rotation to specified pages. The
   :samp:`page-range` portion of the option value has
   the same format as page ranges in :ref:`ref.page-selection`. If the page range is omitted, the
   rotation is applied to all pages. The :samp:`angle`
   portion of the parameter may be either 0, 90, 180, or 270. If
   preceded by :samp:`+` or :samp:`-`,
   the angle is added to or subtracted from the specified pages'
   original rotations. This is almost always what you want. Otherwise
   the pages' rotations are set to the exact value, which may cause the
   appearances of the pages to be inconsistent, especially for scans.
   For example, the command :command:`qpdf in.pdf out.pdf
   --rotate=+90:2,4,6 --rotate=180:7-8` would rotate pages
   2, 4, and 6 90 degrees clockwise from their original rotation and
   force the rotation of pages 7 through 8 to 180 degrees regardless of
   their original rotation, and the command :command:`qpdf in.pdf
   out.pdf --rotate=+180` would rotate all pages by 180
   degrees.

:samp:`--keep-files-open={[yn]}`
   This option controls whether qpdf keeps individual files open while
   merging. Prior to version 8.1.0, qpdf always kept all files open, but
   this meant that the number of files that could be merged was limited
   by the operating system's open file limit. Version 8.1.0 opened files
   as they were referenced and closed them after each read, but this
   caused a major performance impact. Version 8.2.0 optimized the
   performance but did so in a way that, for local file systems, there
   was a small but unavoidable performance hit, but for networked file
   systems, the performance impact could be very high. Starting with
   version 8.2.1, the default behavior is that files are kept open if no
   more than 200 files are specified, but this default behavior can be
   explicitly overridden with the
   :samp:`--keep-files-open` flag. If you are merging
   more than 200 files but less than the operating system's max open
   files limit, you may want to use
   :samp:`--keep-files-open=y`, especially if working
   over a networked file system. If you are using a local file system
   where the overhead is low and you might sometimes merge more than the
   OS limit's number of files from a script and are not worried about a
   few seconds additional processing time, you may want to specify
   :samp:`--keep-files-open=n`. The threshold for
   switching may be changed from the default 200 with the
   :samp:`--keep-files-open-threshold` option.

:samp:`--keep-files-open-threshold={count}`
   If specified, overrides the default value of 200 used as the
   threshold for qpdf deciding whether or not to keep files open. See
   :samp:`--keep-files-open` for details.

:samp:`--pages options --`
   Select specific pages from one or more input files. See :ref:`ref.page-selection` for details on how to do
   page selection (splitting and merging).

:samp:`--collate={n}`
   When specified, collate rather than concatenate pages from files
   specified with :samp:`--pages`. With a numeric
   argument, collate in groups of :samp:`{n}`.
   The default is 1. See :ref:`ref.page-selection` for additional details.

:samp:`--flatten-rotation`
   For each page that is rotated using the ``/Rotate`` key in the page's
   dictionary, remove the ``/Rotate`` key and implement the identical
   rotation semantics by modifying the page's contents. This option can
   be useful to prepare files for buggy PDF applications that don't
   properly handle rotated pages.

:samp:`--split-pages=[n]`
   Write each group of :samp:`n` pages to a separate
   output file. If :samp:`n` is not specified, create
   single pages. Output file names are generated as follows:

   - If the string ``%d`` appears in the output file name, it is
     replaced with a range of zero-padded page numbers starting from 1.

   - Otherwise, if the output file name ends in
     :file:`.pdf` (case insensitive), a zero-padded
     page range, preceded by a dash, is inserted before the file
     extension.

   - Otherwise, the file name is appended with a zero-padded page range
     preceded by a dash.

   Page ranges are a single number in the case of single-page groups or
   two numbers separated by a dash otherwise. For example, if
   :file:`infile.pdf` has 12 pages

   - :command:`qpdf --split-pages infile.pdf %d-out`
     would generate files :file:`01-out` through
     :file:`12-out`

   - :command:`qpdf --split-pages=2 infile.pdf
     outfile.pdf` would generate files
     :file:`outfile-01-02.pdf` through
     :file:`outfile-11-12.pdf`

   - :command:`qpdf --split-pages infile.pdf
     something.else` would generate files
     :file:`something.else-01` through
     :file:`something.else-12`

   Note that outlines, threads, and other global features of the
   original PDF file are not preserved. For each page of output, this
   option creates an empty PDF and copies a single page from the output
   into it. If you require the global data, you will have to run
   :command:`qpdf` with the
   :samp:`--pages` option once for each file. Using
   :samp:`--split-pages` is much faster if you don't
   require the global data.

:samp:`--overlay options --`
   Overlay pages from another file onto the output pages. See :ref:`ref.overlay-underlay` for details on
   overlay/underlay.

:samp:`--underlay options --`
   Overlay pages from another file onto the output pages. See :ref:`ref.overlay-underlay` for details on
   overlay/underlay.

Password-protected files may be opened by specifying a password. By
default, qpdf will preserve any encryption data associated with a file.
If :samp:`--decrypt` is specified, qpdf will attempt to
remove any encryption information. If :samp:`--encrypt`
is specified, qpdf will replace the document's encryption parameters
with whatever is specified.

Note that qpdf does not obey encryption restrictions already imposed on
the file. Doing so would be meaningless since qpdf can be used to remove
encryption from the file entirely. This functionality is not intended to
be used for bypassing copyright restrictions or other restrictions
placed on files by their producers.

Prior to 8.4.0, in the case of passwords that contain characters that
fall outside of 7-bit US-ASCII, qpdf left the burden of supplying
properly encoded encryption and decryption passwords to the user.
Starting in qpdf 8.4.0, qpdf does this automatically in most cases. For
an in-depth discussion, please see :ref:`ref.unicode-passwords`. Previous versions of this manual
described workarounds using the :command:`iconv` command.
Such workarounds are no longer required or recommended with qpdf 8.4.0.
However, for backward compatibility, qpdf attempts to detect those
workarounds and do the right thing in most cases.

.. _ref.encryption-options:

Encryption Options
------------------

To change the encryption parameters of a file, use the --encrypt flag.
The syntax is

::

   --encrypt user-password owner-password key-length [ restrictions ] --

Note that ":samp:`--`" terminates parsing of encryption
flags and must be present even if no restrictions are present.

Either or both of the user password and the owner password may be empty
strings. Starting in qpdf 10.2, qpdf defaults to not allowing creation
of PDF files with a non-empty user password, an empty owner password,
and a 256-bit key since such files can be opened with no password. If
you want to create such files, specify the encryption option
:samp:`--allow-insecure`, as described below.

The value for
:samp:`{key-length}` may
be 40, 128, or 256. The restriction flags are dependent upon key length.
When no additional restrictions are given, the default is to be fully
permissive.

If :samp:`{key-length}`
is 40, the following restriction options are available:

:samp:`--print=[yn]`
   Determines whether or not to allow printing.

:samp:`--modify=[yn]`
   Determines whether or not to allow document modification.

:samp:`--extract=[yn]`
   Determines whether or not to allow text/image extraction.

:samp:`--annotate=[yn]`
   Determines whether or not to allow comments and form fill-in and
   signing.

If :samp:`{key-length}`
is 128, the following restriction options are available:

:samp:`--accessibility=[yn]`
   Determines whether or not to allow accessibility to visually
   impaired. The qpdf library disregards this field when AES is used or
   when 256-bit encryption is used. You should really never disable
   accessibility, but qpdf lets you do it in case you need to configure
   a file this way for testing purposes. The PDF spec says that
   conforming readers should disregard this permission and always allow
   accessibility.

:samp:`--extract=[yn]`
   Determines whether or not to allow text/graphic extraction.

:samp:`--assemble=[yn]`
   Determines whether document assembly (rotation and reordering of
   pages) is allowed.

:samp:`--annotate=[yn]`
   Determines whether modifying annotations is allowed. This includes
   adding comments and filling in form fields. Also allows editing of
   form fields if :samp:`--modify-other=y` is given.

:samp:`--form=[yn]`
   Determines whether filling form fields is allowed.

:samp:`--modify-other=[yn]`
   Allow all document editing except those controlled separately by the
   :samp:`--assemble`,
   :samp:`--annotate`, and
   :samp:`--form` options.

:samp:`--print={print-opt}`
   Controls printing access.
   :samp:`{print-opt}`
   may be one of the following:

   - :samp:`full`: allow full printing

   - :samp:`low`: allow low-resolution printing only

   - :samp:`none`: disallow printing

:samp:`--modify={modify-opt}`
   Controls modify access. This way of controlling modify access has
   less granularity than new options added in qpdf 8.4.
   :samp:`{modify-opt}`
   may be one of the following:

   - :samp:`all`: allow full document modification

   - :samp:`annotate`: allow comment authoring, form
     operations, and document assembly

   - :samp:`form`: allow form field fill-in and signing
     and document assembly

   - :samp:`assembly`: allow document assembly only

   - :samp:`none`: allow no modifications

   Using the :samp:`--modify` option does not allow you
   to create certain combinations of permissions such as allowing form
   filling but not allowing document assembly. Starting with qpdf 8.4,
   you can either just use the other options to control fields
   individually, or you can use something like :samp:`--modify=form
   --assembly=n` to fine tune.

:samp:`--cleartext-metadata`
   If specified, any metadata stream in the document will be left
   unencrypted even if the rest of the document is encrypted. This also
   forces the PDF version to be at least 1.5.

:samp:`--use-aes=[yn]`
   If :samp:`--use-aes=y` is specified, AES encryption
   will be used instead of RC4 encryption. This forces the PDF version
   to be at least 1.6.

:samp:`--allow-insecure`
   From qpdf 10.2, qpdf defaults to not allowing creation of PDF files
   where the user password is non-empty, the owner password is empty,
   and a 256-bit key is in use. Files created in this way are insecure
   since they can be opened without a password. Users would ordinarily
   never want to create such files. If you are using qpdf to
   intentionally created strange files for testing (a definite valid use
   of qpdf!), this option allows you to create such insecure files.

:samp:`--force-V4`
   Use of this option forces the ``/V`` and ``/R`` parameters in the
   document's encryption dictionary to be set to the value ``4``. As
   qpdf will automatically do this when required, there is no reason to
   ever use this option. It exists primarily for use in testing qpdf
   itself. This option also forces the PDF version to be at least 1.5.

If :samp:`{key-length}`
is 256, the minimum PDF version is 1.7 with extension level 8, and the
AES-based encryption format used is the PDF 2.0 encryption method
supported by Acrobat X. the same options are available as with 128 bits
with the following exceptions:

:samp:`--use-aes`
   This option is not available with 256-bit keys. AES is always used
   with 256-bit encryption keys.

:samp:`--force-V4`
   This option is not available with 256 keys.

:samp:`--force-R5`
   If specified, qpdf sets the minimum version to 1.7 at extension level
   3 and writes the deprecated encryption format used by Acrobat version
   IX. This option should not be used in practice to generate PDF files
   that will be in general use, but it can be useful to generate files
   if you are trying to test proper support in another application for
   PDF files encrypted in this way.

The default for each permission option is to be fully permissive.

.. _ref.page-selection:

Page Selection Options
----------------------

Starting with qpdf 3.0, it is possible to split and merge PDF files by
selecting pages from one or more input files. Whatever file is given as
the primary input file is used as the starting point, but its pages are
replaced with pages as specified.

::

   --pages input-file [ --password=password ] [ page-range ] [ ... ] --

Multiple input files may be specified. Each one is given as the name of
the input file, an optional password (if required to open the file), and
the range of pages. Note that ":samp:`--`" terminates
parsing of page selection flags.

Starting with qpf 8.4, the special input file name
":file:`.`" can be used as a shortcut for the
primary input filename.

For each file that pages should be taken from, specify the file, a
password needed to open the file (if any), and a page range. The
password needs to be given only once per file. If any of the input files
are the same as the primary input file or the file used to copy
encryption parameters (if specified), you do not need to repeat the
password here. The same file can be repeated multiple times. If a file
that is repeated has a password, the password only has to be given the
first time. All non-page data (info, outlines, page numbers, etc.) are
taken from the primary input file. To discard these, use
:samp:`--empty` as the primary input.

Starting with qpdf 5.0.0, it is possible to omit the page range. If qpdf
sees a value in the place where it expects a page range and that value
is not a valid range but is a valid file name, qpdf will implicitly use
the range ``1-z``, meaning that it will include all pages in the file.
This makes it possible to easily combine all pages in a set of files
with a command like :command:`qpdf --empty out.pdf --pages \*.pdf
--`.

The page range is a set of numbers separated by commas, ranges of
numbers separated dashes, or combinations of those. The character "z"
represents the last page. A number preceded by an "r" indicates to count
from the end, so ``r3-r1`` would be the last three pages of the
document. Pages can appear in any order. Ranges can appear with a high
number followed by a low number, which causes the pages to appear in
reverse. Numbers may be repeated in a page range. A page range may be
optionally appended with ``:even`` or ``:odd`` to indicate only the even
or odd pages in the given range. Note that even and odd refer to the
positions within the specified, range, not whether the original number
is even or odd.

Example page ranges:

- ``1,3,5-9,15-12``: pages 1, 3, 5, 6, 7, 8, 9, 15, 14, 13, and 12 in
  that order.

- ``z-1``: all pages in the document in reverse

- ``r3-r1``: the last three pages of the document

- ``r1-r3``: the last three pages of the document in reverse order

- ``1-20:even``: even pages from 2 to 20

- ``5,7-9,12:odd``: pages 5, 8, and, 12, which are the pages in odd
  positions from among the original range, which represents pages 5, 7,
  8, 9, and 12.

Starting in qpdf version 8.3, you can specify the
:samp:`--collate` option. Note that this option is
specified outside of :samp:`--pages ... --`. When
:samp:`--collate` is specified, it changes the meaning
of :samp:`--pages` so that the specified files, as
modified by page ranges, are collated rather than concatenated. For
example, if you add the files :file:`odd.pdf` and
:file:`even.pdf` containing odd and even pages of a
document respectively, you could run :command:`qpdf --collate odd.pdf
--pages odd.pdf even.pdf -- all.pdf` to collate the pages.
This would pick page 1 from odd, page 1 from even, page 2 from odd, page
2 from even, etc. until all pages have been included. Any number of
files and page ranges can be specified. If any file has fewer pages,
that file is just skipped when its pages have all been included. For
example, if you ran :command:`qpdf --collate --empty --pages a.pdf
1-5 b.pdf 6-4 c.pdf r1 -- out.pdf`, you would get the
following pages in this order:

- a.pdf page 1

- b.pdf page 6

- c.pdf last page

- a.pdf page 2

- b.pdf page 5

- a.pdf page 3

- b.pdf page 4

- a.pdf page 4

- a.pdf page 5

Starting in qpdf version 10.2, you may specify a numeric argument to
:samp:`--collate`. With
:samp:`--collate={n}`,
pull groups of :samp:`{n}` pages from each file,
again, stopping when there are no more pages. For example, if you ran
:command:`qpdf --collate=2 --empty --pages a.pdf 1-5 b.pdf 6-4 c.pdf
r1 -- out.pdf`, you would get the following pages in this
order:

- a.pdf page 1

- a.pdf page 2

- b.pdf page 6

- b.pdf page 5

- c.pdf last page

- a.pdf page 3

- a.pdf page 4

- b.pdf page 4

- a.pdf page 5

Starting in qpdf version 8.3, when you split and merge files, any page
labels (page numbers) are preserved in the final file. It is expected
that more document features will be preserved by splitting and merging.
In the mean time, semantics of splitting and merging vary across
features. For example, the document's outlines (bookmarks) point to
actual page objects, so if you select some pages and not others,
bookmarks that point to pages that are in the output file will work, and
remaining bookmarks will not work. A future version of
:command:`qpdf` may do a better job at handling these
issues. (Note that the qpdf library already contains all of the APIs
required in order to implement this in your own application if you need
it.) In the mean time, you can always use
:samp:`--empty` as the primary input file to avoid
copying all of that from the first file. For example, to take pages 1
through 5 from a :file:`infile.pdf` while preserving
all metadata associated with that file, you could use

::

   qpdf infile.pdf --pages . 1-5 -- outfile.pdf

If you wanted pages 1 through 5 from
:file:`infile.pdf` but you wanted the rest of the
metadata to be dropped, you could instead run

::

   qpdf --empty --pages infile.pdf 1-5 -- outfile.pdf

If you wanted to take pages 1 through 5 from
:file:`file1.pdf` and pages 11 through 15 from
:file:`file2.pdf` in reverse, taking document-level
metadata from :file:`file2.pdf`, you would run

::

   qpdf file2.pdf --pages file1.pdf 1-5 . 15-11 -- outfile.pdf

If, for some reason, you wanted to take the first page of an encrypted
file called :file:`encrypted.pdf` with password
``pass`` and repeat it twice in an output file, and if you wanted to
drop document-level metadata but preserve encryption, you would use

::

   qpdf --empty --copy-encryption=encrypted.pdf --encryption-file-password=pass
   --pages encrypted.pdf --password=pass 1 ./encrypted.pdf --password=pass 1 --
   outfile.pdf

Note that we had to specify the password all three times because giving
a password as :samp:`--encryption-file-password` doesn't
count for page selection, and as far as qpdf is concerned,
:file:`encrypted.pdf` and
:file:`./encrypted.pdf` are separated files. These
are all corner cases that most users should hopefully never have to be
bothered with.

Prior to version 8.4, it was not possible to specify the same page from
the same file directly more than once, and the workaround of specifying
the same file in more than one way was required. Version 8.4 removes
this limitation, but there is still a valid use case. When you specify
the same page from the same file more than once, qpdf will share objects
between the pages. If you are going to do further manipulation on the
file and need the two instances of the same original page to be deep
copies, then you can specify the file in two different ways. For example
:command:`qpdf in.pdf --pages . 1 ./in.pdf 1 -- out.pdf`
would create a file with two copies of the first page of the input, and
the two copies would share any objects in common. This includes fonts,
images, and anything else the page references.

.. _ref.overlay-underlay:

Overlay and Underlay Options
----------------------------

Starting with qpdf 8.4, it is possible to overlay or underlay pages from
other files onto the output generated by qpdf. Specify overlay or
underlay as follows:

::

   { --overlay | --underlay } file [ options ] --

Overlay and underlay options are processed late, so they can be combined
with other like merging and will apply to the final output. The
:samp:`--overlay` and :samp:`--underlay`
options work the same way, except underlay pages are drawn underneath
the page to which they are applied, possibly obscured by the original
page, and overlay files are drawn on top of the page to which they are
applied, possibly obscuring the page. You can combine overlay and
underlay.

The default behavior of overlay and underlay is that pages are taken
from the overlay/underlay file in sequence and applied to corresponding
pages in the output until there are no more output pages. If the overlay
or underlay file runs out of pages, remaining output pages are left
alone. This behavior can be modified by options, which are provided
between the :samp:`--overlay` or
:samp:`--underlay` flag and the
:samp:`--` option. The following options are supported:

- :samp:`--password=password`: supply a password if the
  overlay/underlay file is encrypted.

- :samp:`--to=page-range`: a range of pages in the same
  form at described in :ref:`ref.page-selection`
  indicates which pages in the output should have the overlay/underlay
  applied. If not specified, overlay/underlay are applied to all pages.

- :samp:`--from=[page-range]`: a range of pages that
  specifies which pages in the overlay/underlay file will be used for
  overlay or underlay. If not specified, all pages will be used. This
  can be explicitly specified to be empty if
  :samp:`--repeat` is used.

- :samp:`--repeat=page-range`: an optional range of
  pages that specifies which pages in the overlay/underlay file will be
  repeated after the "from" pages are used up. If you want to repeat a
  range of pages starting at the beginning, you can explicitly use
  :samp:`--from=`.

Here are some examples.

- :command:`--overlay o.pdf --to=1-5 --from=1-3 --repeat=4
  --`: overlay the first three pages from file
  :file:`o.pdf` onto the first three pages of the
  output, then overlay page 4 from :file:`o.pdf`
  onto pages 4 and 5 of the output. Leave remaining output pages
  untouched.

- :command:`--underlay footer.pdf --from= --repeat=1,2
  --`: Underlay page 1 of
  :file:`footer.pdf` on all odd output pages, and
  underlay page 2 of :file:`footer.pdf` on all even
  output pages.

.. _ref.attachments:

Embedded Files/Attachments Options
----------------------------------

Starting with qpdf 10.2, you can work with file attachments in PDF files
from the command line. The following options are available:

:samp:`--list-attachments`
   Show the "key" and stream number for embedded files. With
   :samp:`--verbose`, additional information, including
   preferred file name, description, dates, and more are also displayed.
   The key is usually but not always equal to the file name, and is
   needed by some of the other options.

:samp:`--show-attachment={key}`
   Write the contents of the specified attachment to standard output as
   binary data. The key should match one of the keys shown by
   :samp:`--list-attachments`. If specified multiple
   times, only the last attachment will be shown.

:samp:`--add-attachment {file} {options} --`
   Add or replace an attachment with the contents of
   :samp:`{file}`. This may be specified more
   than once. The following additional options may appear before the
   ``--`` that ends this option:

   :samp:`--key={key}`
      The key to use to register the attachment in the embedded files
      table. Defaults to the last path element of
      :samp:`{file}`.

   :samp:`--filename={name}`
      The file name to be used for the attachment. This is what is
      usually displayed to the user and is the name most graphical PDF
      viewers will use when saving a file. It defaults to the last path
      element of :samp:`{file}`.

   :samp:`--creationdate={date}`
      The attachment's creation date in PDF format; defaults to the
      current time. The date format is explained below.

   :samp:`--moddate={date}`
      The attachment's modification date in PDF format; defaults to the
      current time. The date format is explained below.

   :samp:`--mimetype={type/subtype}`
      The mime type for the attachment, e.g. ``text/plain`` or
      ``application/pdf``. Note that the mimetype appears in a field
      called ``/Subtype`` in the PDF but actually includes the full type
      and subtype of the mime type.

   :samp:`--description={"text"}`
      Descriptive text for the attachment, displayed by some PDF
      viewers.

   :samp:`--replace`
      Indicates that any existing attachment with the same key should be
      replaced by the new attachment. Otherwise,
      :command:`qpdf` gives an error if an attachment
      with that key is already present.

:samp:`--remove-attachment={key}`
   Remove the specified attachment. This doesn't only remove the
   attachment from the embedded files table but also clears out the file
   specification. That means that any potential internal links to the
   attachment will be broken. This option may be specified multiple
   times. Run with :samp:`--verbose` to see status of
   the removal.

:samp:`--copy-attachments-from {file} {options} --`
   Copy attachments from another file. This may be specified more than
   once. The following additional options may appear before the ``--``
   that ends this option:

   :samp:`--password={password}`
      If required, the password needed to open
      :samp:`{file}`

   :samp:`--prefix={prefix}`
      Only required if the file from which attachments are being copied
      has attachments with keys that conflict with attachments already
      in the file. In this case, the specified prefix will be prepended
      to each key. This affects only the key in the embedded files
      table, not the file name. The PDF specification doesn't preclude
      multiple attachments having the same file name.

When a date is required, the date should conform to the PDF date format
specification, which is
``D:``\ :samp:`{yyyymmddhhmmss<z>}`, where
:samp:`{<z>}` is either ``Z`` for UTC or a
timezone offset in the form :samp:`{-hh'mm'}` or
:samp:`{+hh'mm'}`. Examples:
``D:20210207161528-05'00'``, ``D:20210207211528Z``.

.. _ref.advanced-parsing:

Advanced Parsing Options
------------------------

These options control aspects of how qpdf reads PDF files. Mostly these
are of use to people who are working with damaged files. There is little
reason to use these options unless you are trying to solve specific
problems. The following options are available:

:samp:`--suppress-recovery`
   Prevents qpdf from attempting to recover damaged files.

:samp:`--ignore-xref-streams`
   Tells qpdf to ignore any cross-reference streams.

Ordinarily, qpdf will attempt to recover from certain types of errors in
PDF files. These include errors in the cross-reference table, certain
types of object numbering errors, and certain types of stream length
errors. Sometimes, qpdf may think it has recovered but may not have
actually recovered, so care should be taken when using this option as
some data loss is possible. The
:samp:`--suppress-recovery` option will prevent qpdf
from attempting recovery. In this case, it will fail on the first error
that it encounters.

Ordinarily, qpdf reads cross-reference streams when they are present in
a PDF file. If :samp:`--ignore-xref-streams` is
specified, qpdf will ignore any cross-reference streams for hybrid PDF
files. The purpose of hybrid files is to make some content available to
viewers that are not aware of cross-reference streams. It is almost
never desirable to ignore them. The only time when you might want to use
this feature is if you are testing creation of hybrid PDF files and wish
to see how a PDF consumer that doesn't understand object and
cross-reference streams would interpret such a file.

.. _ref.advanced-transformation:

Advanced Transformation Options
-------------------------------

These transformation options control fine points of how qpdf creates the
output file. Mostly these are of use only to people who are very
familiar with the PDF file format or who are PDF developers. The
following options are available:

:samp:`--compress-streams={[yn]}`
   By default, or with :samp:`--compress-streams=y`,
   qpdf will compress any stream with no other filters applied to it
   with the ``/FlateDecode`` filter when it writes it. To suppress this
   behavior and preserve uncompressed streams as uncompressed, use
   :samp:`--compress-streams=n`.

:samp:`--decode-level={option}`
   Controls which streams qpdf tries to decode. The default is
   :samp:`generalized`. The following options are
   available:

   - :samp:`none`: do not attempt to decode any streams

   - :samp:`generalized`: decode streams filtered with
     supported generalized filters: ``/LZWDecode``, ``/FlateDecode``,
     ``/ASCII85Decode``, and ``/ASCIIHexDecode``. We define generalized
     filters as those to be used for general-purpose compression or
     encoding, as opposed to filters specifically designed for image
     data. Note that, by default, streams already compressed with
     ``/FlateDecode`` are not uncompressed and recompressed unless you
     also specify :samp:`--recompress-flate`.

   - :samp:`specialized`: in addition to generalized,
     decode streams with supported non-lossy specialized filters;
     currently this is just ``/RunLengthDecode``

   - :samp:`all`: in addition to generalized and
     specialized, decode streams with supported lossy filters;
     currently this is just ``/DCTDecode`` (JPEG)

:samp:`--stream-data={option}`
   Controls transformation of stream data. This option predates the
   :samp:`--compress-streams` and
   :samp:`--decode-level` options. Those options can be
   used to achieve the same affect with more control. The value of
   :samp:`{option}` may
   be one of the following:

   - :samp:`compress`: recompress stream data when
     possible (default); equivalent to
     :samp:`--compress-streams=y`
     :samp:`--decode-level=generalized`. Does not
     recompress streams already compressed with ``/FlateDecode`` unless
     :samp:`--recompress-flate` is also specified.

   - :samp:`preserve`: leave all stream data as is;
     equivalent to :samp:`--compress-streams=n`
     :samp:`--decode-level=none`

   - :samp:`uncompress`: uncompress stream data
     compressed with generalized filters when possible; equivalent to
     :samp:`--compress-streams=n`
     :samp:`--decode-level=generalized`

:samp:`--recompress-flate`
   By default, streams already compressed with ``/FlateDecode`` are left
   alone rather than being uncompressed and recompressed. This option
   causes qpdf to uncompress and recompress the streams. There is a
   significant performance cost to using this option, but you probably
   want to use it if you specify
   :samp:`--compression-level`.

:samp:`--compression-level={level}`
   When writing new streams that are compressed with ``/FlateDecode``,
   use the specified compression level. The value of
   :samp:`level` should be a number from 1 to 9 and is
   passed directly to zlib, which implements deflate compression. Note
   that qpdf doesn't uncompress and recompress streams by default. To
   have this option apply to already compressed streams, you should also
   specify :samp:`--recompress-flate`. If your goal is
   to shrink the size of PDF files, you should also use
   :samp:`--object-streams=generate`.

:samp:`--normalize-content=[yn]`
   Enables or disables normalization of content streams. Content
   normalization is enabled by default in QDF mode. Please see :ref:`ref.qdf` for additional discussion of QDF mode.

:samp:`--object-streams={mode}`
   Controls handling of object streams. The value of
   :samp:`{mode}` may be
   one of the following:

   - :samp:`preserve`: preserve original object streams
     (default)

   - :samp:`disable`: don't write any object streams

   - :samp:`generate`: use object streams wherever
     possible

:samp:`--preserve-unreferenced`
   Tells qpdf to preserve objects that are not referenced when writing
   the file. Ordinarily any object that is not referenced in a traversal
   of the document from the trailer dictionary will be discarded. This
   may be useful in working with some damaged files or inspecting files
   with known unreferenced objects.

   This flag is ignored for linearized files and has the effect of
   causing objects in the new file to be written in order by object ID
   from the original file. This does not mean that object numbers will
   be the same since qpdf may create stream lengths as direct or
   indirect differently from the original file, and the original file
   may have gaps in its numbering.

   See also :samp:`--preserve-unreferenced-resources`,
   which does something completely different.

:samp:`--remove-unreferenced-resources={option}`
   The :samp:`{option}` may be ``auto``,
   ``yes``, or ``no``. The default is ``auto``.

   Starting with qpdf 8.1, when splitting pages, qpdf is able to attempt
   to remove images and fonts that are not used by a page even if they
   are referenced in the page's resources dictionary. When shared
   resources are in use, this behavior can greatly reduce the file sizes
   of split pages, but the analysis is very slow. In versions from 8.1
   through 9.1.1, qpdf did this analysis by default. Starting in qpdf
   10.0.0, if ``auto`` is used, qpdf does a quick analysis of the file
   to determine whether the file is likely to have unreferenced objects
   on pages, a pattern that frequently occurs when resource dictionaries
   are shared across multiple pages and rarely occurs otherwise. If it
   discovers this pattern, then it will attempt to remove unreferenced
   resources. Usually this means you get the slower splitting speed only
   when it's actually going to create smaller files. You can suppress
   removal of unreferenced resources altogether by specifying ``no`` or
   force it to do the full algorithm by specifying ``yes``.

   Other than cases in which you don't care about file size and care a
   lot about runtime, there are few reasons to use this option,
   especially now that ``auto`` mode is supported. One reason to use
   this is if you suspect that qpdf is removing resources it shouldn't
   be removing. If you encounter that case, please report it as bug at
   https://github.com/qpdf/qpdf/issues/.

:samp:`--preserve-unreferenced-resources`
   This is a synonym for
   :samp:`--remove-unreferenced-resources=no`.

   See also :samp:`--preserve-unreferenced`, which does
   something completely different.

:samp:`--newline-before-endstream`
   Tells qpdf to insert a newline before the ``endstream`` keyword, not
   counted in the length, after any stream content even if the last
   character of the stream was a newline. This may result in two
   newlines in some cases. This is a requirement of PDF/A. While qpdf
   doesn't specifically know how to generate PDF/A-compliant PDFs, this
   at least prevents it from removing compliance on already compliant
   files.

:samp:`--linearize-pass1={file}`
   Write the first pass of linearization to the named file. The
   resulting file is not a valid PDF file. This option is useful only
   for debugging ``QPDFWriter``'s linearization code. When qpdf
   linearizes files, it writes the file in two passes, using the first
   pass to calculate sizes and offsets that are required for hint tables
   and the linearization dictionary. Ordinarily, the first pass is
   discarded. This option enables it to be captured.

:samp:`--coalesce-contents`
   When a page's contents are split across multiple streams, this option
   causes qpdf to combine them into a single stream. Use of this option
   is never necessary for ordinary usage, but it can help when working
   with some files in some cases. For example, this can also be combined
   with QDF mode or content normalization to make it easier to look at
   all of a page's contents at once.

:samp:`--flatten-annotations={option}`
   This option collapses annotations into the pages' contents with
   special handling for form fields. Ordinarily, an annotation is
   rendered separately and on top of the page. Combining annotations
   into the page's contents effectively freezes the placement of the
   annotations, making them look right after various page
   transformations. The library functionality backing this option was
   added for the benefit of programs that want to create *n-up* page
   layouts and other similar things that don't work well with
   annotations. The :samp:`{option}` parameter
   may be any of the following:

   - :samp:`all`: include all annotations that are not
     marked invisible or hidden

   - :samp:`print`: only include annotations that
     indicate that they should appear when the page is printed

   - :samp:`screen`: omit annotations that indicate
     they should not appear on the screen

   Note that form fields are special because the annotations that are
   used to render filled-in form fields may become out of date from the
   fields' values if the form is filled in by a program that doesn't
   know how to update the appearances. If qpdf detects this case, its
   default behavior is not to flatten those annotations because doing so
   would cause the value of the form field to be lost. This gives you a
   chance to go back and resave the form with a program that knows how
   to generate appearances. QPDF itself can generate appearances with
   some limitations. See the
   :samp:`--generate-appearances` option below.

:samp:`--generate-appearances`
   If a file contains interactive form fields and indicates that the
   appearances are out of date with the values of the form, this flag
   will regenerate appearances, subject to a few limitations. Note that
   there is not usually a reason to do this, but it can be necessary
   before using the :samp:`--flatten-annotations`
   option. Most of these are not a problem with well-behaved PDF files.
   The limitations are as follows:

   - Radio button and checkbox appearances use the pre-set values in
     the PDF file. QPDF just makes sure that the correct appearance is
     displayed based on the value of the field. This is fine for PDF
     files that create their forms properly. Some PDF writers save
     appearances for fields when they change, which could cause some
     controls to have inconsistent appearances.

   - For text fields and list boxes, any characters that fall outside
     of US-ASCII or, if detected, "Windows ANSI" or "Mac Roman"
     encoding, will be replaced by the ``?`` character.

   - Quadding is ignored. Quadding is used to specify whether the
     contents of a field should be left, center, or right aligned with
     the field.

   - Rich text, multi-line, and other more elaborate formatting
     directives are ignored.

   - There is no support for multi-select fields or signature fields.

   If qpdf doesn't do a good enough job with your form, use an external
   application to save your filled-in form before processing it with
   qpdf.

:samp:`--optimize-images`
   This flag causes qpdf to recompress all images that are not
   compressed with DCT (JPEG) using DCT compression as long as doing so
   decreases the size in bytes of the image data and the image does not
   fall below minimum specified dimensions. Useful information is
   provided when used in combination with
   :samp:`--verbose`. See also the
   :samp:`--oi-min-width`,
   :samp:`--oi-min-height`, and
   :samp:`--oi-min-area` options. By default, starting
   in qpdf 8.4, inline images are converted to regular images and
   optimized as well. Use :samp:`--keep-inline-images`
   to prevent inline images from being included.

:samp:`--oi-min-width={width}`
   Avoid optimizing images whose width is below the specified amount. If
   omitted, the default is 128 pixels. Use 0 for no minimum.

:samp:`--oi-min-height={height}`
   Avoid optimizing images whose height is below the specified amount.
   If omitted, the default is 128 pixels. Use 0 for no minimum.

:samp:`--oi-min-area={area-in-pixels}`
   Avoid optimizing images whose pixel count (width × height) is below
   the specified amount. If omitted, the default is 16,384 pixels. Use 0
   for no minimum.

:samp:`--externalize-inline-images`
   Convert inline images to regular images. By default, images whose
   data is at least 1,024 bytes are converted when this option is
   selected. Use :samp:`--ii-min-bytes` to change the
   size threshold. This option is implicitly selected when
   :samp:`--optimize-images` is selected. Use
   :samp:`--keep-inline-images` to exclude inline images
   from image optimization.

:samp:`--ii-min-bytes={bytes}`
   Avoid converting inline images whose size is below the specified
   minimum size to regular images. If omitted, the default is 1,024
   bytes. Use 0 for no minimum.

:samp:`--keep-inline-images`
   Prevent inline images from being included in image optimization. This
   option has no affect when :samp:`--optimize-images`
   is not specified.

:samp:`--remove-page-labels`
   Remove page labels from the output file.

:samp:`--qdf`
   Turns on QDF mode. For additional information on QDF, please see :ref:`ref.qdf`. Note that :samp:`--linearize`
   disables QDF mode.

:samp:`--min-version={version}`
   Forces the PDF version of the output file to be at least
   :samp:`{version}`. In other words, if the
   input file has a lower version than the specified version, the
   specified version will be used. If the input file has a higher
   version, the input file's original version will be used. It is seldom
   necessary to use this option since qpdf will automatically increase
   the version as needed when adding features that require newer PDF
   readers.

   The version number may be expressed in the form
   :samp:`{major.minor.extension-level}`, in
   which case the version is interpreted as
   :samp:`{major.minor}` at extension level
   :samp:`{extension-level}`. For example,
   version ``1.7.8`` represents version 1.7 at extension level 8. Note
   that minimal syntax checking is done on the command line.

:samp:`--force-version={version}`
   This option forces the PDF version to be the exact version specified
   *even when the file may have content that is not supported in that
   version*. The version number is interpreted in the same way as with
   :samp:`--min-version` so that extension levels can be
   set. In some cases, forcing the output file's PDF version to be lower
   than that of the input file will cause qpdf to disable certain
   features of the document. Specifically, 256-bit keys are disabled if
   the version is less than 1.7 with extension level 8 (except R5 is
   disabled if less than 1.7 with extension level 3), AES encryption is
   disabled if the version is less than 1.6, cleartext metadata and
   object streams are disabled if less than 1.5, 128-bit encryption keys
   are disabled if less than 1.4, and all encryption is disabled if less
   than 1.3. Even with these precautions, qpdf won't be able to do
   things like eliminate use of newer image compression schemes,
   transparency groups, or other features that may have been added in
   more recent versions of PDF.

   As a general rule, with the exception of big structural things like
   the use of object streams or AES encryption, PDF viewers are supposed
   to ignore features in files that they don't support from newer
   versions. This means that forcing the version to a lower version may
   make it possible to open your PDF file with an older version, though
   bear in mind that some of the original document's functionality may
   be lost.

By default, when a stream is encoded using non-lossy filters that qpdf
understands and is not already compressed using a good compression
scheme, qpdf will uncompress and recompress streams. Assuming proper
filter implements, this is safe and generally results in smaller files.
This behavior may also be explicitly requested with
:samp:`--stream-data=compress`.

When :samp:`--normalize-content=y` is specified, qpdf
will attempt to normalize whitespace and newlines in page content
streams. This is generally safe but could, in some cases, cause damage
to the content streams. This option is intended for people who wish to
study PDF content streams or to debug PDF content. You should not use
this for "production" PDF files.

When normalizing content, if qpdf runs into any lexical errors, it will
print a warning indicating that content may be damaged. The only
situation in which qpdf is known to cause damage during content
normalization is when a page's contents are split across multiple
streams and streams are split in the middle of a lexical token such as a
string, name, or inline image. Note that files that do this are invalid
since the PDF specification states that content streams are not to be
split in the middle of a token. If you want to inspect the original
content streams in an uncompressed format, you can always run with
:samp:`--qdf --normalize-content=n` for a QDF file
without content normalization, or alternatively
:samp:`--stream-data=uncompress` for a regular non-QDF
mode file with uncompressed streams. These will both uncompress all the
streams but will not attempt to normalize content. Please note that if
you are using content normalization or QDF mode for the purpose of
manually inspecting files, you don't have to care about this.

Object streams, also known as compressed objects, were introduced into
the PDF specification at version 1.5, corresponding to Acrobat 6. Some
older PDF viewers may not support files with object streams. qpdf can be
used to transform files with object streams to files without object
streams or vice versa. As mentioned above, there are three object stream
modes: :samp:`preserve`,
:samp:`disable`, and :samp:`generate`.

In :samp:`preserve` mode, the relationship to objects
and the streams that contain them is preserved from the original file.
In :samp:`disable` mode, all objects are written as
regular, uncompressed objects. The resulting file should be readable by
older PDF viewers. (Of course, the content of the files may include
features not supported by older viewers, but at least the structure will
be supported.) In :samp:`generate` mode, qpdf will
create its own object streams. This will usually result in more compact
PDF files, though they may not be readable by older viewers. In this
mode, qpdf will also make sure the PDF version number in the header is
at least 1.5.

The :samp:`--qdf` flag turns on QDF mode, which changes
some of the defaults described above. Specifically, in QDF mode, by
default, stream data is uncompressed, content streams are normalized,
and encryption is removed. These defaults can still be overridden by
specifying the appropriate options as described above. Additionally, in
QDF mode, stream lengths are stored as indirect objects, objects are
laid out in a less efficient but more readable fashion, and the
documents are interspersed with comments that make it easier for the
user to find things and also make it possible for
:command:`fix-qdf` to work properly. QDF mode is intended
for people, mostly developers, who wish to inspect or modify PDF files
in a text editor. For details, please see :ref:`ref.qdf`.

.. _ref.testing-options:

Testing, Inspection, and Debugging Options
------------------------------------------

These options can be useful for digging into PDF files or for use in
automated test suites for software that uses the qpdf library. When any
of the options in this section are specified, no output file should be
given. The following options are available:

:samp:`--deterministic-id`
   Causes generation of a deterministic value for /ID. This prevents use
   of timestamp and output file name information in the /ID generation.
   Instead, at some slight additional runtime cost, the /ID field is
   generated to include a digest of the significant parts of the content
   of the output PDF file. This means that a given qpdf operation should
   generate the same /ID each time it is run, which can be useful when
   caching results or for generation of some test data. Use of this flag
   is not compatible with creation of encrypted files.

:samp:`--static-id`
   Causes generation of a fixed value for /ID. This is intended for
   testing only. Never use it for production files. If you are trying to
   get the same /ID each time for a given file and you are not
   generating encrypted files, consider using the
   :samp:`--deterministic-id` option.

:samp:`--static-aes-iv`
   Causes use of a static initialization vector for AES-CBC. This is
   intended for testing only so that output files can be reproducible.
   Never use it for production files. This option in particular is not
   secure since it significantly weakens the encryption.

:samp:`--no-original-object-ids`
   Suppresses inclusion of original object ID comments in QDF files.
   This can be useful when generating QDF files for test purposes,
   particularly when comparing them to determine whether two PDF files
   have identical content.

:samp:`--show-encryption`
   Shows document encryption parameters. Also shows the document's user
   password if the owner password is given.

:samp:`--show-encryption-key`
   When encryption information is being displayed, as when
   :samp:`--check` or
   :samp:`--show-encryption` is given, display the
   computed or retrieved encryption key as a hexadecimal string. This
   value is not ordinarily useful to users, but it can be used as the
   argument to :samp:`--password` if the
   :samp:`--password-is-hex-key` is specified. Note
   that, when PDF files are encrypted, passwords and other metadata are
   used only to compute an encryption key, and the encryption key is
   what is actually used for encryption. This enables retrieval of that
   key.

:samp:`--check-linearization`
   Checks file integrity and linearization status.

:samp:`--show-linearization`
   Checks and displays all data in the linearization hint tables.

:samp:`--show-xref`
   Shows the contents of the cross-reference table in a human-readable
   form. This is especially useful for files with cross-reference
   streams which are stored in a binary format.

:samp:`--show-object=trailer|obj[,gen]`
   Show the contents of the given object. This is especially useful for
   inspecting objects that are inside of object streams (also known as
   "compressed objects").

:samp:`--raw-stream-data`
   When used along with the :samp:`--show-object`
   option, if the object is a stream, shows the raw stream data instead
   of object's contents.

:samp:`--filtered-stream-data`
   When used along with the :samp:`--show-object`
   option, if the object is a stream, shows the filtered stream data
   instead of object's contents. If the stream is filtered using filters
   that qpdf does not support, an error will be issued.

:samp:`--show-npages`
   Prints the number of pages in the input file on a line by itself.
   Since the number of pages appears by itself on a line, this option
   can be useful for scripting if you need to know the number of pages
   in a file.

:samp:`--show-pages`
   Shows the object and generation number for each page dictionary
   object and for each content stream associated with the page. Having
   this information makes it more convenient to inspect objects from a
   particular page.

:samp:`--with-images`
   When used along with :samp:`--show-pages`, also shows
   the object and generation numbers for the image objects on each page.
   (At present, information about images in shared resource dictionaries
   are not output by this command. This is discussed in a comment in the
   source code.)

:samp:`--json`
   Generate a JSON representation of the file. This is described in
   depth in :ref:`ref.json`

:samp:`--json-help`
   Describe the format of the JSON output.

:samp:`--json-key=key`
   This option is repeatable. If specified, only top-level keys
   specified will be included in the JSON output. If not specified, all
   keys will be shown.

:samp:`--json-object=trailer|obj[,gen]`
   This option is repeatable. If specified, only specified objects will
   be shown in the "``objects``" key of the JSON output. If absent, all
   objects will be shown.

:samp:`--check`
   Checks file structure and well as encryption, linearization, and
   encoding of stream data. A file for which
   :samp:`--check` reports no errors may still have
   errors in stream data content but should otherwise be structurally
   sound. If :samp:`--check` any errors, qpdf will exit
   with a status of 2. There are some recoverable conditions that
   :samp:`--check` detects. These are issued as warnings
   instead of errors. If qpdf finds no errors but finds warnings, it
   will exit with a status of 3 (as of version 2.0.4). When
   :samp:`--check` is combined with other options,
   checks are always performed before any other options are processed.
   For erroneous files, :samp:`--check` will cause qpdf
   to attempt to recover, after which other options are effectively
   operating on the recovered file. Combining
   :samp:`--check` with other options in this way can be
   useful for manually recovering severely damaged files. Note that
   :samp:`--check` produces no output to standard output
   when everything is valid, so if you are using this to
   programmatically validate files in bulk, it is safe to run without
   output redirected to :file:`/dev/null` and just
   check for a 0 exit code.

The :samp:`--raw-stream-data` and
:samp:`--filtered-stream-data` options are ignored
unless :samp:`--show-object` is given. Either of these
options will cause the stream data to be written to standard output. In
order to avoid commingling of stream data with other output, it is
recommend that these objects not be combined with other test/inspection
options.

If :samp:`--filtered-stream-data` is given and
:samp:`--normalize-content=y` is also given, qpdf will
attempt to normalize the stream data as if it is a page content stream.
This attempt will be made even if it is not a page content stream, in
which case it will produce unusable results.

.. _ref.unicode-passwords:

Unicode Passwords
-----------------

At the library API level, all methods that perform encryption and
decryption interpret passwords as strings of bytes. It is up to the
caller to ensure that they are appropriately encoded. Starting with qpdf
version 8.4.0, qpdf will attempt to make this easier for you when
interact with qpdf via its command line interface. The PDF specification
requires passwords used to encrypt files with 40-bit or 128-bit
encryption to be encoded with PDF Doc encoding. This encoding is a
single-byte encoding that supports ISO-Latin-1 and a handful of other
commonly used characters. It has a large overlap with Windows ANSI but
is not exactly the same. There is generally not a way to provide PDF Doc
encoded strings on the command line. As such, qpdf versions prior to
8.4.0 would often create PDF files that couldn't be opened with other
software when given a password with non-ASCII characters to encrypt a
file with 40-bit or 128-bit encryption. Starting with qpdf 8.4.0, qpdf
recognizes the encoding of the parameter and transcodes it as needed.
The rest of this section provides the details about exactly how qpdf
behaves. Most users will not need to know this information, but it might
be useful if you have been working around qpdf's old behavior or if you
are using qpdf to generate encrypted files for testing other PDF
software.

A note about Windows: when qpdf builds, it attempts to determine what it
has to do to use ``wmain`` instead of ``main`` on Windows. The ``wmain``
function is an alternative entry point that receives all arguments as
UTF-16-encoded strings. When qpdf starts up this way, it converts all
the strings to UTF-8 encoding and then invokes the regular main. This
means that, as far as qpdf is concerned, it receives its command-line
arguments with UTF-8 encoding, just as it would in any modern Linux or
UNIX environment.

If a file is being encrypted with 40-bit or 128-bit encryption and the
supplied password is not a valid UTF-8 string, qpdf will fall back to
the behavior of interpreting the password as a string of bytes. If you
have old scripts that encrypt files by passing the output of
:command:`iconv` to qpdf, you no longer need to do that,
but if you do, qpdf should still work. The only exception would be for
the extremely unlikely case of a password that is encoded with a
single-byte encoding but also happens to be valid UTF-8. Such a password
would contain strings of even numbers of characters that alternate
between accented letters and symbols. In the extremely unlikely event
that you are intentionally using such passwords and qpdf is thwarting
you by interpreting them as UTF-8, you can use
:samp:`--password-mode=bytes` to suppress qpdf's
automatic behavior.

The :samp:`--password-mode` option, as described earlier
in this chapter, can be used to change qpdf's interpretation of supplied
passwords. There are very few reasons to use this option. One would be
the unlikely case described in the previous paragraph in which the
supplied password happens to be valid UTF-8 but isn't supposed to be
UTF-8. Your best bet would be just to provide the password as a valid
UTF-8 string, but you could also use
:samp:`--password-mode=bytes`. Another reason to use
:samp:`--password-mode=bytes` would be to intentionally
generate PDF files encrypted with passwords that are not properly
encoded. The qpdf test suite does this to generate invalid files for the
purpose of testing its password recovery capability. If you were trying
to create intentionally incorrect files for a similar purposes, the
:samp:`bytes` password mode can enable you to do this.

When qpdf attempts to decrypt a file with a password that contains
non-ASCII characters, it will generate a list of alternative passwords
by attempting to interpret the password as each of a handful of
different coding systems and then transcode them to the required format.
This helps to compensate for the supplied password being given in the
wrong coding system, such as would happen if you used the
:command:`iconv` workaround that was previously needed.
It also generates passwords by doing the reverse operation: translating
from correct in incorrect encoding of the password. This would enable
qpdf to decrypt files using passwords that were improperly encoded by
whatever software encrypted the files, including older versions of qpdf
invoked without properly encoded passwords. The combination of these two
recovery methods should make qpdf transparently open most encrypted
files with the password supplied correctly but in the wrong coding
system. There are no real downsides to this behavior, but if you don't
want qpdf to do this, you can use the
:samp:`--suppress-password-recovery` option. One reason
to do that is to ensure that you know the exact password that was used
to encrypt the file.

With these changes, qpdf now generates compliant passwords in most
cases. There are still some exceptions. In particular, the PDF
specification directs compliant writers to normalize Unicode passwords
and to perform certain transformations on passwords with bidirectional
text. Implementing this functionality requires using a real Unicode
library like ICU. If a client application that uses qpdf wants to do
this, the qpdf library will accept the resulting passwords, but qpdf
will not perform these transformations itself. It is possible that this
will be addressed in a future version of qpdf. The ``QPDFWriter``
methods that enable encryption on the output file accept passwords as
strings of bytes.

Please note that the :samp:`--password-is-hex-key`
option is unrelated to all this. This flag bypasses the normal process
of going from password to encryption string entirely, allowing the raw
encryption key to be specified directly. This is useful for forensic
purposes or for brute-force recovery of files with unknown passwords.