author Jay Berkenbilt <ejb@ql.org> 2022-02-01 13:18:23 +0100
committer Jay Berkenbilt <ejb@ql.org> 2022-02-01 15:04:55 +0100
commit cc5485dac1f224f856ce48781278b357f61f74bd
tree 097a1b61d7371da9e15d71b6662d16af8f251dd9
parent 5a7bb3474eb10ec9dea8409466a14f72ead73e60
QPDFJob: documentation
-rw-r--r-- README-maintainer 30
-rw-r--r-- cSpell.json 2
-rw-r--r-- examples/build.mk 4
-rw-r--r-- examples/qpdf-job.cc (renamed from examples/pdf-job.cc) 0
-rwxr-xr-x generate_auto_job 260
-rw-r--r-- include/qpdf/QPDFJob.hh 27
-rw-r--r-- job.sums 8
-rw-r--r-- job.yml 7
-rw-r--r-- libqpdf/QPDFJob.cc 4
-rw-r--r-- libqpdf/QPDFJob_config.cc 1
-rw-r--r-- libqpdf/QPDFJob_json.cc 22
-rw-r--r-- libqpdf/qpdf/auto_job_help.hh 3
-rw-r--r-- manual/cli.rst 15
-rw-r--r-- manual/index.rst 1
-rw-r--r-- manual/qpdf-job.rst 248
-rw-r--r-- manual/release-notes.rst 6
16 files changed, 589 insertions, 49 deletions
diff --git a/README-maintainer b/README-maintainer
index 7ea049dc..49dc643f 100644
--- a/README-maintainer
+++ b/README-maintainer
@@ -124,14 +124,32 @@ CODING RULES
HOW TO ADD A COMMAND-LINE ARGUMENT
+QPDFJob is documented in three places:
+
+* This section provides a quick reminder for how to add a command-line
+ argument
+
+* generate_auto_job has a detailed explanation about how QPDFJob and
+ generate_auto_job work together
+
+* The manual ("QPDFJob Design" in qpdf-job.rst) discusses the design
+ approach, rationale, and evolution of QPDFJob.
+
Command-line arguments are closely coupled with QPDFJob. To add a new
command-line argument, add the option to the appropriate table in
job.yml. This will automatically declare a method in the private
ArgParser class in QPDFJob_argv.cc which you have to implement. The
-implementation should make calls to methods in QPDFJob. Then, add the
-same option to either the no-json section of job.yml if it is to be
-excluded from the job json structure, or add it under the json
-structure to the place where it should appear in the json structure.
+implementation should make calls to methods in QPDFJob via its Config
+classes. Then, add the same option to either the no-json section of
+job.yml if it is to be excluded from the job json structure, or add it
+under the json structure to the place where it should appear in the
+json structure.
+
+In most cases, adding a new option will automatically declare and call
+the appropriate Config method, which you then have to implement. If
+you need a manual handler, you have to declare the option as manual in
+job.yml and implement the handler yourself, though the automatically
+generated code will declare it for you.
The build will fail until the new option is documented in
manual/cli.rst. To do that, create documentation for the option by
@@ -148,6 +166,10 @@ When done, the following should happen:
* qpdf --help=topic should list --new-option for the correct topic
* --new-option should appear in the manual
* --new-option should be in the command-line option index in the manual
+* A Config method (in Config or one of the other Config classes in
+ QPDFJob) should exist that corresponds to the command-line flag
+* The job JSON file should have a new key in the schema corresponding
+ to the new option
RELEASE PREPARATION
diff --git a/cSpell.json b/cSpell.json
index aacb3051..688c9f1d 100644
--- a/cSpell.json
+++ b/cSpell.json
@@ -100,6 +100,7 @@
"encodable",
"encp",
"endianness",
+ "endl",
"endobj",
"endstream",
"enspliel",
@@ -128,6 +129,7 @@
"fuzzer",
"fuzzers",
"fvisibility",
+ "iostream",
"gajic",
"gajić",
"gcurl",
diff --git a/examples/build.mk b/examples/build.mk
index 5472fba5..b4366c1a 100644
--- a/examples/build.mk
+++ b/examples/build.mk
@@ -8,13 +8,13 @@ BINS_examples = \
pdf-filter-tokens \
pdf-invert-images \
pdf-mod-info \
- pdf-job \
pdf-name-number-tree \
pdf-npages \
pdf-overlay-page \
pdf-parse-content \
pdf-set-form-values \
- pdf-split-pages
+ pdf-split-pages \
+ qpdf-job
CBINS_examples = \
pdf-c-objects \
pdf-linearize
diff --git a/examples/pdf-job.cc b/examples/qpdf-job.cc
index 41ee8603..41ee8603 100644
--- a/examples/pdf-job.cc
+++ b/examples/qpdf-job.cc
diff --git a/generate_auto_job b/generate_auto_job
index 5e1e7e8a..e56c0e60 100755
--- a/generate_auto_job
+++ b/generate_auto_job
@@ -9,6 +9,121 @@ import json
import filecmp
from contextlib import contextmanager
+# The purpose of this code is to automatically generate various parts
+# of the QPDFJob class. It is fairly complicated and extremely
+# bespoke, so understanding it is important if modifications are to be
+# made.
+
+# Documentation of QPDFJob is divided among three places:
+#
+# * "HOW TO ADD A COMMAND-LINE ARGUMENT" in README-maintainer provides
+# a quick reminder for how to add a command-line argument
+#
+# * This file has a detailed explanation about how QPDFJob and
+# generate_auto_job work together
+#
+# * The manual ("QPDFJob Design" in qpdf-job.rst) discusses the design
+# approach, rationale, and evolution of QPDFJob.
+#
+# QPDFJob solved the problem of moving extensive functionality that
+# lived in qpdf.cc into the library. The QPDFJob class consists of
+# four major sections:
+#
+# * The run() method and its subsidiaries are responsible for
+# performing the actual operations on PDF files. This is implemented
+# in QPDFJob.cc
+#
+# * The nested Config class and the other classes it creates provide
+# an API for setting up a QPDFJob instance and correspond to the
+# command-line arguments of the qpdf executable. This is implemented
+# in QPDFJob_config.cc
+#
+# * The argument parsing code reads an argv array and calls
+# configuration methods. This is implemented in QPDFJob_argv.cc. The
+# argument parsing logic itself is implemented in the QPDFArgParser
+# class.
+#
+# * The job JSON handling code reads a QPDFJob JSON file and
+# calls configuration methods. This is implemented in
+# QPDFJob_json.cc. The JSON parsing code is in the JSON class. A
+# sax-like JSON handler class that calls callbacks in response to
+# items in the JSON is implemented in the JSONHandler class.
+#
+# This code has the job of ensuring that configuration, command-line
+# arguments, and JSON are all consistent and complete so that a
+# developer or user can freely move among those different ways of
+# interacting with QPDFJob in a predictable fashion. In addition, help
+# information for each option appears in manual/cli.rst, and that
+# information is used in creation of the job JSON schema and to supply
+# help text to QPDFArgParser. This code also ensures that there is an
+# exact match between options in job.yml and options in cli.rst.
+#
+# The job.yml file contains the data that drives this code. To
+# understand job.yml, here are some important concepts.
+#
+# QPDFArgParser option table. There is support for positional
+# arguments, options consisting of flags and optional parameters, and
+# subparsers that start with a regular parameterless flag, have their
+# own positional and option sections, and are terminated with -- by
+# itself. Examples of this include --encrypt and --pages. An "option
+# table" contains an optional positional argument handler and a list
+# of valid options with specifications about their parameters. There
+# are three kinds of option tables:
+#
+# * The built-in "help" option table contains help commands, like
+# --help and --version, that are only valid when they appear as the
+# single command-line argument.
+#
+# * The "main" option table contains the options that are valid
+# starting at the beginning of argument parsing.
+#
+# * A named option table can be started manually by the argument
+# parsing code to switch the argument parser's context. Switching
+# the parser to a new option table is manual (via a call to
+# selectOptionTable). Context reverts to the main option table
+# automatically when -- is encountered.
+#
+# In QPDFJob.hh, there is a Config class for each option table except
+# help.
+#
+# Option type: bare, required/optional parameter, required/optional
+# choices. A bare argument is just a flag, like --qdf. A parameter
+# option takes an arbitrary parameter, like --password. A choices
+# option takes one of a fixed list of choices, like --object-streams.
+# If a parameter or choices option's parameter is optional, the empty
+# string may be specified as an option, such as --collate (or
+# --collate=). For a bare option, --option= is always the same as just
+# --option. This makes it possible to switch an option from bare to
+# optional choice to optional parameter all without breaking
+# compatibility.
+#
+# JSON "schema". This is a qpdf-specific "schema" for JSON. It is not
+# related to any kind of standard JSON schema. It is described in
+# JSON.hh and in the manual. QPDFJob uses the JSON "schema" in a mode
+# in which keys in the schema are all optional in the JSON object.
+#
+# Here is the mapping between configuration, argv, and JSON.
+#
+# The help options table is implemented solely for argv processing and
+# has no counterpart in configuration or JSON.
+#
+# The config() method returns a shared pointer to a Config object.
+# Every command-line option in the main option table has a
+# corresponding method in Config whose name is the option converted to
+# camel case. For bare options and options with optional parameters, a
+# version exists that takes no arguments. For others, a version exists
+# that takes a char const*. For example, the --qdf flag implies a
+# qdf() method in Config, and the --object-streams flag implies an
+# objectStreams(char const*) method in Config. For flags in option
+# tables, the method is declared inside a config class specific to the
+# option table. The mapping between option tables and config classes
+# is explicit in job.yml. Positional arguments are handled
+# individually and manually -- see QPDFJob.hh in the CONFIGURATION
+# section for details. See examples/qpdf-job.cc for an example.
+#
+# To understand the rest, start at main and follow comments in the
+# code.
+
whoami = os.path.basename(sys.argv[0])
BANNER = f'''//
// This file is automatically generated by {whoami}.
@@ -33,12 +148,18 @@ def write_file(filename):
class Main:
+ # SOURCES is a list of source files whose contents are used by
+ # this program. If they change, we are out of date.
SOURCES = [
whoami,
'manual/_ext/qpdf.py',
'job.yml',
'manual/cli.rst',
]
+ # DESTS is a map to the output files this code generates. These
+ # generated files, as well as those added to DESTS later in the
+ # code, are included in various places by QPDFJob.hh or any of the
+ # implementing QPDFJob*.cc files.
DESTS = {
'decl': 'libqpdf/qpdf/auto_job_decl.hh',
'init': 'libqpdf/qpdf/auto_job_init.hh',
@@ -48,6 +169,11 @@ class Main:
'json_init': 'libqpdf/qpdf/auto_job_json_init.hh',
# Others are added in top
}
+ # SUMS contains a checksum for each source and destination and is
+ # used to detect whether we're up to date without having to force
+ # recompilation all the time. This way the build can invoke this
+ # script unconditionally without causing stuff to rebuild every
+ # time.
SUMS = 'job.sums'
def main(self, args=sys.argv[1:], prog=whoami):
@@ -71,8 +197,17 @@ class Main:
def top(self, options):
with open('job.yml', 'r') as f:
data = yaml.safe_load(f.read())
+ # config_decls maps a config key from an option in "options"
+ # (from job.yml) to a list of declarations. A declaration is
+ # generated for each config method for that option table.
self.config_decls = {}
+ # Keep track of which configs we've declared since we can have
+ # option tables share a config class, as with the encryption
+ # tables.
self.declared_configs = set()
+
+ # Update DESTS -- see above. This ensures that each config
+ # class's contents are included in job.sums.
for o in data['options']:
config = o.get('config', None)
if config is not None:
@@ -257,12 +392,21 @@ class Main:
def generate(self, data):
warn(f'{whoami}: regenerating auto job files')
self.validate(data)
- # Add the built-in help options to tables that we populate as
- # we read job.yml since we won't encounter these in job.yml
+
+ # Keep track of which options are help options since they are
+ # handled specially. Add the built-in help options to tables
+ # that we populate as we read job.yml since we won't encounter
+ # these in job.yml
self.help_options = set(
['--completion-bash', '--completion-zsh', '--help']
)
+ # Keep track of which options we have encountered but haven't
+ # seen help text for. This enables us to report if any option
+ # is missing help.
self.options_without_help = set(self.help_options)
+
+ # Compute the information needed for generated files and write
+ # the files.
self.prepare(data)
with write_file(self.DESTS['decl']) as f:
print(BANNER, file=f)
@@ -276,6 +420,11 @@ class Main:
with open('manual/cli.rst', 'r') as df:
print(BANNER, file=f)
self.generate_doc(df, f)
+
+ # Compute the json files after the config and arg parsing
+ # files. We need to have full information about all the
+ # options before we can generate the schema. Generating the
+ # schema also generates the json header files.
self.generate_schema(data)
with write_file(self.DESTS['schema']) as f:
print('static constexpr char const* JOB_SCHEMA_DATA = R"(' +
@@ -301,6 +450,9 @@ class Main:
# DON'T ADD CODE TO generate AFTER update_hashes
def handle_trivial(self, i, identifier, cfg, prefix, kind, v):
+ # A "trivial" option is one whose handler does nothing other
+ # than to call the config method with the same name (switched
+ # to camelCase).
decl_arg = 1
decl_arg_optional = False
if kind == 'bare':
@@ -341,11 +493,18 @@ class Main:
# strategy enables us to change an option from bare to
# optional_parameter or optional_choices without
# breaking binary compatibility. The overloaded
- # methods both have to be implemented manually.
+ # methods both have to be implemented manually. They
+ # are not automatically called, so if you forget,
+ # someone will get a link error if they try to call
+ # one.
self.config_decls[cfg].append(
f'QPDF_DLL {config_prefix}* {identifier}();')
def handle_flag(self, i, identifier, kind, v):
+ # For flags that require manual handlers, declare the handler
+ # and register it. They have to be implemented manually in
+ # QPDFJob_argv.cc. You get compiler/linker errors for any
+ # missing methods.
if kind == 'bare':
self.decls.append(f'void {identifier}();')
self.init.append(f'this->ap.addBare("{i}", '
@@ -371,14 +530,17 @@ class Main:
f', false, {v}_choices);')
def prepare(self, data):
- self.decls = []
- self.init = []
- self.json_decls = []
- self.json_init = []
- self.jdata = {}
- self.by_table = {}
+ self.decls = [] # argv handler declarations
+ self.init = [] # initialize arg parsing code
+ self.json_decls = [] # json handler declarations
+ self.json_init = [] # initialize json handlers
+ self.jdata = {} # running data used for json generate
+ self.by_table = {} # table information by name for easy lookup
def add_jdata(flag, table, details):
+ # Keep track of each flag and where it appears so we can
+ # check consistency between the json information and the
+ # options section.
nonlocal self
if table == 'help':
self.help_options.add(f'--{flag}')
@@ -389,6 +551,7 @@ class Main:
'tables': {table: details},
}
+ # helper functions
self.init.append('auto b = [this](void (ArgParser::*f)()) {')
self.init.append(' return QPDFArgParser::bindBare(f, this);')
self.init.append('};')
@@ -396,6 +559,8 @@ class Main:
self.init.append(' return QPDFArgParser::bindParam(f, this);')
self.init.append('};')
self.init.append('')
+
+ # static variables for each set of choices for choices options
for k, v in data['choices'].items():
s = f'static char const* {k}_choices[] = {{'
for i in v:
@@ -406,6 +571,8 @@ class Main:
self.init.append('')
self.json_init.append('')
+ # constants for the table names to reduce hard-coding strings
+ # in the handlers
for o in data['options']:
table = o['table']
if table in ('main', 'help'):
@@ -413,6 +580,20 @@ class Main:
i = self.to_identifier(table, 'O', True)
self.decls.append(f'static constexpr char const* {i} = "{table}";')
self.decls.append('')
+
+ # Walk through all the options adding declarations for the
+ # option handlers and initialization code to register the
+ # handlers in QPDFArgParser. For "trivial" cases,
+ # QPDFArgParser will call the corresponding config method
+ # automatically. Otherwise, it will declare a handler that you
+ # have to explicitly implement.
+
+ # If you add a new option table, you have to set config to the
+ # name of a member variable that you declare in the ArgParser
+ # class in QPDFJob_argv.cc. Then there should be an option in
+ # the main table, also listed as manual in job.yml, that
+ # switches to it. See implementations of any of the existing
+ # options that do this for examples.
for o in data['options']:
table = o['table']
config = o.get('config', None)
@@ -437,8 +618,8 @@ class Main:
self.decls.append(f'void {arg_prefix}Positional(char*);')
self.init.append('this->ap.addPositional('
f'p(&ArgParser::{arg_prefix}Positional));')
- flags = {}
+ flags = {}
for i in o.get('bare', []):
flags[i] = ['bare', None]
for i, v in o.get('required_parameter', {}).items():
@@ -462,6 +643,11 @@ class Main:
self.handle_trivial(
i, identifier, config, config_prefix, kind, v)
+ # Subsidiary options tables need end methods to do any
+ # final checking within the option table. Final checking
+ # for the main option table is handled by
+ # checkConfiguration, which is called explicitly in the
+ # QPDFJob code.
if table not in ('main', 'help'):
identifier = self.to_identifier(table, 'argEnd', False)
self.decls.append(f'void {identifier}();')
@@ -510,6 +696,19 @@ class Main:
return self.option_to_json_key(schema_key)
def build_schema(self, j, path, flag, expected, options_seen):
+ # j: the part of data from "json" in job.yml as we traverse it
+ # path: a string representation of the path in the json
+ # flag: the command-line flag
+ # expected: a map of command-line options we expect to eventually see
+ # options_seen: which options we have seen so far
+
+ # As described in job.yml, the json can have keys that don't
+ # map to options. This includes keys whose values are
+ # dictionaries as well as keys that correspond to positional
+ # arguments. These start with _ and get their help from
+ # job.yml. Things that correspond to options get their help
+ # from the help text we gathered from cli.rst.
+
if flag in expected:
options_seen.add(flag)
elif isinstance(j, str):
@@ -519,6 +718,19 @@ class Main:
elif not (flag == '' or flag.startswith('_')):
raise Exception(f'json: unknown key {flag}')
+ # The logic here is subtle and makes sense if you understand
+ # how our JSON schemas work. They are described in JSON.hh,
+ # but basically, if you see a dictionary, the schema should
+ # have a dictionary with the same keys whose values are
+ # descriptive. If you see an array, the array should have
+ # a single member that describes each element of the array. See
+ # JSON.hh for details.
+
+ # See comments in QPDFJob_json.cc in the Handlers class
+ # declaration to understand how and why the methods called
+ # here work. The idea is that Handlers keeps a stack of
+ # JSONHandler shared pointers so that we can register our
+ # handlers in the right place as we go.
if isinstance(j, dict):
schema_value = {}
if flag:
@@ -579,14 +791,20 @@ class Main:
def generate_schema(self, data):
# Check to make sure that every command-line option is
- # represented in data['json'].
-
- # Build a list of options that we expect. If an option appears
- # once, we just expect to see it once. If it appears in more
- # than one options table, we need to see a separate version of
- # it for each option table. It is represented in job.yml
- # prepended with the table prefix. The table prefix is removed
- # in the schema.
+ # represented in data['json']. Build a list of options that we
+ # expect. If an option appears once, we just expect to see it
+ # once. If it appears in more than one options table, we need
+ # to see a separate version of it for each option table. It is
+ # represented in job.yml prepended with the table prefix. The
+ # table prefix is removed in the schema. Example: "password"
+ # appears multiple times, so the json section of job.yml has
+ # main.password, uo.password, etc. But most options appear
+ # only once, so we can just list them as they are. There is a
+ # nearly exact match between option tables and dictionaries in
+ # the job json schema, but it's not perfect because of how
+ # positional arguments are handled, so we have to do this
+ # extra work. Information about which tables a particular
+ # option appeared in is gathered up in prepare().
expected = {}
for k, v in self.jdata.items():
tables = v['tables']
@@ -600,7 +818,11 @@ class Main:
# Walk through the json information building the schema as we
# go. This verifies consistency between command-line options
# and the json section of the data and builds up a schema by
- # populating with help information as available.
+ # populating with help information as available. In addition
+ # to generating the schema, we declare and register json
+ # handlers that correspond with it. That way, we can first
+ # check a job JSON file against the schema, and if it matches,
+ # we have fewer error opportunities while calling handlers.
self.schema = self.build_schema(
data['json'], '', '', expected, options_seen)
if options_seen != set(expected.keys()):
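
The comments added to generate_auto_job above say that each command-line option maps to a Config method whose name is the option converted to camel case (`--qdf` to `qdf()`, `--object-streams` to `objectStreams(char const*)`). A minimal sketch of that naming rule follows. The helper name `option_to_method_name` is hypothetical and is not the `to_identifier` method that generate_auto_job actually uses, which takes additional arguments not shown in this diff.

```python
def option_to_method_name(option: str) -> str:
    """Convert a flag such as '--object-streams' to a camelCase name.

    Hypothetical helper for illustration only; it is not part of
    generate_auto_job.
    """
    words = [w for w in option.lstrip('-').split('-') if w]
    if not words:
        raise ValueError(f'not an option name: {option!r}')
    # The first word stays as written; each later word gets an
    # upper-case first letter and is otherwise left unchanged.
    return words[0] + ''.join(w[:1].upper() + w[1:] for w in words[1:])


if __name__ == '__main__':
    for flag in ('--qdf', '--object-streams', '--some-option'):
        print(flag, '->', option_to_method_name(flag))
```

The same rule appears to apply to job JSON keys, since the manual table in this commit pairs `--some-option` with the `"someOption"` key.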
diff --git a/include/qpdf/QPDFJob.hh b/include/qpdf/QPDFJob.hh
index 5a8c88cc..64075bc1 100644
--- a/include/qpdf/QPDFJob.hh
+++ b/include/qpdf/QPDFJob.hh
@@ -62,10 +62,10 @@ class QPDFJob
// the regular API. This is exposed in the C API, which makes it
// easier to get certain high-level qpdf functionality from other
// languages. If there are any command-line errors, this method
- // will throw QPDFArgParser::Usage which is derived from
- // std::runtime_error. Other exceptions may be thrown in some
- // cases. Note that argc, and argv should be UTF-8 encoded. If you
- // are calling this from a Windows Unicode-aware main (wmain), see
+ // will throw QPDFUsage which is derived from std::runtime_error.
+ // Other exceptions may be thrown in some cases. Note that argc
+ // and argv should be UTF-8 encoded. If you are calling this from
+ // a Windows Unicode-aware main (wmain), see
// QUtil::call_main_from_wmain for information about converting
// arguments to UTF-8. This method will mutate arguments that are
// passed to it.
@@ -76,7 +76,7 @@ class QPDFJob
// Initialize a QPDFJob from json. Passing partial = true prevents
// this method from doing the final checks (calling
// checkConfiguration) after processing the json file. This makes
- // it possible to initialze QPDFJob in stages using multiple json
+ // it possible to initialize QPDFJob in stages using multiple json
// files or to have a json file that can be processed from the CLI
// with --job-json-file and be combined with other arguments. For
// example, you might include only encryption parameters, leaving
@@ -84,7 +84,11 @@ class QPDFJob
// input and output files. initializeFromJson is called with
// partial = true when invoked from the command line. To make sure
// that the json file is fully valid on its own, just don't
- // specify any other command-line flags.
+ // specify any other command-line flags. If there are any
+ // configuration errors, QPDFUsage is thrown. Some error messages
+ // may be CLI-centric. If an exception tells you to use the
+ // "--some-option" option, set the "someOption" key in the JSON
+ // object instead.
QPDF_DLL
void initializeFromJson(std::string const& json, bool partial = false);
@@ -160,7 +164,7 @@ class QPDFJob
// object. The Config object contains methods that correspond with
// qpdf command-line arguments. You can use a fluent interface to
// configure a QPDFJob object that would do exactly the same thing
- // as a specific qpdf command. The example pdf-job.cc contains an
+ // as a specific qpdf command. The example qpdf-job.cc contains an
// example of this usage. You can also use initializeFromJson or
// initializeFromArgv to initialize a QPDFJob object.
@@ -180,6 +184,10 @@ class QPDFJob
// with references. Returning pointers instead of references
// makes for a more uniform interface.
+ // Maintainer documentation: see the section in README-maintainer
+ // called "HOW TO ADD A COMMAND-LINE ARGUMENT", which contains
+ // references to additional places in the documentation.
+
class Config;
class AttConfig
@@ -330,7 +338,10 @@ class QPDFJob
// Return a top-level configuration item. See CONFIGURATION above
// for details. If an invalid configuration is created (such as
// supplying contradictory options, omitting an input file, etc.),
- // QPDFUsage is thrown.
+ // QPDFUsage is thrown. Note that error messages are CLI-centric,
+ // but you can map them into config calls. For example, if an
+ // exception tells you to use the --some-option flag, you should
+ // call config()->someOption() instead.
QPDF_DLL
std::shared_ptr<Config> config();
diff --git a/job.sums b/job.sums
index 0c574cc1..d434c642 100644
--- a/job.sums
+++ b/job.sums
@@ -1,17 +1,17 @@
# Generated by generate_auto_job
-generate_auto_job 1fdb113412a444aad67b0232f3f6c4f50d9e2a5701691e5146fd1b559039ef2e
+generate_auto_job 5d6ec1e4f0b94d8f73df665061d8a2188cbbe8f25ea42be78ec576547261d5ac
include/qpdf/auto_job_c_att.hh 7ad43bb374c1370ef32ebdcdcb7b73a61d281f7f4e3f12755585872ab30fb60e
include/qpdf/auto_job_c_copy_att.hh 32275d03cdc69b703dd7e02ba0bbe15756e714e9ad185484773a6178dc09e1ee
include/qpdf/auto_job_c_enc.hh 72e138c7b96ed5aacdce78c1dec04b1c20d361faec4f8faf52f64c1d6be99265
include/qpdf/auto_job_c_main.hh 69d5ea26098bcb6ec5b5e37ba0bca9e7d16a784d2618e0c05d635046848d5123
include/qpdf/auto_job_c_pages.hh 931840b329a36ca0e41401190e04537b47f2867671a6643bfd8da74014202671
include/qpdf/auto_job_c_uo.hh 0585b7de459fa479d9e51a45fa92de0ff6dee748efc9ec1cedd0dde6cee1ad50
-job.yml effc93a805fb74503be2213ad885238db21991ba3d084fbfeff01183c66cb002
+job.yml 9544c6e046b25d3274731fbcd07ba25b300fd67055021ac4364ad8a91f77c6b6
libqpdf/qpdf/auto_job_decl.hh 9f79396ec459f191be4c5fe34cf88c265cf47355a1a945fa39169d1c94cf04f6
-libqpdf/qpdf/auto_job_help.hh 6002f503368f319a3d717484ac39d1558f34e67989d442f394791f6f6f5f0500
+libqpdf/qpdf/auto_job_help.hh 43184f01816b5210bbc981de8de48446546fb94f4fd6e63cfc7f2fbac3578e6b
libqpdf/qpdf/auto_job_init.hh fd13b9f730e6275a39a15d193bd9af19cf37f4495699ec1886c2b208d7811ab1
libqpdf/qpdf/auto_job_json_decl.hh c5e3fd38a3b0c569eb0c6b4c60953a09cd6bc7d3361a357a81f64fe36af2b0cf
libqpdf/qpdf/auto_job_json_init.hh 3f86ce40931ca8f417d050fcd49104d73c1fa4e977ad19d54b372831a8ea17ed
libqpdf/qpdf/auto_job_schema.hh 18a3780671d95224cb9a27dcac627c421cae509d59f33a63e6bda0ab53cce923
manual/_ext/qpdf.py e9ac9d6c70642a3d29281ee5ad92ae2422dee8be9306fb8a0bc9dba0ed5e28f3
-manual/cli.rst 35289dbf593085016a62249f760cdcad50d5cce76d799ea4acf5dff58b78679a
+manual/cli.rst 3746df6c4f115387cca0d921f25619a6b8407fc10b0e4c9dcf40b0b1656c6f8a
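
The job.sums file above is what lets the build call generate_auto_job unconditionally: if every recorded checksum still matches, nothing is regenerated. The sketch below shows one way such a check could work. It assumes SHA-256, which is consistent with the 64-hex-digit entries but is not confirmed by this diff, and the function names (`sha256_of`, `read_sums`, `is_up_to_date`) are hypothetical rather than taken from generate_auto_job.

```python
import hashlib
import os


def sha256_of(path):
    """Return the hex SHA-256 digest of a file's contents."""
    h = hashlib.sha256()
    with open(path, 'rb') as f:
        for chunk in iter(lambda: f.read(65536), b''):
            h.update(chunk)
    return h.hexdigest()


def read_sums(sums_path):
    """Parse 'name digest' lines, skipping blanks and '#' comments."""
    sums = {}
    with open(sums_path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith('#'):
                continue
            name, digest = line.rsplit(' ', 1)
            sums[name] = digest
    return sums


def is_up_to_date(sums_path, files):
    """True only if every file exists and matches its recorded digest."""
    try:
        recorded = read_sums(sums_path)
    except FileNotFoundError:
        return False
    for path in files:
        if not os.path.exists(path):
            return False
        if recorded.get(path) != sha256_of(path):
            return False
    return True
```

A caller would pass the combined list of source and destination files, so that editing a source or hand-editing a generated file both count as out of date.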
diff --git a/job.yml b/job.yml
index eb6a6b01..eb5b7753 100644
--- a/job.yml
+++ b/job.yml
@@ -1,4 +1,11 @@
# See "HOW TO ADD A COMMAND-LINE ARGUMENT" in README-maintainer.
+
+# REMEMBER: if you add an optional_choices or optional_parameter, you
+# have to explicitly remember to implement the overloaded config
+# method that takes no arguments. Since no generated code will call it
+# automatically, there is no automated reminder to do this. If you
+# forget, it will be a link error if someone tries to call it.
+
choices:
yn:
- "y"
diff --git a/libqpdf/QPDFJob.cc b/libqpdf/QPDFJob.cc
index 1c6a16d6..a06f87bc 100644
--- a/libqpdf/QPDFJob.cc
+++ b/libqpdf/QPDFJob.cc
@@ -646,7 +646,6 @@ QPDFJob::createsOutput() const
void
QPDFJob::checkConfiguration()
{
- // QXXXQ messages are CLI-centric
if (m->replace_input)
{
if (m->outfilename)
@@ -722,7 +721,8 @@ QPDFJob::checkConfiguration()
{
QTC::TC("qpdf", "qpdf same file error");
usage("input file and output file are the same;"
- " use --replace-input to intentionally overwrite the input file");
+ " use --replace-input to intentionally"
+ " overwrite the input file");
}
}
diff --git a/libqpdf/QPDFJob_config.cc b/libqpdf/QPDFJob_config.cc
index fb61924c..68eaf5c8 100644
--- a/libqpdf/QPDFJob_config.cc
+++ b/libqpdf/QPDFJob_config.cc
@@ -28,7 +28,6 @@ QPDFJob::Config::emptyInput()
{
if (o.m->infilename == 0)
{
- // QXXXQ decide whether to fix this or just leave the comment:
// Various places in QPDFJob.cc know that the empty string for
// infile means empty. This means that passing "" as the
// argument to inputFile, or equivalently using "" as a
diff --git a/libqpdf/QPDFJob_json.cc b/libqpdf/QPDFJob_json.cc
index cc4e2ff7..c0de8666 100644
--- a/libqpdf/QPDFJob_json.cc
+++ b/libqpdf/QPDFJob_json.cc
@@ -29,6 +29,28 @@ namespace
typedef std::function<void(char const*)> param_handler_t;
typedef std::function<void(JSON)> json_handler_t;
+ // The code that calls these methods is automatically
+ // generated by generate_auto_job. This describes how we
+ // implement what it does. We keep a stack of handlers in
+ // json_handlers. The top of the stack is the "current" json
+ // handler, initially for the top-level object. Whenever we
+ // encounter a scalar, we add a handler using addBare,
+ // addParameter, or addChoices. Whenever we encounter a
+ // dictionary, we first add the dictionary handlers. Then we
+ // walk into the dictionary and, for each key, we register a
+ // dict key handler and push it to the stack, then do the same
+ // process for the key's value. Then we pop the key handler
+ // off the stack. When we encounter an array, we add the array
+ // handlers, push an item handler to the stack, call
+ // recursively for the array's single item (as this is what is
+ // expected in a schema), and pop the item handler. Note that
+ // we don't pop dictionary start/end handlers. The dictionary
+ // handlers and the key handlers are at the same level in
+ // JSONHandler. This logic is subtle and took several tries to
+ // get right. It's best understood by carefully understanding
+ // the behavior of JSONHandler, the JSON schema, and the code
+ // in generate_auto_job.
+
void addBare(bare_handler_t);
void addParameter(param_handler_t);
void addChoices(char const** choices, bool required, param_handler_t);
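
The comments in this commit describe the qpdf-specific JSON "schema" as follows: a dictionary in the schema describes a dictionary with the same keys, an array in the schema holds a single member that describes every element, and for QPDFJob all keys are optional. The sketch below is a simplified illustration of that rule only. It is not qpdf's implementation (JSON.hh has the authoritative description), and `check_schema` and the example schema keys are hypothetical.

```python
def check_schema(value, schema, path=''):
    """Return a list of error strings; an empty list means a match.

    Schema dictionaries constrain keys (all optional), schema arrays
    are assumed to hold exactly one member describing each element,
    and any other schema value is treated as descriptive text that
    accepts anything.
    """
    errors = []
    where = path or 'top level'
    if isinstance(schema, dict):
        if not isinstance(value, dict):
            return [f'{where}: expected a dictionary']
        for key, item in value.items():
            if key not in schema:
                errors.append(f'{where}: unknown key {key}')
            else:
                errors.extend(
                    check_schema(item, schema[key], f'{path}.{key}'))
    elif isinstance(schema, list):
        if not isinstance(value, list):
            return [f'{where}: expected an array']
        for i, item in enumerate(value):
            errors.extend(
                check_schema(item, schema[0], f'{path}[{i}]'))
    return errors


if __name__ == '__main__':
    # Illustrative schema, not the real job schema.
    schema = {
        'someOption': 'help text for --some-option',
        'items': [{'name': 'help text for a nested key'}],
    }
    print(check_schema({'someOption': ''}, schema))
    print(check_schema({'items': [{'bogus': 1}]}, schema))
```

Checking a job file against the schema first means the generated handlers see only keys that exist in the schema, which is the reduction in error opportunities that the generate_schema comment mentions.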
diff --git a/libqpdf/qpdf/auto_job_help.hh b/libqpdf/qpdf/auto_job_help.hh
index 49ac3494..38d275b5 100644
--- a/libqpdf/qpdf/auto_job_help.hh
+++ b/libqpdf/qpdf/auto_job_help.hh
@@ -812,7 +812,8 @@ This option is repeatable. If given, only specified objects will
be shown in the "objects" key of the JSON output. Otherwise, all
objects will be shown.
)");
-ap.addOptionHelp("--job-json-help", "json", "show format of job JSON", R"(Describe the format of the QPDFJob JSON input.
+ap.addOptionHelp("--job-json-help", "json", "show format of job JSON", R"(Describe the format of the QPDFJob JSON input used by
+--job-json-file.
)");
ap.addHelpTopic("testing", "options for testing or debugging", R"(The options below are useful when writing automated test code that
includes files created by qpdf or when testing qpdf itself.
diff --git a/manual/cli.rst b/manual/cli.rst
index 7dd955c4..614be80d 100644
--- a/manual/cli.rst
+++ b/manual/cli.rst
@@ -167,9 +167,11 @@ Related Options
description of the JSON input file format.
Specify the name of a file whose contents are expected to contain a
- QPDFJob JSON file. QXXXQ ref. This file is read and treated as if
- the equivalent command-line arguments were supplied. It can be
- mixed freely with other options.
+ QPDFJob JSON file. This file is read and treated as if the
+ equivalent command-line arguments were supplied. It can be repeated
+ and mixed freely with other options. Run ``qpdf`` with
+ :qpdf:ref:`--job-json-help` for a description of the job JSON input
+ file format. For more information, see :ref:`qpdf-job`.
.. _exit-status:
@@ -3200,9 +3202,12 @@ Related Options
.. help: show format of job JSON
- Describe the format of the QPDFJob JSON input.
+ Describe the format of the QPDFJob JSON input used by
+ --job-json-file.
- Describe the format of the QPDFJob JSON input. QXXXQ doc ref.
+ Describe the format of the QPDFJob JSON input used by
+ :qpdf:ref:`--job-json-file`. For more information about QPDFJob,
+ see :ref:`qpdf-job`.
.. _test-options:
diff --git a/manual/index.rst b/manual/index.rst
index 7f8b1483..5aa59346 100644
--- a/manual/index.rst
+++ b/manual/index.rst
@@ -28,6 +28,7 @@ documentation, please visit `https://qpdf.readthedocs.io
weak-crypto
json
design
+ qpdf-job
linearization
object-streams
encryption
diff --git a/manual/qpdf-job.rst b/manual/qpdf-job.rst
new file mode 100644
index 00000000..72e02305
--- /dev/null
+++ b/manual/qpdf-job.rst
@@ -0,0 +1,248 @@
+
+.. _qpdf-job:
+
+QPDFJob: a Job-Based Interface
+==============================
+
+All of the functionality from the :command:`qpdf` command-line
+executable is available from inside the C++ library using the
+``QPDFJob`` class. There are several ways to access this functionality:
+
+- Command-line options
+
+ - Run the :command:`qpdf` command line
+
+ - Use from the C++ API with ``QPDFJob::initializeFromArgv``
+
+ - Use from the C API with QXXXQ
+
+- The job JSON file format
+
+ - Use from the CLI with the :qpdf:ref:`--job-json-file` parameter
+
+ - Use from the C++ API with ``QPDFJob::initializeFromJson``
+
+ - Use from the C API with QXXXQ
+
+- The ``QPDFJob`` C++ API
+
+If you can understand how to use the :command:`qpdf` CLI, you can
+understand the ``QPDFJob`` class and the JSON file. qpdf guarantees
+that all of the above methods are in sync. Here's how it works:
+
+.. list-table:: QPDFJob Interfaces
+ :widths: 30 30 30
+ :header-rows: 1
+
+ - - CLI
+ - JSON
+ - C++
+
+ - - ``--some-option``
+ - ``"someOption": ""``
+ - ``config()->someOption()``
+
+ - - ``--some-option=value``
+ - ``"someOption": "value"``
+ - ``config()->someOption("value")``
+
+ - - positional argument
+ - ``"otherOption": "value"``
+ - ``config()->otherOption("value")``
+
+In the JSON file, the JSON structure is an object (dictionary) whose
+keys are command-line flags converted to camelCase. Positional
+arguments have some corresponding key, which you can find by running
+``qpdf`` with the :qpdf:ref:`--job-json-help` flag. For example, input
+and output files are named by positional arguments on the CLI. In the
+JSON, they are ``"inputFile"`` and ``"outputFile"``. The following are
+equivalent:
+
+.. It would be nice to have an automated test that these are all the
+ same, but we have so few live examples that it's not worth it for
+ now.
+
+CLI:
+ ::
+
+ qpdf infile.pdf outfile.pdf \
+ --pages . other.pdf --password=x 1-5 -- \
+ --encrypt user owner 256 --print=low -- \
+ --object-streams=generate
+
+Job JSON:
+ .. code-block:: json
+
+ {
+ "inputFile": "infile.pdf",
+ "outputFile": "outfile.pdf",
+ "pages": [
+ {
+ "file": "."
+ },
+ {
+ "file": "other.pdf",
+ "password": "x",
+ "range": "1-5"
+ }
+ ],
+ "encrypt": {
+ "userPassword": "user",
+ "ownerPassword": "owner",
+ "256bit": {
+ "print": "low"
+ }
+ },
+ "objectStreams": "generate"
+ }
+
+C++ code:
+ .. code-block:: c++
+
+ #include <qpdf/QPDFJob.hh>
+ #include <qpdf/QPDFUsage.hh>
+ #include <iostream>
+
+ int main(int argc, char* argv[])
+ {
+ try
+ {
+ QPDFJob j;
+ j.config()
+ ->inputFile("infile.pdf")
+ ->outputFile("outfile.pdf")
+ ->pages()
+ ->pageSpec(".", "1-z")
+ ->pageSpec("other.pdf", "1-5", "x")
+ ->endPages()
+ ->encrypt(256, "user", "owner")
+ ->print("low")
+ ->endEncrypt()
+ ->objectStreams("generate")
+ ->checkConfiguration();
+ j.run();
+ }
+ catch (QPDFUsage& e)
+ {
+ std::cerr << "configuration error: " << e.what() << std::endl;
+ return 2;
+ }
+ catch (std::exception& e)
+ {
+ std::cerr << "other error: " << e.what() << std::endl;
+ return 2;
+ }
+ return 0;
+ }
+
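The flag-to-key naming convention described above (command-line flags converted to camelCase) can be sketched in a few lines of C++. This is only an illustration of the convention: ``flagToJsonKey`` is a hypothetical helper, not part of qpdf, and some real keys (positional arguments, nested keys such as ``"256bit"``) do not come from a flag at all, so the output of ``--job-json-help`` remains the authority on actual key names.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper, not part of qpdf: convert a CLI flag such as
// "--object-streams" to a camelCase key such as "objectStreams".
std::string flagToJsonKey(std::string const& flag)
{
    std::string key;
    // Skip the leading dashes of the flag.
    std::string::size_type i = flag.find_first_not_of('-');
    if (i == std::string::npos) {
        return key;
    }
    bool upper_next = false;
    for (; i < flag.length(); ++i) {
        char ch = flag[i];
        if (ch == '-') {
            // A dash is dropped; the character after it is capitalized.
            upper_next = true;
        } else if (upper_next) {
            key += static_cast<char>(
                std::toupper(static_cast<unsigned char>(ch)));
            upper_next = false;
        } else {
            key += ch;
        }
    }
    return key;
}
```

Under these assumptions, ``flagToJsonKey("--object-streams")`` yields ``objectStreams``, matching the key used in the job JSON example above.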
+It is also possible to mix and match command-line options and JSON
+from the CLI. For example, you could create a file called
+:file:`my-options.json` containing the following:
+
+.. code-block:: json
+
+ {
+ "encrypt": {
+ "userPassword": "",
+ "ownerPassword": "owner",
+ "256bit": {
+ }
+ },
+ "objectStreams": "generate"
+ }
+
+and use it with other options to create 256-bit encrypted (but
+unrestricted) files with object streams while specifying other
+parameters on the command line, such as
+
+::
+
+ qpdf infile.pdf outfile.pdf --job-json-file=my-options.json
+
+See also :file:`examples/qpdf-job.cc` in the source distribution as
+well as comments in ``QPDFJob.hh``.
+
+
+.. _qpdfjob-design:
+
+QPDFJob Design
+--------------
+
+This section describes some of the design rationale and history behind
+``QPDFJob``.
+
+Documentation of ``QPDFJob`` is divided among three places:
+
+- "HOW TO ADD A COMMAND-LINE ARGUMENT" in :file:`README-maintainer`
+ provides a quick reminder for how to add a command-line argument
+
+- The source file :file:`generate_auto_job` has a detailed explanation
+ about how ``QPDFJob`` and ``generate_auto_job`` work together
+
+- This chapter of the manual has other details.
+
+Prior to qpdf version 10.6.0, the qpdf CLI executable had a lot of
+functionality built into the executable that was not callable from the
+library as such. This created a number of problems:
+
+- Some of the logic in :file:`qpdf.cc` was pretty complex, such as
+ image optimization, generating JSON output, and many of the page
+ manipulations. While those things could all be coded using the C++
+ API, there would be a lot of duplicated code.
+
+- Page splitting and merging will get more complicated over time as
+ qpdf supports a wider range of document-level options. It would be
+ nice to be able to expose this to library users instead of baking it
+ all into the CLI.
+
+- Users of other languages who just wanted an interface to do things
+ that the CLI could do didn't have a good way to do it, such as just
+ handing a library call a set of command-line options or an
+ equivalent JSON object that could be passed in as a string.
+
+- The qpdf CLI itself was almost 8,000 lines of code. It needed to be
+ refactored, cleaned up, and split.
+
+- Exposing a new feature via the command-line required making lots of
+ small edits to lots of small bits of code, and it was easy to forget
+ something. Adding a code generator, while complex in some ways,
+ greatly reduces the chances of error when extending qpdf.
+
+Here are a few notes on some design decisions about QPDFJob and its
+various interfaces.
+
+- Bare command-line options (flags with no parameter) map to config
+ functions that take no arguments and to JSON keys whose values are
+ required to be the empty string. The rationale is that we can later
+ change these bare options to options that take an optional parameter
+ without breaking backward compatibility in the CLI or the JSON.
+ Options that take optional parameters generate two config functions:
+ one that has no arguments and one that has a ``char const*`` argument.
+ This means that adding an optional parameter to a previously bare
+ option also doesn't break binary compatibility.
+
+- Adding a new argument to :file:`job.yml` automatically triggers
+ almost everything by declaring and referencing things that you have
+ to implement. This way, once you get the code to compile and link,
+ you know you haven't forgotten anything. There are two tricky cases:
+
+ - If an argument handler has to do something special, like call a
+ nested config method or select an option table, you have to
+ implement it manually. This is discussed in
+ :file:`generate_auto_job`.
+
+ - When you add an option that has optional parameters or choices,
+ both of the handlers described above are declared, but only the
+ one that takes an argument is referenced. You have to remember to
+ implement the one that doesn't take an argument or else people
+ will get a linker error if they try to call it. The assumption is
+ that things with optional parameters started out as bare, so the
+ argument-less version is already there.
+
+- If you have to add a new option that requires its own option table,
+ you will have to do some extra work including adding a new nested
+ Config class, adding a config member variable to ``ArgParser`` in
+ :file:`QPDFJob_argv.cc` and ``Handlers`` in :file:`QPDFJob_json.cc`,
+ and making sure that manually implemented handlers are consistent
+ with each other. It is best in these cases to add explicit test
+ cases for all the various ways to get to the option.
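The first design note above (bare options versus options that take an optional parameter) can be illustrated with a small, self-contained sketch. ``SketchConfig`` and its members are hypothetical names chosen for this example and are not qpdf's real classes. The point is only that the ``char const*`` overload is added alongside the existing argument-less function rather than replacing it, so callers compiled against the older interface are unaffected.

```cpp
#include <string>

// Hypothetical stand-in for a QPDFJob config class. The names here are
// illustrative only and are not qpdf's real API.
class SketchConfig
{
  public:
    // The original bare option: corresponds to "--some-option" on the
    // CLI and to "someOption": "" in the job JSON.
    SketchConfig* someOption()
    {
        return someOption("");
    }

    // Overload added later, when the option gains an optional
    // parameter: corresponds to "--some-option=value" and to
    // "someOption": "value". The argument-less function stays in place.
    SketchConfig* someOption(char const* parameter)
    {
        enabled = true;
        value = parameter;
        return this;
    }

    bool enabled = false;
    std::string value;
};
```

Both forms return the config object, so in this sketch ``c.someOption()`` and ``c.someOption("value")`` can each appear in a call chain, mirroring the ``config()->someOption()`` and ``config()->someOption("value")`` rows of the interface table.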
diff --git a/manual/release-notes.rst b/manual/release-notes.rst
index 8c2af683..6b5b85f4 100644
--- a/manual/release-notes.rst
+++ b/manual/release-notes.rst
@@ -2303,9 +2303,9 @@ For a detailed list of changes, please see the file
been added to the :command:`qpdf` command-line
tool. See :ref:`page-selection`.
- - Options have been added to the :command:`qpdf`
- command-line tool for copying encryption parameters from another
- file. (QXXXQ Link)
+ - The :qpdf:ref:`--copy-encryption` option has been added to the
+ :command:`qpdf` command-line tool for copying encryption
+ parameters from another file.
- New methods have been added to the ``QPDF`` object for adding and
removing pages. See :ref:`adding-and-remove-pages`.