author: Jay Berkenbilt <ejb@ql.org> 2020-10-31 16:57:28 +0100
committer: Jay Berkenbilt <ejb@ql.org> 2020-10-31 17:14:26 +0100
commit: ffe6af6f77036d9c725ce906df6020e4b5cac58d
tree: 1deacf94c3120628d3045d8748c21fdfd0bf742e /libqpdf/QPDF.cc
parent: 96767fb104589ee1152152edc803b5f979a8390f
Add comments explaining the foreign object copying code
These are the comments I would have liked to have been able to read while fixing #449 and #478.
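The reserve-then-replace scheme that the new comments describe can be illustrated with a small toy model. This is only a sketch and does not use qpdf's types or API: `ToyDoc`, `ToyCopier`, `reserveObjects`, and `copyForeign` are hypothetical names, an "object" is reduced to a list of IDs it references, and the foreign-to-local map doubles as loop detection. Streams and the stop-at-page-boundaries rule are left out.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Toy stand-in for a document: each object is just the list of
// object IDs it references. Not a qpdf type.
struct ToyDoc
{
    std::map<int, std::vector<int>> objects;
    int next_id = 1;

    // Allocate an empty placeholder, playing the role of a "reserved" object.
    int reserve()
    {
        int id = next_id++;
        objects[id] = {};
        return id;
    }
};

// Toy stand-in for the per-foreign-document copy state.
struct ToyCopier
{
    std::map<int, int> foreign_to_local; // foreign ID -> local ID
    std::vector<int> to_copy;            // foreign IDs still to be copied
};

// Phase 1: traverse the foreign object graph, reserving a local
// placeholder for every object not yet mapped.
void reserveObjects(ToyDoc const& foreign, ToyDoc& local,
                    ToyCopier& copier, int foreign_id)
{
    assert(foreign.objects.count(foreign_id) == 1);
    if (copier.foreign_to_local.count(foreign_id))
    {
        return; // already mapped; this also terminates cycles
    }
    copier.foreign_to_local[foreign_id] = local.reserve();
    copier.to_copy.push_back(foreign_id);
    for (int ref : foreign.objects.at(foreign_id))
    {
        reserveObjects(foreign, local, copier, ref);
    }
}

// Phase 2: copy each queued object, rewriting every foreign reference
// to its local counterpart, then overwrite the placeholder.
int copyForeign(ToyDoc const& foreign, ToyDoc& local,
                ToyCopier& copier, int foreign_id)
{
    reserveObjects(foreign, local, copier, foreign_id);
    for (int fid : copier.to_copy)
    {
        std::vector<int> copy;
        for (int ref : foreign.objects.at(fid))
        {
            copy.push_back(copier.foreign_to_local.at(ref));
        }
        local.objects[copier.foreign_to_local.at(fid)] = copy;
    }
    copier.to_copy.clear();
    return copier.foreign_to_local.at(foreign_id);
}
```

Because every reachable object already has a local ID before any copy is made, two objects that reference each other can be copied in either order, and copying the same foreign object a second time through the same `ToyCopier` returns the existing local ID instead of creating a duplicate.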
Diffstat (limited to 'libqpdf/QPDF.cc')
-rw-r--r-- libqpdf/QPDF.cc 47
1 files changed, 44 insertions, 3 deletions
diff --git a/libqpdf/QPDF.cc b/libqpdf/QPDF.cc
index ece80668..73749693 100644
--- a/libqpdf/QPDF.cc
+++ b/libqpdf/QPDF.cc
@@ -2253,9 +2253,50 @@ QPDF::replaceReserved(QPDFObjectHandle reserved,
QPDFObjectHandle
QPDF::copyForeignObject(QPDFObjectHandle foreign)
{
- // Do not preclude use of copyForeignObject on page objects. It is
- // a documented use case to copy pages this way if the intention
- // is to not update the pages tree.
+ // Here's an explanation of what's going on here.
+ //
+ // A QPDFObjectHandle that is an indirect object has an owning
+ // QPDF. The object ID and generation refer to an object in the
+ // owning QPDF. When we copy the QPDFObjectHandle from a foreign
+ // QPDF into the local QPDF, we have to replace all indirect
+ // object references with references to the corresponding object
+ // in the local file.
+ //
+ // To do this, we maintain mappings from foreign object IDs to
+ // local object IDs for each foreign QPDF that we are copying
+ // from. The mapping is stored in an ObjCopier, which contains a
+ // mapping from the foreign ObjGen to the local QPDFObjectHandle.
+ //
+ // To copy, we do a deep traversal of the foreign object with loop
+ // detection to discover all indirect objects that are
+ // encountered, stopping at page boundaries. Whenever we encounter
+ // an indirect object, we check to see if we have already created
+ // a local copy of it. If not, we allocate a "reserved" object
+ // (or, for a stream, just a new stream) and store in the map the
+ // mapping from the foreign object ID to the new object. While we
+ // do this, we keep a list of objects to copy.
+ //
+ // Once we are done with the traversal, we copy all the objects
+ // that we need to copy. However, the copies will contain indirect
+ // object IDs that refer to objects in the foreign file. We need
+ // to replace them with references to objects in the local file.
+ // This is what replaceForeignIndirectObjects does. Once we have
+ // created a copy of the foreign object with all the indirect
+ // references replaced with new ones in the local context, we can
+ // replace the local reserved object with the copy. This mechanism
+ // allows us to copy objects with circular references in any
+ // order.
+
+ // For streams, rather than copying the objects, we set up the
+ // stream data to pull from the original stream by using a stream
+ // data provider. This is done in a manner that doesn't require
+ // the original QPDF object but may require the original source of
+ // the stream data with special handling for immediate_copy_from.
+ // This logic is also in replaceForeignIndirectObjects.
+
+ // Note that we explicitly allow use of copyForeignObject on page
+ // objects. It is a documented use case to copy pages this way if
+ // the intention is to not update the pages tree.
if (! foreign.isIndirect())
{
QTC::TC("qpdf", "QPDF copyForeign direct");