2022-05-21  JSON: Fix large file support  (Jay Berkenbilt)
2022-05-21  Replace std::regex with validators for better performance  (Jay Berkenbilt)
2022-05-20  Exercise object description in tests  (Jay Berkenbilt)
2022-05-20  Add test for bad data and bad datafile  (Jay Berkenbilt)
2022-05-20  Test --update-from-json  (Jay Berkenbilt)
2022-05-20  Bug fix: don't clobber stream length with replaceDict  (Jay Berkenbilt)
2022-05-20  JSON: detect duplicate dictionary keys while parsing  (Jay Berkenbilt)
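Duplicate-key detection has to happen during parsing, because an ordinary JSON parser silently keeps only the last value for a repeated key. A minimal sketch of the idea in Python (illustrative only, not qpdf's actual C++ parser), using `json.loads`'s `object_pairs_hook`, which sees every key/value pair before the dict is built:

```python
import json


def reject_duplicates(pairs):
    """Build a dict from parsed (key, value) pairs, failing on repeats."""
    result = {}
    for key, value in pairs:
        if key in result:
            raise ValueError(f"duplicate dictionary key: {key!r}")
        result[key] = value
    return result


def parse_strict(text):
    # The hook is called for every JSON object, including nested ones,
    # so duplicates are caught at any depth.
    return json.loads(text, object_pairs_hook=reject_duplicates)
```

For example, `parse_strict('{"a": 1, "a": 2}')` raises `ValueError` instead of silently returning `{"a": 2}`.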
2022-05-20  Test (and fix) handling of dangling references  (Jay Berkenbilt)
2022-05-20  Bug fix: isReserved() true for indirect reference to reserved object  (Jay Berkenbilt)
2022-05-20  Explicitly test ignoring unknown keys in JSON input  (Jay Berkenbilt)
2022-05-20  Make version default to latest for --json-output (like --json)  (Jay Berkenbilt)
2022-05-20  Round-trip tests with --json-stream-data=file  (Jay Berkenbilt)
2022-05-20  Tests with manually constructed qpdf json  (Jay Berkenbilt)
2022-05-20  Add tests for --json-input  (Jay Berkenbilt)
2022-05-20  JSON fix: correctly parse UTF-16 surrogate pairs  (Jay Berkenbilt)
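In JSON, code points outside the Basic Multilingual Plane are written as two `\uXXXX` escapes forming a UTF-16 surrogate pair, and a parser must recombine them into a single code point. A sketch of the arithmetic (not qpdf's code; each surrogate contributes 10 payload bits):

```python
def combine_surrogates(high, low):
    """Combine a UTF-16 surrogate pair into one Unicode code point.

    high must lie in the high-surrogate range D800-DBFF and low in the
    low-surrogate range DC00-DFFF.
    """
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)
```

For example, the escapes `\ud83d\ude00` combine to `combine_surrogates(0xD83D, 0xDE00)`, which is U+1F600.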
2022-05-20  Add more names and strings in good13  (Jay Berkenbilt)

    * native UTF-8 strings
    * names whose PDF and canonical syntax differ, in both dictionary
      key positions and other positions

    For json, names are converted both as names and directly when used
    as dictionary keys.
2022-05-20  Rename all test files: _ to -  (Jay Berkenbilt)
2022-05-20  Major rework -- see long comments  (Jay Berkenbilt)

    * Replace --create-from-json=file with --json-input, which causes
      the regular input to be treated as json.
    * Eliminate --to-json
    * In --json=2, bring back "objects" and eliminate "objectinfo".
      Stream data is never present.
    * In --json-output=2, write "qpdf-v2" with "objects" and include
      stream data.
2022-05-20  Add QUtil::FileCloser to the public API  (Jay Berkenbilt)
2022-05-20  Support stream data -- not tested  (Jay Berkenbilt)

    There are no automated tests yet, but committing work so far in
    preparation for some refactoring.

2022-05-20  replaceStreamData: accept uninitialized filter/decode_parms  (Jay Berkenbilt)

    Uninitialized values mean to leave the original values alone. This
    is needed for reconstructing streams from JSON, given that the
    stream data and stream dictionary may appear in any order in the
    JSON.
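The pattern described here, a setter whose omitted arguments mean "keep the current value" so that fields can arrive in any order, can be sketched in Python with a distinct sentinel (the class and names below are illustrative, not qpdf's C++ API):

```python
_UNSET = object()  # sentinel meaning "leave the original value alone"


class Stream:
    """Toy stand-in for a stream with data, a filter, and decode parms."""

    def __init__(self, data=b'', filter=None, decode_parms=None):
        self.data = data
        self.filter = filter
        self.decode_parms = decode_parms

    def replace_stream_data(self, data, filter=_UNSET, decode_parms=_UNSET):
        # Only arguments that were actually supplied overwrite state,
        # so callers can set the data before or after the dictionary.
        self.data = data
        if filter is not _UNSET:
            self.filter = filter
        if decode_parms is not _UNSET:
            self.decode_parms = decode_parms
```

A plain default of `None` would not work here, because `None` (no filter) is itself a meaningful value to set; the sentinel distinguishes "not passed" from "passed as None".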
2022-05-20  Back out fluent QPDFObjectHandle methods. Keep the andGet methods.  (Jay Berkenbilt)

    I decided these were confusing and inconsistent with how JSON
    works. They muddle the API rather than improving it.
2022-05-20  Parse objects; stream data is not yet handled  (Jay Berkenbilt)
2022-05-20  Add new error type for JSON  (Jay Berkenbilt)
2022-05-20  Add private methods for reserving specific objects  (Jay Berkenbilt)
2022-05-16  Implement top-level qpdf json parsing  (Jay Berkenbilt)
2022-05-16  Add scaffolding for QPDF JSON reactor  (Jay Berkenbilt)
2022-05-16  Add --create-from-json and --update-from-json arguments  (Jay Berkenbilt)

    Also add stubs for top-level QPDF methods (createFromJSON,
    updateFromJSON).
2022-05-16  TODO: solidify work for JSON to PDF  (Jay Berkenbilt)
2022-05-16  Remove offset from missing /Root error  (Jay Berkenbilt)

    The last offset is irrelevant to not being able to find /Root.
2022-05-16  Improve handling of replacing stream data with empty strings  (Jay Berkenbilt)

    When an empty string was passed to replaceStreamData, the code was
    passing a null pointer to memcpy. Since a 0 size was also passed,
    this was harmless, but it triggers sanitizer errors. The code
    properly handles a null pointer as the buffer in other places.
2022-05-16  Add QUtil::is_long_long  (Jay Berkenbilt)
2022-05-14  Split qpdf.test into multiple test suites  (Jay Berkenbilt)

    This makes it a lot easier to run parts of the test suite.

2022-05-14  Update qtest-driver to log invalid tests  (Jay Berkenbilt)

    This is taken from an unreleased change to qtest.
2022-05-14  JSON reactor: improve handling of nested containers  (Jay Berkenbilt)

    Call the parent container's item method before calling the child
    item's start method so we can easily know the current nesting
    level when nested items are added.
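The ordering rule in this commit, invoking the parent container's item hook before the child container's start hook, can be illustrated with a toy event-logging reactor (illustrative only, not qpdf's JSON reactor):

```python
class DepthReactor:
    """Log reactor events together with the nesting depth at the time."""

    def __init__(self):
        self.depth = 0
        self.events = []

    def item(self, key):
        self.events.append(('item', self.depth, key))

    def start(self):
        self.events.append(('start', self.depth))
        self.depth += 1

    def end(self):
        self.depth -= 1
        self.events.append(('end', self.depth))


def walk(reactor, node, key=None):
    if key is not None:
        reactor.item(key)        # parent's item hook fires first...
    if isinstance(node, dict):
        reactor.start()          # ...then the child container's start
        for k, v in node.items():
            walk(reactor, v, k)
        reactor.end()
```

Because `item` runs before the child's `start` increments the depth, each item event carries the depth at which the item is being added, which is the property the commit message describes.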
2022-05-08  Add maxobjectid to JSON  (Jay Berkenbilt)
2022-05-08  TODO note about linux binary distribution runpath  (Jay Berkenbilt)
2022-05-08  Add --to-json option  (Jay Berkenbilt)
2022-05-08  Test inline stream data with different decode levels  (Jay Berkenbilt)
2022-05-08  Test json v2 with invalid stream data  (Jay Berkenbilt)
2022-05-08  Implement JSON v2 output  (Jay Berkenbilt)
2022-05-08  Apply script across future v2 test files  (Jay Berkenbilt)
    There is one unexpected pass in this commit. This script was
    applied to the files changed in this commit:

    ----------
    #!/usr/bin/env python3
    import json
    import sys


    def json_dumps(data):
        return json.dumps(data, ensure_ascii=False,
                          indent=2, separators=(',', ': '))


    for filename in sys.argv[1:]:
        with open(filename, 'r') as f:
            data = json.loads(f.read())
        data['version'] = 2
        objectinfo = {}
        if 'objectinfo' in data:
            objectinfo = data['objectinfo']
            del data['objectinfo']
        if 'objects' not in data:
            continue
        qpdf = {'jsonversion': 2, 'pdfversion': '1.3', 'objects': {}}
        for k, v in data['objects'].items():
            is_stream = objectinfo.get(k, {}).get(
                'stream', {}).get('is', False)
            if k.endswith(' R'):
                k = 'obj:' + k
            if is_stream:
                v = {'stream': {'dict': v}}
            else:
                v = {'value': v}
            qpdf['objects'][k] = v
        data['qpdf'] = qpdf
        del data['objects']
        print(json_dumps(data))
    ----------
2022-05-08  Prepare test suite for json v2  (Jay Berkenbilt)
2022-05-08  Fix typo in json output key name  (Jay Berkenbilt)

    moddify -> modify. Also carefully spell-checked all remaining keys
    by splitting them into words and running a spell checker, not just
    relying on visual proofreading. That was the only one.
2022-05-08  Implement JSON v2 for Stream  (Jay Berkenbilt)

    Not fully exercised in this commit.

2022-05-08  Implement JSON v2 for String  (Jay Berkenbilt)

    Also refine the heuristic for deciding whether to use hexadecimal
    notation for a string.
2022-05-07  Prepare code for JSON v2  (Jay Berkenbilt)

    Update getJSON() methods and calls to them.
2022-05-07  Objectinfo json: write incrementally and in numeric order  (Jay Berkenbilt)

    This script was used on test data:

    ----------
    #!/usr/bin/env python3
    import json
    import sys
    import re


    def json_dumps(data):
        return json.dumps(data, ensure_ascii=False,
                          indent=2, separators=(',', ': '))


    for filename in sys.argv[1:]:
        with open(filename, 'r') as f:
            data = json.loads(f.read())
        if 'objectinfo' not in data:
            continue
        trailer = None
        to_sort = []
        for k, v in data['objectinfo'].items():
            if k == 'trailer':
                trailer = v
            else:
                m = re.match(r'^(\d+) \d+ R', k)
                if m:
                    to_sort.append([int(m.group(1)), k, v])
        newobjectinfo = {x[1]: x[2] for x in sorted(to_sort)}
        if trailer is not None:
            newobjectinfo['trailer'] = trailer
        data['objectinfo'] = newobjectinfo
        print(json_dumps(data))
    ----------
2022-05-07  Objects json: write incrementally and in numeric order  (Jay Berkenbilt)

    The following script was used to adjust test data:

    ----------
    #!/usr/bin/env python3
    import json
    import sys
    import re


    def json_dumps(data):
        return json.dumps(data, ensure_ascii=False,
                          indent=2, separators=(',', ': '))


    for filename in sys.argv[1:]:
        with open(filename, 'r') as f:
            data = json.loads(f.read())
        if 'objects' not in data:
            continue
        trailer = None
        to_sort = []
        for k, v in data['objects'].items():
            if k == 'trailer':
                trailer = v
            else:
                m = re.match(r'^(\d+) \d+ R', k)
                if m:
                    to_sort.append([int(m.group(1)), k, v])
        newobjects = {x[1]: x[2] for x in sorted(to_sort)}
        if trailer is not None:
            newobjects['trailer'] = trailer
        data['objects'] = newobjects
        print(json_dumps(data))
    ----------
2022-05-07  Pages json: write each page incrementally  (Jay Berkenbilt)