Metadata JSON API
BuildVu writes out a range of metadata which includes:
- data about the document (such as the page count and page bounds)
- data from the PDF file (such as the title, author, subject, bookmarks & page labels)
- data about the output (such as page and thumbnail types).
In IDRViewer modes this information is stored in the config.js file; in content mode it is stored in the properties.json file. Both files are written to the base level of the output directory.
Annotations are stored in a separate file (annotations.json) and are documented separately.
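As a minimal sketch of consuming this metadata in content mode, the following Python reads properties.json from an output directory. It assumes the file is a single JSON object whose top-level keys are the entries listed below; the function name `load_properties` and the directory argument are illustrative and not part of BuildVu. The layout of config.js is not described here, so the sketch does not attempt to read it.

```python
import json
from pathlib import Path


def load_properties(output_dir):
    """Load properties.json from the base level of a BuildVu output directory.

    Assumption: the file holds one JSON object keyed by the entry names
    documented on this page (pagecount, bounds, title, ...).
    """
    path = Path(output_dir) / "properties.json"
    with path.open(encoding="utf-8") as handle:
        return json.load(handle)
```

A caller might then read `load_properties("output/my-document")["pagecount"]`, where the directory path is hypothetical.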
List of Entries
- pagecount
- fileName
- bounds
- bookmarks
- thumbnailType
- pageType
- pageLabels
- title
- author
- subject
- keywords
- creator
- producer
- creationdate
- moddate
pagecount
Type: integer
Value: The number of pages in the document, e.g. 15
fileName
Type: text string
Value: The filename of the PDF being converted, e.g. “my-document.pdf”
bounds
Type: array
Value: The width and height of the pages, e.g.
[[909,1286],[909,1286],[909,1286]]
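One possible use of bounds is to check whether every page shares the same size, which can simplify layout code. The sketch below assumes each inner array is a [width, height] pair as described above; the helper name `uniform_page_size` is hypothetical.

```python
def uniform_page_size(bounds):
    """Return (width, height) if every page has the same size, else None.

    Assumption: each entry of bounds is a [width, height] pair.
    An empty bounds array also yields None.
    """
    sizes = {tuple(size) for size in bounds}
    if len(sizes) == 1:
        return next(iter(sizes))
    return None
```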
bookmarks
Type: array
Value: An array containing the bookmarks from the PDF file (described below).
thumbnailType
Type: text string
Value: The file type of thumbnails in the /thumbnails directory (if generated), e.g. “jpg” or “png”
pageType
Type: text string
Value: The type of the pages generated, e.g. “html”, “svg” or “svgz”
pageLabels
Type: array of text strings
Value: If the PDF document does not specify page labels, this will be an empty array; otherwise it will be an array of length pagecount containing the page labels, e.g.
["i", "ii", "iii", "iv", "1", "2", "3"]
or
["Cover", "1", "2", "3"]
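Because pageLabels may be empty, code that displays labels needs a fallback. The following sketch falls back to the plain page number. It assumes the array is ordered by page with the first page at index 0, and that callers pass 1-based page numbers; both are assumptions, and `label_for_page` is an illustrative name.

```python
def label_for_page(metadata, page):
    """Return the display label for a 1-based page number.

    Assumption: pageLabels[0] belongs to the first page. When the
    document defines no labels (empty array), the page number itself
    is returned as a string.
    """
    labels = metadata.get("pageLabels") or []
    if labels:
        return labels[page - 1]
    return str(page)
```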
title
Type: text string
Value: The PDF document’s title (if provided).
author
Type: text string
Value: The name of the person who created the PDF document (if provided).
subject
Type: text string
Value: The subject of the PDF document (if provided).
keywords
Type: text string
Value: Keywords associated with the PDF document (if provided).
creator
Type: text string
Value: If the document was converted to PDF from another format, the name of the conforming product that created the original document from which it was converted (if provided).
producer
Type: text string
Value: If the document was converted to PDF from another format, the name of the conforming product that converted it to PDF (if provided).
creationdate
Type: date
Value: The date and time the PDF document was created, in human-readable form (if provided).
moddate
Type: date
Value: The date and time the PDF document was most recently modified, in human-readable form (if provided).
Bookmarks
Bookmarks (or outlines) act like a contents list for the document and may be nested recursively. They link a text string to a page, and optionally may scroll to a given offset on the page.
Note that coordinates use a bottom-left origin.
EXAMPLE
[
{
"title": "1 Scope",
"page": 9,
"zoom": "XYZ 108 974 null"
}, {
"title": "2 Conformance",
"page": 9,
"zoom": "XYZ 108 577 null",
"children": [{
"title": "2.1 General",
"page": 9,
"zoom": "XYZ 108 534 null"
}, {
"title": "2.2 Conforming readers",
"page": 9,
"zoom": "XYZ 108 388 null"
}
]
}, {
"title": "3 Normative references",
"page": 10,
"zoom": "XYZ 55 1035 null"
}, {
"title": "4 Terms and definitions",
"page": 14,
"zoom": "XYZ 55 1162 null"
}, {
"title": "5 Notation",
"page": 18,
"zoom": ""
}
]
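The nested structure above can be walked recursively. The sketch below flattens a bookmarks array into (depth, title, page, zoom) tuples and splits a zoom string into its numeric offsets. Reading the zoom value as "XYZ x y scale" is inferred from the example and is an assumption, as is the conversion to a top-left origin, which only holds if the offsets use the same units as the page height in bounds. All function names are illustrative.

```python
def flatten_bookmarks(bookmarks, depth=0):
    """Yield (depth, title, page, zoom) for each bookmark, parents first."""
    for entry in bookmarks:
        yield depth, entry["title"], entry["page"], entry.get("zoom", "")
        # "children" is optional; recurse only into what is present.
        yield from flatten_bookmarks(entry.get("children", []), depth + 1)


def parse_xyz(zoom):
    """Return (x, y) from an "XYZ x y scale" string, or None otherwise.

    Assumption: the string has four space-separated parts and "null"
    marks an unspecified value. An empty zoom string returns None.
    """
    parts = zoom.split()
    if len(parts) != 4 or parts[0] != "XYZ":
        return None

    def number(token):
        return None if token == "null" else float(token)

    return number(parts[1]), number(parts[2])


def y_from_top(y, page_height):
    """Convert a bottom-left-origin y offset to a top-left-origin offset.

    Assumption: y and page_height are expressed in the same units.
    """
    return page_height - y
```

For instance, a table of contents could indent each title by its depth and use `parse_xyz` to decide where to scroll once the target page is shown.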
Dates
A date shall be a text string of the form
(D:YYYYMMDDHHmmSSOHH'mm)
where:
- YYYY shall be the year
- MM shall be the month (01–12)
- DD shall be the day (01–31)
- HH shall be the hour (00–23)
- mm shall be the minute (00–59)
- SS shall be the second (00–59)
- O shall be the relationship of local time to Universal Time (UT), and shall be denoted by one of the characters PLUS SIGN (U+002B) (+), HYPHEN-MINUS (U+002D) (-), or LATIN CAPITAL LETTER Z (U+005A) (Z) (see below)
- HH followed by APOSTROPHE (U+0027) (') shall be the absolute value of the offset from UT in hours (00–23)
- mm shall be the absolute value of the offset from UT in minutes (00–59)
The prefix D: shall be present, the year field (YYYY) shall be present and all other fields may be present but only if all of their preceding fields are also present. The APOSTROPHE following the hour offset field (HH) shall only be present if the HH field is present. The minute offset field (mm) shall only be present if the APOSTROPHE following the hour offset field (HH) is present.
The default values for MM and DD shall be both 01; all other numerical fields shall default to zero values.
A PLUS SIGN as the value of the O field signifies that local time is later than UT, a HYPHEN-MINUS signifies that local time is earlier than UT, and the LATIN CAPITAL LETTER Z signifies that local time is equal to UT. If no UT information is specified, the relationship of the specified time to UT shall be considered to be GMT. Regardless of whether the time zone is specified, the rest of the date shall be specified in local time.
EXAMPLE
December 23, 1998, at 7:52 PM, U.S. Pacific Standard Time, is represented by the string D:199812231952-08'00
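The rules above can be sketched as a small parser for strings in the D: form. This is an illustration, not BuildVu's own implementation: the function name `parse_pdf_date` is hypothetical, it tolerates optional enclosing parentheses and a trailing apostrophe after the minute offset, and it treats a missing time zone as UT. Since creationdate and moddate are described as human-readable, the values in the JSON file may not be in this raw form.

```python
import re
from datetime import datetime, timedelta, timezone

# D:YYYY is mandatory; each later field is optional and two digits wide.
_PDF_DATE = re.compile(
    r"^D:(?P<year>\d{4})"
    r"(?P<month>\d{2})?(?P<day>\d{2})?"
    r"(?P<hour>\d{2})?(?P<minute>\d{2})?(?P<second>\d{2})?"
    r"(?:(?P<sign>[+\-Z])(?:(?P<off_h>\d{2})(?:'(?P<off_m>\d{2})?'?)?)?)?$"
)


def parse_pdf_date(text):
    """Parse a date string of the form described above into an aware datetime."""
    text = text.strip()
    # Assumption: accept the string with or without enclosing parentheses.
    if text.startswith("(") and text.endswith(")"):
        text = text[1:-1]
    match = _PDF_DATE.match(text)
    if match is None:
        raise ValueError(f"not a PDF date string: {text!r}")
    fields = match.groupdict()

    sign = fields["sign"]
    if sign in (None, "Z"):
        # No offset given, or Z: treat the time as UT.
        tz = timezone.utc
    else:
        offset = timedelta(
            hours=int(fields["off_h"] or 0),
            minutes=int(fields["off_m"] or 0),
        )
        tz = timezone(offset if sign == "+" else -offset)

    return datetime(
        int(fields["year"]),
        int(fields["month"] or 1),   # MM defaults to 01
        int(fields["day"] or 1),     # DD defaults to 01
        int(fields["hour"] or 0),    # remaining fields default to zero
        int(fields["minute"] or 0),
        int(fields["second"] or 0),
        tzinfo=tz,
    )
```

Calling `parse_pdf_date("D:199812231952-08'00")` yields a datetime for 19:52 on December 23, 1998, with a UTC offset of minus eight hours.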