Metadata JSON API

BuildVu writes out a range of metadata which includes:

  • data about the document (such as the page count and page bounds)
  • data from the PDF file (such as the title, author, subject, bookmarks & page labels)
  • data about the output (such as page and thumbnail types).

In IDRViewer modes this information is in the config.js file; in content mode it is in the properties.json file. Both files are stored at the base level of the output directory.
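As a minimal sketch in Python, the content-mode metadata could be read as follows. It assumes properties.json is a plain JSON object whose keys match the entries listed on this page; the exact file layout is not shown here, and the function name load_properties is hypothetical.

```python
import json
from pathlib import Path


def load_properties(output_dir):
    """Load properties.json from the base level of a content-mode output directory.

    Assumption: the file holds a single JSON object keyed by the entry names
    (pagecount, fileName, bounds, ...).
    """
    path = Path(output_dir) / "properties.json"
    with path.open(encoding="utf-8") as handle:
        return json.load(handle)
```

Given an output directory, load_properties(output_dir)["pagecount"] would then return the page count as an integer.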

Annotations are stored in a separate file (annotations.json) and are documented separately.

List of Entries

  1. pagecount
  2. fileName
  3. bounds
  4. bookmarks
  5. thumbnailType
  6. pageType
  7. pageLabels
  8. title
  9. author
  10. subject
  11. keywords
  12. creator
  13. producer
  14. creationdate
  15. moddate

pagecount

Type: integer

Value: The number of pages in the document, e.g. 15


fileName

Type: text string

Value: The filename of the PDF being converted, e.g. “my-document.pdf”


bounds

Type: array

Value: The width and height of the pages, e.g.

[[909,1286],[909,1286],[909,1286]]
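The sketch below pairs each bounds entry with its page. It assumes the array holds one [width, height] pair per page, in page order, and that pages are numbered from 1; the helper name page_sizes is hypothetical.

```python
def page_sizes(bounds):
    """Return one dict per page with its 1-based page number, width and height.

    Assumption: bounds[i] is the [width, height] pair of page i + 1.
    """
    return [
        {"page": number, "width": width, "height": height}
        for number, (width, height) in enumerate(bounds, start=1)
    ]


bounds = [[909, 1286], [909, 1286], [909, 1286]]
sizes = page_sizes(bounds)
print(sizes[0])
```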

bookmarks

Type: array

Value: An array containing the bookmarks from the PDF file (described below).


thumbnailType

Type: text string

Value: The file type of thumbnails in the /thumbnails directory (if generated), e.g. “jpg” or “png”


pageType

Type: text string

Value: The type of the pages generated, e.g. “html”, “svg” or “svgz”


pageLabels

Type: array of text strings

Value: If the PDF document does not specify page labels this will be an empty array, otherwise this will be an array of length pagecount which contains the page labels, e.g.

["i", "ii", "iii", "iv", "1", "2", "3"]

or

["Cover", "1", "2", "3"]
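Because pageLabels may be empty, code that displays labels needs a fallback. The sketch below is one way to handle both cases; it assumes labels are stored in page order and that page numbers start at 1, and the function name label_for_page is hypothetical.

```python
def label_for_page(page_number, page_labels):
    """Return the label for a 1-based page number.

    Falls back to the page number itself when the document has no page
    labels (an empty array).
    """
    if not page_labels:
        return str(page_number)
    return page_labels[page_number - 1]


labels = ["i", "ii", "iii", "iv", "1", "2", "3"]
print(label_for_page(5, labels))
print(label_for_page(5, []))
```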

title

Type: text string

Value: The PDF document’s title (if provided).


author

Type: text string

Value: The name of the person who created the PDF document (if provided).


subject

Type: text string

Value: The subject of the PDF document (if provided).


keywords

Type: text string

Value: Keywords associated with the PDF document (if provided).


creator

Type: text string

Value: If the document was converted to PDF from another format, the name of the conforming product that created the original document from which it was converted (if provided).


producer

Type: text string

Value: If the document was converted to PDF from another format, the name of the conforming product that converted it to PDF (if provided).


creationdate

Type: date

Value: The date and time the PDF document was created, in human-readable form (if provided).


moddate

Type: date

Value: The date and time the PDF document was most recently modified, in human-readable form (if provided).


Bookmarks

Bookmarks (or outlines) act like a contents list for the document and may be nested recursively. They link a text string to a page, and optionally may scroll to a given offset on the page.

Note that coordinates are measured from a bottom-left origin.

EXAMPLE

[
  {
    "title": "1 Scope",
    "page": 9,
    "zoom": "XYZ 108 974 null"
  }, {
    "title": "2 Conformance",
    "page": 9,
    "zoom": "XYZ 108 577 null",
    "children": [
      {
        "title": "2.1 General",
        "page": 9,
        "zoom": "XYZ 108 534 null"
      }, {
        "title": "2.2 Conforming readers",
        "page": 9,
        "zoom": "XYZ 108 388 null"
      }
    ]
  }, {
    "title": "3 Normative references",
    "page": 10,
    "zoom": "XYZ 55 1035 null"
  }, {
    "title": "4 Terms and definitions",
    "page": 14,
    "zoom": "XYZ 55 1162 null"
  }, {
    "title": "5 Notation",
    "page": 18,
    "zoom": ""
  }
]
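Since bookmarks nest through the children key, a recursive walk is a natural way to turn them into a flat contents list. The sketch below does that and also splits the zoom string. The zoom layout is not defined on this page; the code assumes it follows the PDF "XYZ left top zoom" destination form, with "null" meaning "unchanged". Both function names are hypothetical.

```python
def flatten_bookmarks(bookmarks, depth=0):
    """Yield (depth, title, page) for every bookmark, visiting children depth-first."""
    for entry in bookmarks:
        yield depth, entry["title"], entry["page"]
        yield from flatten_bookmarks(entry.get("children", []), depth + 1)


def parse_xyz_zoom(zoom):
    """Split a zoom string such as "XYZ 108 974 null" into (left, top, zoom).

    Assumption: the string uses the PDF XYZ destination form. Returns None for
    an empty string or any other form; "null" parts are returned as None.
    """
    parts = zoom.split()
    if len(parts) != 4 or parts[0] != "XYZ":
        return None
    return tuple(None if part == "null" else float(part) for part in parts[1:])


bookmarks = [
    {"title": "1 Scope", "page": 9, "zoom": "XYZ 108 974 null"},
    {
        "title": "2 Conformance",
        "page": 9,
        "zoom": "XYZ 108 577 null",
        "children": [
            {"title": "2.1 General", "page": 9, "zoom": "XYZ 108 534 null"},
        ],
    },
    {"title": "5 Notation", "page": 18, "zoom": ""},
]

# Print an indented contents list, two spaces per nesting level.
for depth, title, page in flatten_bookmarks(bookmarks):
    print("  " * depth + f"{title} (page {page})")
```

Because the coordinates use a bottom-left origin, a viewer that scrolls from the top of the page would still need to subtract the top value from the page height before using it.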

Dates

A date shall be a text string of the form

(D:YYYYMMDDHHmmSSOHH'mm)

where:

  • YYYY shall be the year
  • MM shall be the month (01–12)
  • DD shall be the day (01–31)
  • HH shall be the hour (00–23)
  • mm shall be the minute (00–59)
  • SS shall be the second (00–59)
  • O shall be the relationship of local time to Universal Time (UT), and shall be denoted by one of the characters PLUS SIGN (U+002B) (+), HYPHEN-MINUS (U+002D) (-), or LATIN CAPITAL LETTER Z (U+005A) (Z) (see below)
  • HH followed by APOSTROPHE (U+0027) (') shall be the absolute value of the offset from UT in hours (00–23)
  • mm shall be the absolute value of the offset from UT in minutes (00–59)

The prefix D: shall be present, the year field (YYYY) shall be present and all other fields may be present but only if all of their preceding fields are also present. The APOSTROPHE following the hour offset field (HH) shall only be present if the HH field is present. The minute offset field (mm) shall only be present if the APOSTROPHE following the hour offset field (HH) is present. The default values for MM and DD shall be both 01; all other numerical fields shall default to zero values.

A PLUS SIGN as the value of the O field signifies that local time is later than UT, a HYPHEN-MINUS signifies that local time is earlier than UT, and the LATIN CAPITAL LETTER Z signifies that local time is equal to UT. If no UT information is specified, the relationship of the specified time to UT shall be considered to be GMT. Regardless of whether the time zone is specified, the rest of the date shall be specified in local time.

EXAMPLE

December 23, 1998, at 7:52 PM, U.S. Pacific Standard Time, is represented by the string D:199812231952-08'00
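Where a date value arrives in the D: form, the rules above can be sketched as a small parser. This is an illustration under stated assumptions rather than a reference implementation: the name parse_pdf_date is hypothetical, the string is expected without surrounding parentheses and with a plain APOSTROPHE (U+0027), and a missing O field is treated as UT.

```python
import re
from datetime import datetime, timedelta, timezone

# D:YYYY is required; each later field is optional and may only appear
# when all of the fields before it are present.
PDF_DATE = re.compile(
    r"^D:(\d{4})(\d{2})?(\d{2})?(\d{2})?(\d{2})?(\d{2})?"
    r"(?:([+\-Z])(?:(\d{2})(?:'(\d{2})?)?)?)?$"
)


def parse_pdf_date(value):
    """Parse a D:YYYYMMDDHHmmSSOHH'mm string into a timezone-aware datetime."""
    match = PDF_DATE.match(value)
    if match is None:
        raise ValueError(f"not a PDF date string: {value!r}")
    year, month, day, hour, minute, second, sign, off_h, off_m = match.groups()
    # Offset fields default to zero; "-" means local time is earlier than UT.
    offset = timedelta(hours=int(off_h or 0), minutes=int(off_m or 0))
    if sign == "-":
        offset = -offset
    # MM and DD default to 01; the other numerical fields default to zero.
    return datetime(
        int(year), int(month or 1), int(day or 1),
        int(hour or 0), int(minute or 0), int(second or 0),
        tzinfo=timezone(offset),
    )


print(parse_pdf_date("D:199812231952-08'00").isoformat())
# prints 1998-12-23T19:52:00-08:00
```

Out-of-range values such as month 13 raise ValueError from the datetime constructor, so malformed strings fail loudly instead of producing a wrong date.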

