Extracting Metadata from a PDF on the Command Line

JPedal can extract metadata from a PDF and output it as a JSON object for reuse elsewhere.

You can extract all of this data, or only selected parts in any order, from a file using the following command.

java -jar jpedal.jar --metadata inputFile.pdf [metaDataType]...

The data available in this way, and the metaDataType value that selects each item, are as follows.

  • Document MetaData fields - fields
  • MetaData XML - xml
  • Page size data - pagesizes
  • Document Bookmarks/Outline - outline
  • Document font list - fonts
  • Document page count - pagecount

The valid values for metaDataType are any combination of the values listed above (fields, xml, pagesizes, outline, fonts, pagecount), separated by spaces. If you request the same type more than once in a single command, it is output only once.

If no metaDataType is given, all of the types are output.
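As an illustration of selecting specific types, the following is a minimal sketch of a shell wrapper around the command shown earlier. The function name extract_metadata and the JAVA_CMD variable are hypothetical helpers introduced for this example, not part of JPedal; only the --metadata option and the type names come from this page.

```shell
#!/usr/bin/env bash
# JAVA_CMD is a hypothetical override hook (not a JPedal setting).
# It defaults to "java"; setting it to "echo" gives a dry run that
# prints the arguments instead of launching JPedal.
JAVA_CMD="${JAVA_CMD:-java}"

# extract_metadata PDF [metaDataType]...
# Runs the documented command for one PDF, passing through any
# requested types (fields, xml, pagesizes, outline, fonts, pagecount).
# With no types given, JPedal falls back to the full list.
extract_metadata() {
  local pdf="$1"
  shift
  "$JAVA_CMD" -jar jpedal.jar --metadata "$pdf" "$@"
}
```

For example, extract_metadata inputFile.pdf fonts pagecount would request only the font list and the page count, while extract_metadata inputFile.pdf would request everything. The sketch assumes jpedal.jar is in the current directory, as in the command above.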

Redirecting the data

If you wish to save this information for use elsewhere, you can redirect the output to a file in a batch or bash script with the following.

java -jar jpedal.jar --metadata inputFile.pdf > outputFile.txt
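The same redirection can be applied to a whole folder of files. The sketch below assumes a bash environment and writes one output file per PDF. The function name extract_all, the JAVA_CMD variable, and the .json file extension are choices made for this example rather than anything JPedal requires.

```shell
#!/usr/bin/env bash
# JAVA_CMD is a hypothetical override hook (not a JPedal setting);
# set it to "echo" to preview the commands without running JPedal.
JAVA_CMD="${JAVA_CMD:-java}"

# extract_all DIR
# Redirects the metadata output for every PDF in DIR to a file with
# the same base name and a .json extension (an illustrative choice).
extract_all() {
  local dir="$1"
  local pdf
  for pdf in "$dir"/*.pdf; do
    # Skip the literal pattern when the folder contains no PDFs.
    [ -e "$pdf" ] || continue
    "$JAVA_CMD" -jar jpedal.jar --metadata "$pdf" > "${pdf%.pdf}.json"
  done
}
```

Calling extract_all ./documents would then leave a report.json next to each report.pdf. The sketch assumes jpedal.jar is in the current directory.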

This functionality uses the JPedal PDFUtilities class.

