Extracting Metadata from a PDF on the Command Line

JPedal can extract metadata from a PDF as a JSON object for reuse elsewhere.

All or part of this data can be extracted from a file, in any order, with the following command:

java -jar jpedal.jar --metadata filename.pdf [metaDataType]...

The data available in this way, and the corresponding metaDataType values, are as follows.

Document MetaData fields - fields
MetaData XML - xml
Page size data - pagesizes
Document Bookmarks/Outline - outline
Document font list - fonts
Document page count - pagecount

The valid values for metaDataType are any combination of the values listed above (the word after the dash on each line), separated by a space character. If you request the same type multiple times in a single command, it will only be output once.

If no metaDataType is given, all of the types listed above are output.
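As an illustration of combining types, the following is a minimal sketch of a shell wrapper around the command above. The function name pdf_metadata and the JPEDAL_JAR variable are hypothetical conveniences, not part of JPedal, and the sketch assumes java is on the PATH.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of JPedal): wraps the --metadata command.
# JPEDAL_JAR is an assumed variable naming the jar; it defaults to jpedal.jar.
JPEDAL_JAR="${JPEDAL_JAR:-jpedal.jar}"

# Usage: pdf_metadata <file.pdf> [metaDataType]...
pdf_metadata() {
    if [ "$#" -lt 1 ]; then
        echo "usage: pdf_metadata <file.pdf> [metaDataType]..." >&2
        return 2
    fi
    local pdf="$1"
    shift
    # Any remaining arguments are passed through unchanged as metaDataType
    # values; with none given, the command is run without any types.
    java -jar "$JPEDAL_JAR" --metadata "$pdf" "$@"
}
```

With this in place, pdf_metadata filename.pdf fonts pagecount would request only the font list and page count, while pdf_metadata filename.pdf would request everything.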

Piping the data

If you wish to save this information for use elsewhere, you can redirect the output to a file in batch or bash scripts with the following command:

java -jar jpedal.jar --metadata filename.pdf > output.txt

This functionality uses the JPedal PDFUtilities class.
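For the script use case mentioned above, the redirect can be applied to several files in a loop. This is only a sketch: the function name export_metadata and the JPEDAL_JAR variable are hypothetical, and it assumes that the command writes nothing but the metadata to standard output.

```shell
#!/usr/bin/env bash
# Hypothetical batch helper (not part of JPedal): writes the metadata of
# every PDF in a directory to a .txt file next to it.
# Usage: export_metadata <directory>
export_metadata() {
    local dir="$1"
    local pdf
    for pdf in "$dir"/*.pdf; do
        # Skip the unexpanded pattern when the directory holds no PDFs.
        [ -e "$pdf" ] || continue
        # Same redirect as above, with the output named after the input file.
        java -jar "${JPEDAL_JAR:-jpedal.jar}" --metadata "$pdf" > "${pdf%.pdf}.txt"
    done
}
```

Calling export_metadata ./docs would then leave one .txt file beside each PDF in ./docs.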
