Extract document outline from any PDF file

JPedal provides several methods to extract textual content from a PDF file. A PDF file can contain an optional document outline object: a table of contents whose entries carry titles and link to pages, with control over the zoom level and the exact area to display. If an outline is present, the code on this page extracts it to an XML file. If there is no outline, no file is created.

Extract Outline from PDF from Command Line or another language

java -jar jpedal.jar --metadata "inputFile.pdf" outline

This will output the outline data to the console as a JSON object string.

Example to access API methods

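// Open the PDF, read its outline, then close the file
ExtractOutline extract = new ExtractOutline("inputFile.pdf");
//extract.setPassword("password"); // uncomment for password-protected files
if (extract.openPDFFile()) {
    // The outline data for the opened file
    Document pdfOutline = extract.getPDFTextOutline();
}
extract.closePDFfile();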
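
Once the outline has been retrieved, a common next step is to walk its tree. The sketch below assumes that the `Document` returned by `getPDFTextOutline()` is a standard `org.w3c.dom.Document`; this page does not confirm that. It uses only JDK DOM classes and prints every element with its attributes, indented by nesting depth. The element and attribute names in the sample XML (`outline`, `entry`, `title`) are hypothetical placeholders, not JPedal's actual schema.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class OutlineWalker {

    /** Returns one line per element, indented two spaces per nesting level. */
    static String render(Node node, int depth) {
        StringBuilder out = new StringBuilder();
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            for (int i = 0; i < depth; i++) {
                out.append("  ");
            }
            out.append(node.getNodeName());
            // Append each attribute as name=value
            NamedNodeMap attrs = node.getAttributes();
            for (int i = 0; i < attrs.getLength(); i++) {
                Node attr = attrs.item(i);
                out.append(' ').append(attr.getNodeName())
                   .append('=').append(attr.getNodeValue());
            }
            out.append('\n');
            depth++; // children of an element are indented one level deeper
        }
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            out.append(render(child, depth));
        }
        return out.toString();
    }

    /** Parses an XML string; stands in for the Document obtained from JPedal. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical sample; real outline XML may use different names
        String sample = "<outline><entry title=\"Intro\">"
                + "<entry title=\"Scope\"/></entry></outline>";
        System.out.print(render(parse(sample), 0));
    }
}
```

In real use, the `Document` from `getPDFTextOutline()` would be passed to `render` in place of the parsed sample, provided the type assumption holds.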

Extract Outline from PDF in Java

ExtractOutline.writeAllOutlinesToDir("inputFileOrFolder", "outputFolder");

This example uses the JPedal ExtractOutline class, which writes one XML file for each PDF that has an outline. Each file describes the outline entries, including their title, page and initial zoom level.
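
Because no file is written for a PDF without an outline, it can be useful to check which files were produced after a batch run. The following is a minimal sketch using only the JDK. It assumes that the output files end in `.xml`; the exact file naming and folder layout are not stated on this page. `OutlineOutputLister` and `findXmlFiles` are hypothetical names, not part of JPedal.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class OutlineOutputLister {

    /** Returns every regular file ending in .xml under the folder, sorted by path. */
    static List<Path> findXmlFiles(Path outputFolder) {
        try (Stream<Path> paths = Files.walk(outputFolder)) {
            return paths
                    .filter(Files::isRegularFile)
                    // Assumption: outline files carry an .xml extension
                    .filter(p -> p.getFileName().toString().toLowerCase().endsWith(".xml"))
                    .sorted()
                    .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path outputFolder = Paths.get(args.length > 0 ? args[0] : "outputFolder");
        if (!Files.isDirectory(outputFolder)) {
            System.out.println("No output folder found: " + outputFolder);
            return;
        }
        List<Path> xmlFiles = findXmlFiles(outputFolder);
        System.out.println(xmlFiles.size() + " outline file(s) found");
        for (Path file : xmlFiles) {
            System.out.println(file);
        }
    }
}
```

Comparing this list against the input PDFs gives a rough indication of which documents contained an outline.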

