Does this PDF document contain structured text content?

Possibly. It depends on how the file was created.

A PDF file can be created either as a structured PDF (tagged PDF), which contains information on the page structure, or as an unstructured PDF, which contains no structural information and whose content can be in any order. This is decided when the PDF is created, and it is not possible to convert unstructured PDF files into structured PDF files.

You can determine whether a PDF file contains structured content by opening the file in Adobe Reader and viewing the Document Properties. There is an advanced field named Tagged PDF. If the value is Yes, then the file has structured content.

There is an article on our blog with more information on how to find out if your PDF file contains structured content in Acrobat.

The PdfUtilities class also includes a method to test if a PDF file is fully tagged according to the PDF specification.
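For a rough check without any library, the following plain Java sketch scans the raw bytes for an inline /MarkInfo dictionary whose /Marked entry is true, which is how the PDF specification flags a tagged file. This is not the PdfUtilities method mentioned above; the class name TaggedPdfHeuristic and the method looksTagged are hypothetical. It is only a heuristic: it will miss files where the document catalog sits inside a compressed object stream or where /MarkInfo is an indirect reference, and it does not validate the structure tree.

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Pattern;

// Hypothetical helper, not part of JPedal: a text-level heuristic only.
class TaggedPdfHeuristic {

    // An inline /MarkInfo dictionary that contains "/Marked true".
    private static final Pattern MARKED =
            Pattern.compile("/MarkInfo\\s*<<[^>]*/Marked\\s+true");

    // Returns true if the raw bytes contain an uncompressed MarkInfo
    // dictionary with Marked set to true. A false result does not prove
    // the file is untagged (see the limitations described above).
    static boolean looksTagged(byte[] pdfBytes) {
        // ISO-8859-1 maps each byte to exactly one char, so binary
        // sections of the file do not break the decoding.
        String raw = new String(pdfBytes, StandardCharsets.ISO_8859_1);
        return MARKED.matcher(raw).find();
    }
}
```

A typical call would pass the result of java.nio.file.Files.readAllBytes for the file in question. Treat a true result as a hint and confirm it in the Document Properties dialog or with a full PDF parser.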

