
Does this PDF document contain structured text content?

It depends on the file.

A PDF file can be created either as a structured file (tagged PDF), which contains information on the page structure, or as an unstructured file, which contains no structural information and whose content can appear in any order. This is decided when the PDF is created, and it is not possible to convert unstructured PDF files into structured PDF files.

You can determine whether a PDF file contains structured content by opening the file in Adobe Reader and viewing the Document Properties. There is an advanced field named Tagged PDF. If the value is Yes, the file has structured content.

There is an article on our blog with more information on how to check in Acrobat whether your PDF file contains structured content.

The PdfUtilities class also includes a method to test if a PDF file is fully tagged according to the PDF specification.
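
To illustrate what "tagged" means at the file level: the PDF specification marks a tagged document with a MarkInfo dictionary in the document catalog whose Marked entry is true. The sketch below is a rough heuristic that searches the raw bytes for that entry. It is not the PdfUtilities method, and the class and method names (TaggedPdfHeuristic, looksTagged) are hypothetical. It will miss files where the catalog is stored in a compressed object stream or where MarkInfo is an indirect reference, so treat a false result as "unknown" rather than "untagged".

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of JPedal.
final class TaggedPdfHeuristic {

    // Matches an inline MarkInfo dictionary containing "/Marked true",
    // e.g. "/MarkInfo << /Marked true >>" or "/MarkInfo<</Marked true>>".
    private static final Pattern MARKED =
            Pattern.compile("/MarkInfo\\s*<<[^>]*?/Marked\\s+true");

    private TaggedPdfHeuristic() {
    }

    // Returns true if the raw PDF bytes contain an uncompressed
    // MarkInfo dictionary that declares the document as marked (tagged).
    static boolean looksTagged(byte[] pdfBytes) {
        // ISO-8859-1 maps every byte to exactly one char, so binary
        // stream data cannot cause decoding errors during the search.
        String raw = new String(pdfBytes, StandardCharsets.ISO_8859_1);
        return MARKED.matcher(raw).find();
    }
}
```

A typical call would read the file first, for example `TaggedPdfHeuristic.looksTagged(Files.readAllBytes(Paths.get("example.pdf")))`, where example.pdf is a placeholder file name. For a reliable answer, parse the file with a PDF library instead of relying on this byte search.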


Why JPedal?

  • Actively developed commercial library with full support and no third-party dependencies.
  • Processes PDF files up to 3x faster than alternative Java PDF libraries.
  • Simple licensing options and source code access for OEM users.
