
Validating PDFs

We are sometimes asked if it’s possible to validate PDF files, for example as a method to determine if there will be errors in the output.

Whilst it is possible to validate that a PDF file conforms to the PDF specification, for example using the Arlington PDF Model, this is primarily intended for software that produces PDF files rather than software that consumes them.

PDF is a complex format, and we have encountered a lot of “creative” interpretations of the PDF specification over the years. If it works in Adobe Reader then most people assume the PDF file is valid (even when it’s not). For software that consumes PDF files, it’s important to handle issues gracefully.

It is a very similar situation to HTML, where the majority of web pages would likely fail validation, yet web browsers still render them in an entirely acceptable manner. A large proportion of PDF files would also fail validation but are still rendered well.

BuildVu makes a best-effort attempt at converting PDF documents. If there is a critical error that prevents conversion from continuing, then BuildVu will throw a PdfException (if running from Java) and set a non-zero exit status (if running from the command line).
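
For command-line use, the exit status is the signal to check. The following is a minimal sketch of a wrapper that launches a command and reports whether it finished cleanly. It is not part of BuildVu: the class name ConversionRunner and its methods are hypothetical helpers introduced only for illustration, and the sketch assumes nothing about BuildVu beyond the non-zero exit status described above.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.List;

// Hypothetical helper, not part of the BuildVu API.
final class ConversionRunner {

    private ConversionRunner() {
    }

    /**
     * Runs the given command, forwards its output to this process's
     * console, waits for it to finish and returns its exit status.
     */
    static int run(List<String> command) {
        try {
            Process process = new ProcessBuilder(command)
                    .inheritIO()
                    .start();
            return process.waitFor();
        } catch (IOException e) {
            // The command could not be started at all.
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            // Restore the interrupt flag before reporting the failure.
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for the command", e);
        }
    }

    /** A zero exit status is treated as success, anything else as failure. */
    static boolean succeeded(int exitStatus) {
        return exitStatus == 0;
    }
}
```

A caller might pass something like List.of("java", "-jar", "buildvu-html.jar", "input.pdf", "output/") to run and then log or retry when succeeded returns false. The jar name and both paths in that command are assumptions, so check them against your own installation.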

When we receive bug reports, the cause is often a PDF file that does not follow the PDF specification in the way we expect. Most of the time we are able to add a workaround for the issue.

