Extract structured text as ePUB

v2025.04

PDF files that contain structured text (also known as marked content or tagged PDFs) can be processed by JPedal and converted into the ePUB file format.

To convert a tagged PDF to ePUB:

final String password = null; //null is used when no password required
final ErrorTracker tracker = null; //ErrorTracker implementations can be used to monitor extraction
final ExtractStructuredTextProperties properties = new ExtractStructuredTextProperties();
properties.setFileOutputMode(OutputModes.EPUB);
properties.setEpubTitle("My EPUB");

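// Write the structured text only
ExtractStructuredText.
        writeAllStructuredTextOutlinesToDir("inputFileOrFolder", password, "outputFolder", tracker, properties);

// Alternatively, write the structured text together with its figures.
// The final argument is the image format, which must be JPEG or PNG for ePUB output.
ExtractStructuredText.
        writeAllStructuredTextOutlinesAndFiguresToDir("inputFileOrFolder", password, "outputFolder", tracker, properties, "figuresFolder", "imageFormatJpegOrPngOnly");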

ePUB 3.0 is the only supported version.
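As a quick sanity check after conversion, the sketch below tests whether a file looks like an ePUB container. It is not part of JPedal. It relies on the ePUB container rule that the first entry in the ZIP archive is named mimetype and contains application/epub+zip, and it assumes the file written by JPedal follows that rule. The class name EpubContainerCheck and the method looksLikeEpub are hypothetical names chosen for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper, not part of the JPedal API.
final class EpubContainerCheck {

    // Media type an ePUB container declares in its "mimetype" entry
    static final String EPUB_MIMETYPE = "application/epub+zip";

    // Returns true when the first ZIP entry is "mimetype" and holds the ePUB media type.
    // Returns false for non-ZIP files and for ZIPs whose first entry is something else.
    static boolean looksLikeEpub(final Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file);
             ZipInputStream zip = new ZipInputStream(in)) {
            final ZipEntry first = zip.getNextEntry();
            if (first == null || !"mimetype".equals(first.getName())) {
                return false;
            }
            // Reads only the bytes of the current entry
            final String content = new String(zip.readAllBytes(), StandardCharsets.US_ASCII);
            // Lenient: tolerates trailing whitespace, which a strict validator would reject
            return EPUB_MIMETYPE.equals(content.trim());
        }
    }
}
```

A call such as EpubContainerCheck.looksLikeEpub(Path.of("outputFolder", "example.epub")) could follow the conversion; the file name here is a placeholder, since the name JPedal gives the output file is not stated on this page. This only inspects the container and does not validate the ePUB 3.0 content itself.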

JPEG and PNG are the only supported image file formats for ePUB.
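Because only two image formats are accepted, it can help to check the format string before starting a conversion. The sketch below is a minimal guard and is not part of JPedal. The class EpubImageFormat and its methods are hypothetical, and the spellings "jpg", "jpeg" and "png" are an assumption, since this page does not state the exact strings JPedal expects.

```java
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of the JPedal API.
final class EpubImageFormat {

    // Assumed spellings; the exact strings JPedal accepts are not stated on this page
    private static final Set<String> SUPPORTED = Set.of("jpg", "jpeg", "png");

    // Case-insensitive check that ignores surrounding whitespace; null is not supported
    static boolean isSupported(final String format) {
        return format != null
                && SUPPORTED.contains(format.trim().toLowerCase(Locale.ROOT));
    }

    // Returns the format unchanged, or throws if it is not JPEG or PNG
    static String requireSupported(final String format) {
        if (!isSupported(format)) {
            throw new IllegalArgumentException(
                    "Unsupported image format for ePUB (expected JPEG or PNG): " + format);
        }
        return format;
    }
}
```

The value returned by EpubImageFormat.requireSupported(...) could then be passed as the image format argument, so that an unsupported value fails early with a clear message.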

