PDF files that contain structured text (also known as marked content or tagged PDFs) can be processed by JPedal and converted into the ePUB file format.
To convert a tagged PDF to ePUB:
final String password = null; // null when no password is required
final ErrorTracker tracker = null; // an ErrorTracker implementation can be supplied to monitor extraction
final ExtractStructuredTextProperties properties = new ExtractStructuredTextProperties();
properties.setFileOutputMode(OutputModes.EPUB);
properties.setEpubTitle("My EPUB");
// Write the structured text only
ExtractStructuredText.writeAllStructuredTextOutlinesToDir("inputFileOrFolder", password, "outputFolder", tracker, properties);
// Write the structured text and figures; the last argument is the image format (JPEG or PNG only)
ExtractStructuredText.writeAllStructuredTextOutlinesAndFiguresToDir("inputFileOrFolder", password, "outputFolder", tracker, properties, "figuresFolder", "imageFormatJpegOrPngOnly");
ePUB 3.0 is the only supported version.
JPEG and PNG are the only supported image file formats for ePUB.
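Because only JPEG and PNG are accepted for figures, it can help to check the image format string before starting a conversion. The following is a minimal sketch, not part of JPedal: the class EpubImageFormats and its methods are hypothetical helpers, and the accepted spellings ("jpeg", "jpg", "png", case-insensitive) are an assumption, since the exact string expected by writeAllStructuredTextOutlinesAndFiguresToDir is not stated here.

```java
import java.util.Locale;
import java.util.Set;

// Hypothetical helper (not a JPedal class): guards the image format argument
// so that only JPEG or PNG is passed on to the figure extraction call.
final class EpubImageFormats {

    // Assumed spellings; adjust to whatever string your JPedal version expects.
    private static final Set<String> SUPPORTED = Set.of("jpeg", "jpg", "png");

    private EpubImageFormats() {
        // static helpers only
    }

    // Returns true if the format names JPEG or PNG, ignoring case and surrounding spaces.
    static boolean isSupported(final String format) {
        return format != null
                && SUPPORTED.contains(format.trim().toLowerCase(Locale.ROOT));
    }

    // Returns the format unchanged, or throws if it is not JPEG or PNG.
    static String requireSupported(final String format) {
        if (!isSupported(format)) {
            throw new IllegalArgumentException(
                    "Unsupported ePUB image format: " + format);
        }
        return format;
    }
}
```

The result of EpubImageFormats.requireSupported(...) could then be passed as the final argument of the figures call, so that an unsupported format fails before any files are written.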