Extract Unclipped Images from PDF
JPedal provides several ways to extract the unclipped images that appear within PDF files, either in their raw form or scaled and transformed.
Extract Unclipped Images from PDF from the Command Line or Another Language
java -jar jpedal.jar --extractImages "inputFileOrDir" "outputDir" "outputImageFormat"
This command extracts every image from the PDF file, or from each PDF in the directory, specified by “inputFileOrDir”. The images are written to the directory specified by “outputDir”, in the format specified by “outputImageFormat”.
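Because the extractor is an ordinary command-line call, any language that can start a process can drive it. The following is a minimal sketch in Java using the standard ProcessBuilder class. The class name ExtractImagesCommand and its methods are hypothetical helpers written for this illustration, not part of JPedal, and the jar path and arguments are placeholders to replace with your own values.

```java
import java.io.IOException;
import java.util.List;

// Hypothetical helper (not part of JPedal): builds and runs the
// command line shown above from another program.
class ExtractImagesCommand {

    // Assemble the arguments in the same order as the documented command.
    static List<String> build(String jarPath, String inputFileOrDir,
                              String outputDir, String outputImageFormat) {
        return List.of("java", "-jar", jarPath, "--extractImages",
                inputFileOrDir, outputDir, outputImageFormat);
    }

    // Launch the command, forward its output to this process's console,
    // wait for it to finish, and return its exit code.
    static int run(String jarPath, String inputFileOrDir,
                   String outputDir, String outputImageFormat)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(
                build(jarPath, inputFileOrDir, outputDir, outputImageFormat))
                .inheritIO()
                .start();
        return process.waitFor();
    }
}
```

Passing each argument as a separate list element avoids shell quoting problems when a path contains spaces. The same pattern carries over to the process-launching facilities of other languages.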
Extract Unclipped Images from PDF in Java
Static Convenience Method
ExtractImages.writeAllImagesToDir(
"inputFileOrDirectory",
"outputDir",
"outputImageFormat",
true,
false);
API Access Methods
import java.awt.image.BufferedImage;

ExtractImages extract = new ExtractImages("C:/pdfs/mypdf.pdf");
//extract.setPassword("password");
if (extract.openPDFFile()) {
    int pageCount = extract.getPageCount();
    // Pages are numbered from 1; images on a page are indexed from 0
    for (int page = 1; page <= pageCount; page++) {
        int imagesOnPageCount = extract.getImageCount(page);
        for (int imageIndex = 0; imageIndex < imagesOnPageCount; imageIndex++) {
            // The loop counter needs its own name so it does not clash with the image variable
            BufferedImage image = extract.getImage(page, imageIndex, true);
        }
    }
}
extract.closePDFfile();
This example uses the JPedal ExtractImages class. ExtractImages can output images in several different image formats including BMP, PNG, JPG, JPG2000 and TIFF.
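The loop above retrieves each image as a BufferedImage but does not store it anywhere. One way to save the images is the standard javax.imageio.ImageIO class, as in the sketch below. The class ExtractedImageWriter, its save method, and the page/index file-naming scheme are assumptions made for this illustration and are not part of JPedal. The formats available through ImageIO depend on the writers installed in your Java runtime.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import javax.imageio.ImageIO;

// Hypothetical helper (not part of JPedal): saves one extracted image
// to disk under a name derived from its page number and image index.
class ExtractedImageWriter {

    static File save(BufferedImage image, File outputDir,
                     int page, int imageIndex, String format) throws IOException {
        // Create the output directory if it does not exist yet.
        if (!outputDir.isDirectory() && !outputDir.mkdirs()) {
            throw new IOException("Could not create " + outputDir);
        }
        File target = new File(outputDir,
                String.format("page%d_image%d.%s", page, imageIndex, format));
        // ImageIO.write returns false when no installed writer can handle
        // this image in the requested format (for example, some runtimes
        // cannot write images with an alpha channel as JPG).
        if (!ImageIO.write(image, format, target)) {
            throw new IOException("No writer for format '" + format + "'");
        }
        return target;
    }

    // Demonstration with a blank image standing in for an extracted one.
    public static void main(String[] args) throws IOException {
        BufferedImage sample = new BufferedImage(10, 10, BufferedImage.TYPE_INT_RGB);
        File dir = Files.createTempDirectory("extracted").toFile();
        File saved = save(sample, dir, 1, 0, "png");
        System.out.println("Wrote " + saved.getPath());
    }
}
```

Inside the inner loop of the API example, a call such as ExtractedImageWriter.save(image, new File("outputDir"), page, imageIndex, "png") would write each image as it is extracted.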