Extract Unclipped Images from PDF
JPedal provides several methods for extracting images from a PDF document. This example extracts the raw images used on the page, before they are clipped, rotated, and scaled for display. If you want the images as they appear on the page, use ExtractClippedImages instead.
Extract Unclipped Images from PDF with the Command-Line or another language
java -jar jpedal.jar --extractImages "inputFileOrDir" "outputDir" "outputImageFormat"
This command extracts every image from the PDF, or from each PDF in the directory, specified by “inputFileOrDir”. The output is written to the directory specified by “outputDir”, in the image format specified by “outputImageFormat”.
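When the command is launched from another program, each argument is best passed as a separate list element so that paths containing spaces need no manual quoting. The following is a minimal sketch in Java, not part of JPedal: the class name ExtractImagesCommand, the helper buildCommand, the output path "C:/output" and the format value "png" are illustrative assumptions. Only the flag and argument order come from the command above.

```java
import java.util.ArrayList;
import java.util.List;

public class ExtractImagesCommand {

    // Builds the argument list for the documented --extractImages command.
    // Each argument is its own list element, so no shell quoting is needed.
    static List<String> buildCommand(String jarPath, String inputFileOrDir,
                                     String outputDir, String outputImageFormat) {
        List<String> command = new ArrayList<>();
        command.add("java");
        command.add("-jar");
        command.add(jarPath);
        command.add("--extractImages");
        command.add(inputFileOrDir);
        command.add(outputDir);
        command.add(outputImageFormat);
        return command;
    }

    // Runs the command and returns its exit code. Requires jpedal.jar at jarPath.
    static int run(List<String> command) throws Exception {
        return new ProcessBuilder(command).inheritIO().start().waitFor();
    }

    public static void main(String[] args) {
        // Hypothetical paths and format value, for illustration only.
        List<String> command =
                buildCommand("jpedal.jar", "C:/pdfs/mypdf.pdf", "C:/output", "png");
        // Print the command rather than running it, so the sketch works without the jar.
        System.out.println(String.join(" ", command));
    }
}
```

Calling run(command) instead of printing would start the extraction, assuming jpedal.jar is present at the given path.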
Extract Unclipped Images from PDF in Java
Static Convenience Method
ExtractImages.writeAllImagesToDir(
        "inputFileOrDirectory",
        "outputDir",
        "outputImageFormat",
        true,
        false);
API Access Methods
import java.awt.image.BufferedImage;

ExtractImages extract = new ExtractImages("C:/pdfs/mypdf.pdf");
//extract.setPassword("password");
if (extract.openPDFFile()) {
    int pageCount = extract.getPageCount();
    // Pages are numbered from 1; images on a page are indexed from 0
    for (int page = 1; page <= pageCount; page++) {
        int imagesOnPageCount = extract.getImageCount(page);
        for (int imageIndex = 0; imageIndex < imagesOnPageCount; imageIndex++) {
            // The loop index needs a different name from the returned image
            BufferedImage image = extract.getImage(page, imageIndex, true);
        }
    }
}
extract.closePDFfile();
This example uses the JPedal ExtractImages class. ExtractImages can output images in several image formats, including BMP, PNG, JPG, JPEG 2000 and TIFF.
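The API access loop returns each image as a BufferedImage but does not write it anywhere. One way to save each image is the standard javax.imageio.ImageIO class, sketched below. This is plain Java rather than a JPedal API: the class SaveExtractedImage, the helper save and the file naming scheme are assumptions made for illustration. ImageIO's built-in writers may not cover every format listed above (JPEG 2000, for example, usually needs an extra plugin), so the sketch reports an error when no writer is found.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

public class SaveExtractedImage {

    // Writes one image to outputDir as page-<page>-image-<index>.<format>.
    static File save(BufferedImage image, File outputDir, int page, int index,
                     String format) throws IOException {
        if (!outputDir.isDirectory() && !outputDir.mkdirs()) {
            throw new IOException("Cannot create directory: " + outputDir);
        }
        File target = new File(outputDir,
                "page-" + page + "-image-" + index + "." + format);
        // ImageIO.write returns false when no writer is registered for the format.
        if (!ImageIO.write(image, format, target)) {
            throw new IOException("No ImageIO writer for format: " + format);
        }
        return target;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for an image returned by extract.getImage(...).
        BufferedImage sample = new BufferedImage(4, 4, BufferedImage.TYPE_INT_RGB);
        File outputDir = new File(System.getProperty("java.io.tmpdir"), "extracted-images");
        File written = save(sample, outputDir, 1, 0, "png");
        System.out.println("Wrote " + written.getAbsolutePath());
    }
}
```

Inside the inner loop of the API example, a call such as SaveExtractedImage.save(image, new File("outputDir"), page, imageIndex, "png") would write each extracted image to disk, where imageIndex stands for the loop's image counter.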