Extract Clipped Images from PDF

JPedal provides several methods for extracting the clipped images that appear within PDF files.

Extract Clipped Images from PDF with the Command-Line or another language

java -jar jpedal.jar --extractClippedImages inputFileOrDir outputDir outputFormat outputHeight subDirectory [outputHeight subDirectory]...

The final two arguments specify the height at which the images are extracted (-1 extracts each image at its original size) and the sub-directory in which images of that height are stored. This pair can be repeated to produce multiple sizes in one run. The example below produces PNG output at the original size in the directory outputDir/raw, at 100px height in outputDir/medium, and at 50px height in outputDir/thumbnail, assuming png is passed as outputFormat.

java -jar jpedal.jar --extractClippedImages inputFileOrDir outputDir outputFormat -1 raw 100 medium 50 thumbnail

Extract Clipped Images from PDF in Java

Static Convenience Method

ExtractClippedImages.writeAllClippedImagesToDir("inputFileOrDirectory", "outputDir",
        "outputImageFormat", new String[]{"imageHeightAsFloat", "subDirectoryForHeight"});

API Access Methods

import java.awt.image.BufferedImage;

ExtractClippedImages extract = new ExtractClippedImages("C:/pdfs/mypdf.pdf");
//extract.setPassword("password");
if (extract.openPDFFile()) {
    int pageCount = extract.getPageCount();
    // Pages are numbered from 1; images on each page are indexed from 0
    for (int page = 1; page <= pageCount; page++) {
        int imagesOnPageCount = extract.getImageCount(page);
        for (int image = 0; image < imagesOnPageCount; image++) {
            // Named clippedImage so it does not clash with the loop index
            BufferedImage clippedImage = extract.getClippedImage(page, image);
        }
    }
}

extract.closePDFfile();

Both examples use the JPedal ExtractClippedImages class. In the static convenience method, the final argument is a String array holding a series of height and sub-directory pairs, in the same order as the outputHeight and subDirectory arguments used on the command line.
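To make the pairing concrete, here is a minimal sketch of how such an array could be assembled. SizePairs and toPairs are hypothetical names used only for illustration and are not part of JPedal. The height strings mirror the values from the command-line example, on the assumption that the Java method accepts the same values.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not part of JPedal): flattens height -> sub-directory
// entries into an alternating array of height, sub-directory, height, ...
final class SizePairs {

    static String[] toPairs(Map<String, String> heightToSubDir) {
        List<String> pairs = new ArrayList<>();
        for (Map.Entry<String, String> entry : heightToSubDir.entrySet()) {
            pairs.add(entry.getKey());   // height; "-1" keeps the original size
            pairs.add(entry.getValue()); // sub-directory for that height
        }
        return pairs.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps insertion order, so the pairs stay in sequence
        Map<String, String> sizes = new LinkedHashMap<>();
        sizes.put("-1", "raw");
        sizes.put("100", "medium");
        sizes.put("50", "thumbnail");

        System.out.println(String.join(" ", toPairs(sizes)));
        // prints: -1 raw 100 medium 50 thumbnail
    }
}
```

The resulting array could then be passed as the last argument of writeAllClippedImagesToDir in place of the literal String array.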

ExtractClippedImages can output images in several different image formats including BMP, PNG, JPG and TIFF.
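When using the API access methods rather than the convenience method, the BufferedImage returned by getClippedImage still has to be written to disk by the caller. One possible approach, sketched below, uses the standard javax.imageio.ImageIO class. SaveClippedImage, its save method, and the file naming scheme are hypothetical names chosen for this illustration, and a blank image stands in for a real extracted one so the sketch runs without JPedal.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

// Hypothetical helper (not part of JPedal): saves one image as a PNG file
// named after its page number and image index.
final class SaveClippedImage {

    static File save(BufferedImage img, File outputDir, int page, int index) throws IOException {
        if (!outputDir.isDirectory() && !outputDir.mkdirs()) {
            throw new IOException("Cannot create directory " + outputDir);
        }
        File target = new File(outputDir, "page" + page + "_img" + index + ".png");
        // ImageIO.write returns false when no suitable writer is found
        if (!ImageIO.write(img, "png", target)) {
            throw new IOException("No PNG writer available for " + target);
        }
        return target;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the image returned by extract.getClippedImage(page, image)
        BufferedImage sample = new BufferedImage(20, 10, BufferedImage.TYPE_INT_ARGB);
        File outputDir = new File(System.getProperty("java.io.tmpdir"), "clipped");
        File written = save(sample, outputDir, 1, 0);
        System.out.println("Wrote " + written.getAbsolutePath());
    }
}
```

Inside the nested loop of the API example, a call such as save(clippedImage, outputDir, page, image) would write each image as it is extracted.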