Extract Clipped Images from PDF

JPedal provides several methods for extracting clipped images from PDF files. Clipped images are the images as they appear on the page. A PDF can contain multiple images, each of which may be clipped, rotated, and scaled for display. This example returns the images with these transformations applied. To get the raw images instead, use ExtractImages.

Extract Clipped Images from PDF with the Command-Line or another language

java -jar jpedal.jar --extractClippedImages inputFileOrFolder outputFolder outputFormat outputSize subFolder [outputSize subFolder]...

The final two arguments control the size at which the images are extracted. outputSize is either a height or a scale factor, and subFolder is the subdirectory that images of the preceding size are saved into. If outputSize is a number, images are resized to that height, or kept at the size they appear on the page if that is smaller. A value of -1 outputs each image at the size it appears on the page. If outputSize is an x followed by a number (for example, x1), the image is scaled by that factor. Repeating the outputSize and subFolder pair creates multiple sizes from a single command.

java -jar jpedal.jar --extractClippedImages inputFileOrFolder outputFolder outputFormat -1 raw 100 medium 50 thumbnail x2 large
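The outputSize rules can be easier to follow as code. The sketch below is not JPedal code: the class name OutputSizeSketch and the method resolveHeight are hypothetical, and the rounding behaviour is an assumption. It only mirrors the three cases described above (a plain number, -1, and an x-prefixed scale) for an image of a given on-page height.

```java
final class OutputSizeSketch {

    /**
     * Returns the output height in pixels for an image that is
     * onPageHeight pixels tall on the page (hypothetical helper,
     * not part of the JPedal API).
     */
    static int resolveHeight(String outputSize, int onPageHeight) {
        if (outputSize.startsWith("x")) {
            // "x2" style: scale the on-page size by the given factor
            float scale = Float.parseFloat(outputSize.substring(1));
            return Math.round(onPageHeight * scale);
        }
        float requested = Float.parseFloat(outputSize);
        if (requested < 0) {
            // -1: keep the size as it appears on the page
            return onPageHeight;
        }
        // Plain number: the requested height, or the on-page size if that is smaller
        return Math.min(Math.round(requested), onPageHeight);
    }

    public static void main(String[] args) {
        int onPageHeight = 80; // example value
        for (String size : new String[] {"-1", "100", "50", "x2"}) {
            System.out.println(size + " -> " + resolveHeight(size, onPageHeight));
        }
    }
}
```

For an image that is 80 pixels tall on the page, the sizes from the command above resolve to 80 (raw), 80 (medium, because 100 is larger than the on-page size), 50 (thumbnail), and 160 (large).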

Extract Clipped Images from PDF in Java

Static Convenience Method

ExtractClippedImages.writeAllClippedImagesToDir(
        "inputFileOrFolder",
        "outputFolder",
        "outputImageFormat",
        new String[] {"imageHeightAsFloat", "subFolderForHeight"});

API Access Methods

// Also requires: import java.awt.image.BufferedImage;
ExtractClippedImages extract = new ExtractClippedImages("inputFile.pdf");
//extract.setPassword("password");
if (extract.openPDFFile()) {
    int pageCount = extract.getPageCount();
    // Pages are numbered from 1
    for (int page = 1; page <= pageCount; page++) {

        int imagesOnPageCount = extract.getImageCount(page);
        // Images on a page are indexed from 0
        for (int imageIndex = 0; imageIndex < imagesOnPageCount; imageIndex++) {
            BufferedImage clippedImage = extract.getClippedImage(page, imageIndex);
        }
    }
}

extract.closePDFfile();

Both Java examples use the JPedal ExtractClippedImages class. In the static convenience method, the final argument is a String array that specifies a series of output size and sub-directory pairs, in a similar manner to the outputSize / subFolder arguments used on the command line.
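As an illustration of the pair layout, the sketch below builds such an array from an ordered map of sub-directory names to heights. The class SizePairsSketch and the method toSizeArgs are hypothetical helpers, not part of JPedal, and only numeric heights are used here because the placeholder in the convenience method above names the value as a float height. The JPedal call is left as a comment so that the sketch runs without the library.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class SizePairsSketch {

    /** Flattens subFolder -> outputSize entries into {size, folder, size, folder, ...}. */
    static String[] toSizeArgs(Map<String, String> sizeBySubFolder) {
        List<String> args = new ArrayList<>();
        for (Map.Entry<String, String> entry : sizeBySubFolder.entrySet()) {
            args.add(entry.getValue()); // the output size comes first
            args.add(entry.getKey());   // then the sub-directory it is saved into
        }
        return args.toArray(new String[0]);
    }

    public static void main(String[] args) {
        Map<String, String> sizes = new LinkedHashMap<>(); // keeps insertion order
        sizes.put("raw", "-1");
        sizes.put("medium", "100");
        sizes.put("thumbnail", "50");

        String[] sizeArgs = toSizeArgs(sizes);
        System.out.println(String.join(" ", sizeArgs)); // prints: -1 raw 100 medium 50 thumbnail

        // With JPedal on the classpath, the array would be the final argument
        // ("input.pdf", "outputFolder" and "png" are placeholder values):
        // ExtractClippedImages.writeAllClippedImagesToDir("input.pdf", "outputFolder", "png", sizeArgs);
    }
}
```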

ExtractClippedImages can output images in several different image formats including BMP, PNG, JPG and TIFF.
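When the API access methods are used, each clipped image comes back as a BufferedImage, so writing it to disk is left to the caller. One option, sketched below, is the standard javax.imageio.ImageIO class. The class SaveClippedImageSketch, the save method, and the file naming scheme are assumptions made for illustration, and a blank placeholder image stands in for the result of getClippedImage so that the sketch runs without JPedal.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

final class SaveClippedImageSketch {

    /** Writes one image as PNG, named after its page and index (naming is an assumption). */
    static File save(BufferedImage image, File outputDir, int page, int index) throws IOException {
        if (!outputDir.isDirectory() && !outputDir.mkdirs()) {
            throw new IOException("Cannot create " + outputDir);
        }
        File target = new File(outputDir, "page" + page + "_image" + index + ".png");
        // ImageIO.write returns false when no writer exists for the format name
        if (!ImageIO.write(image, "png", target)) {
            throw new IOException("No PNG writer available");
        }
        return target;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for an image returned by extract.getClippedImage(page, imageIndex)
        BufferedImage placeholder = new BufferedImage(10, 10, BufferedImage.TYPE_INT_ARGB);
        File written = save(placeholder, new File("clipped-out"), 1, 0);
        System.out.println("Wrote " + written.getPath());
    }
}
```

PNG is used here because it supports transparency, which an ARGB image may carry. Writing JPG through ImageIO generally needs an image type without an alpha channel.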

