
Extract Images from PDF using NodeJS

Table of contents

  1. Introduction
  2. How to extract images from PDFs using NodeJS
  3. Complete Code Example
  4. Configuration Options
  5. Upload by URL
  6. Using Authentication
  7. Further details

Introduction

This tutorial shows you how to extract images from PDFs using a hosted JPedal cloud API.

Whilst the API can be accessed with plain HTTP requests, this tutorial uses our open source NodeJS IDRCloudClient, which provides a simple NodeJS wrapper around the REST API.

How to extract images from PDFs using NodeJS

  1. Using npm, install the idrcloudclient package with the following command:
    npm install --save @idrsolutions/idrcloudclient  
    
  2. Load the idrcloudclient module:
    var idrcloudclient = require('@idrsolutions/idrcloudclient');
    
  3. Create the endpoint variable:
    var endpoint = 'https://cloud.idrsolutions.com/cloud/' + idrcloudclient.JPEDAL;
    
  4. Create the parameters object to upload a file:
    var parameters =  {
        //token: 'Token', //Required only when connecting to the IDRsolutions trial and cloud subscription service
        input: idrcloudclient.UPLOAD,
        file: 'path/to/exampleFile.pdf',
        settings: '{"mode":"extractImages","type":"rawImages","format":"png"}'
    }
    
  5. [Optional] Create listeners to trigger on progress, success, and failure.
    function progressListener(e) {
        console.log(JSON.stringify(e));
    }
    
    function failureListener(e) {
        console.log(e);
        console.log('Failed!');
    }
    
    function successListener(e) {
        console.log(JSON.stringify(e));
        console.log('Download URL: ' + e.downloadUrl);
    }
    
  6. Call the convert method with the variables created previously:
    idrcloudclient.convert({
        endpoint: endpoint,
        parameters: parameters,
    
        // The listeners below are optional; omit any you haven't defined
        progress: progressListener,
        success: successListener,
        failure: failureListener
    });
    

Complete Code Example

Here is a complete code example that extracts images from PDFs, based on the steps above. Configuration options and advanced features are covered in the following sections.

var idrcloudclient = require('@idrsolutions/idrcloudclient');

function progressListener(e) {
    console.log(JSON.stringify(e));
}

function failureListener(e) {
    console.log(e);
    console.log('Failed!');
}

function successListener(e) {
    console.log(JSON.stringify(e));
    console.log('Download URL: ' + e.downloadUrl);
}

var endpoint = 'https://cloud.idrsolutions.com/cloud/' + idrcloudclient.JPEDAL;
var parameters =  {
    //token: 'Token', //Required only when connecting to the IDRsolutions trial and cloud subscription service
    input: idrcloudclient.UPLOAD,
    file: 'path/to/exampleFile.pdf',
    settings: '{"mode":"extractImages","type":"rawImages","format":"png"}'
};

idrcloudclient.convert({
    endpoint: endpoint,
    parameters: parameters,
    
    // The below are the available listeners
    progress: progressListener,
    success: successListener,
    failure: failureListener
});
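
The callback-style call above can also be wrapped in a Promise when that suits the calling code better. The following is a minimal sketch, not part of the IDRCloudClient API: convertAsync is a hypothetical helper name, and it assumes only the convert method and the progress, success and failure listeners shown above. The client is passed in as an argument so the helper has no dependencies of its own.

```javascript
// Hypothetical helper: wraps the callback-style convert call in a Promise.
// `client` is assumed to be the object returned by
// require('@idrsolutions/idrcloudclient').
function convertAsync(client, endpoint, parameters, onProgress) {
    return new Promise(function (resolve, reject) {
        client.convert({
            endpoint: endpoint,
            parameters: parameters,
            // Fall back to a no-op when no progress listener is supplied
            progress: onProgress || function () {},
            // The success listener resolves the Promise with the response
            success: resolve,
            // The failure listener rejects it with whatever the client reports
            failure: reject
        });
    });
}

// Example usage (needs the npm package and a reachable endpoint):
// var idrcloudclient = require('@idrsolutions/idrcloudclient');
// convertAsync(idrcloudclient, endpoint, parameters)
//     .then(function (e) { console.log('Download URL: ' + e.downloadUrl); })
//     .catch(function (e) { console.log('Failed!', e); });
```

The listeners are the same ones used in the complete example, so the behaviour of the conversion itself should be unchanged; only the way the result is delivered differs.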

Configuration Options

The JPedal API accepts a stringified JSON object containing key-value pair configuration options to customise your extraction. The settings should be added to the parameters object. A full list of the configuration options to extract images from PDFs can be found here.

settings: '{"key":"value","key":"value"}'
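
Writing the JSON string by hand is easy to get wrong, so one option is to build the settings as a plain object and serialise it with JSON.stringify. The sketch below uses the same three options as the earlier steps; the variable names settings and settingsString are illustrative only.

```javascript
// Build the settings as a plain object, then serialise it once.
var settings = {
    mode: 'extractImages',
    type: 'rawImages',
    format: 'png'
};

// This string is what goes in the `settings` entry of the parameters object.
var settingsString = JSON.stringify(settings);

console.log(settingsString);
// prints {"mode":"extractImages","type":"rawImages","format":"png"}
```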

Upload by URL

As well as uploading a local file, you can provide a URL from which the JPedal Microservice will download the file before performing the extraction. To do this, replace the input and file values in the parameters variable with the following.

input: idrcloudclient.DOWNLOAD,
url: 'http://exampleURL/exampleFile.pdf'
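
For context, a full parameters object for the URL case might look like the sketch below. buildDownloadParameters is a hypothetical helper, not part of the client; it assumes the loaded module exposes the DOWNLOAD constant shown above and reuses the settings string from the earlier steps.

```javascript
// Hypothetical helper: builds the parameters object for a URL-based extraction.
// `client` is assumed to be the object returned by
// require('@idrsolutions/idrcloudclient').
function buildDownloadParameters(client, url) {
    return {
        // DOWNLOAD replaces UPLOAD, and `url` replaces `file`
        input: client.DOWNLOAD,
        url: url,
        settings: '{"mode":"extractImages","type":"rawImages","format":"png"}'
    };
}

// Example usage:
// var idrcloudclient = require('@idrsolutions/idrcloudclient');
// var parameters = buildDownloadParameters(
//     idrcloudclient, 'http://exampleURL/exampleFile.pdf');
```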

Using Authentication

If you have deployed your own JPedal Microservice that requires a username and password to extract images from PDFs, you will need to provide them with each conversion. These are provided by passing two values named username and password to the convert method as shown below.

username: 'username',
password: 'password',
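
As a rough illustration of where these two entries sit, the sketch below assembles the object passed to convert. It assumes username and password go alongside endpoint and parameters at the top level of that object; buildAuthenticatedOptions is a hypothetical helper name, and the listener bodies mirror those used earlier in this tutorial.

```javascript
// Hypothetical helper: assembles the options for an authenticated conversion.
// Assumption: username and password sit next to endpoint and parameters.
function buildAuthenticatedOptions(endpoint, parameters, username, password) {
    return {
        endpoint: endpoint,
        parameters: parameters,
        username: username,
        password: password,
        success: function (e) {
            console.log('Download URL: ' + e.downloadUrl);
        },
        failure: function (e) {
            console.log(e);
            console.log('Failed!');
        }
    };
}

// Example usage:
// var idrcloudclient = require('@idrsolutions/idrcloudclient');
// idrcloudclient.convert(
//     buildAuthenticatedOptions(endpoint, parameters, 'username', 'password'));
```

Reading the credentials from environment variables or a configuration file, rather than hard-coding them, keeps them out of source control.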

Further details

IDRCloudClient on GitHub
IDRCloudClient on NPM
JPedal Microservice API
JPedal Microservice Use
