Extract Images from PDF using Ruby

Table of contents

  1. Introduction
  2. Prerequisites
  3. Code Example
  4. Configuration Options
  5. Upload by URL
  6. Using Authentication
  7. Further details

Introduction

The following tutorial shows you how to extract images from PDFs using a hosted JPedal cloud API, such as the IDRsolutions trial and cloud subscription service or your own self-hosted JPedal Microservice.

Whilst all the above services can be accessed with plain old HTTP requests, this tutorial uses our open-source Ruby IDRCloudClient, which provides a simple Ruby wrapper around the REST API.

Prerequisites

There are two approaches to using the IDRCloudClient in your project.

Using gem, install the idr_cloud_client gem with the following command:

gem install idr_cloud_client

Alternatively, you can add the line gem 'idr_cloud_client' to your application's Gemfile, then run the following command:

bundle install

Code Example

Here is a basic code example to extract images from PDFs. Configuration options and advanced features can be found below.

require 'idr_cloud_client'

client = IDRCloudClient.new('https://cloud.idrsolutions.com/cloud/' + IDRCloudClient::JPEDAL)

result = client.convert(
    # token: 'Token', # Required only when connecting to the IDRsolutions trial and cloud subscription service
    input: IDRCloudClient::UPLOAD,
    file: 'path/to/exampleFile.pdf',
    settings: '{"mode":"extractImages","type":"rawImages","format":"png"}')

client.download_result(result, 'path/to/output/dir')

puts 'Download URL: ' + result['downloadUrl']

Configuration Options

The JPedal API accepts a stringified JSON object of key-value pairs as configuration options to customise your extraction. Pass these settings to the convert method. A full list of the configuration options to extract images from PDFs can be found here.

settings: '{"key":"value","key":"value"}'
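Writing the JSON string by hand is easy to get wrong once there are several options. As a minimal sketch, the same string can be generated from a Ruby Hash with the standard json library. The keys and values here are simply those used in the code example above; nothing else is assumed about the available options.

```ruby
require 'json'

# Build the settings as a Ruby Hash, then serialise it to the JSON
# string that the convert method expects.
settings = {
  'mode'   => 'extractImages',
  'type'   => 'rawImages',
  'format' => 'png'
}.to_json

puts settings
```

The resulting string can then be passed as the settings value in the convert call.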

Upload by URL

As well as uploading a local file, you can provide a URL from which the JPedal Microservice will download the file before performing the extraction. To do this, replace the input and file values in the convert method with the following:

input: IDRCloudClient::DOWNLOAD,
url: 'http://exampleURL/exampleFile.pdf'
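If the same script has to handle both local files and URLs, the choice between the two input modes could be made automatically. The following is only a sketch: remote_source? and convert_arguments are hypothetical helpers, not part of the IDRCloudClient gem, and the upload and download mode values are passed in by the caller so that the snippet runs without the gem installed.

```ruby
require 'uri'

# Hypothetical helper: true when the source parses as an http or https URL.
def remote_source?(source)
  # URI::HTTPS is a subclass of URI::HTTP, so this covers both schemes.
  URI.parse(source).is_a?(URI::HTTP)
rescue URI::InvalidURIError
  # Strings that are not valid URIs (e.g. paths with spaces) are treated as local files.
  false
end

# Hypothetical helper: builds the input-related arguments for a convert call.
# upload and download are the mode values supplied by the caller.
def convert_arguments(source, upload:, download:)
  if remote_source?(source)
    { input: download, url: source }
  else
    { input: upload, file: source }
  end
end

# With the gem loaded, this might be used as follows (assumed usage):
#   args = convert_arguments(source,
#                            upload: IDRCloudClient::UPLOAD,
#                            download: IDRCloudClient::DOWNLOAD)
#   result = client.convert(**args, settings: settings)
```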

Using Authentication

If you have deployed your own JPedal Microservice that requires a username and password to extract images from PDFs, you will need to provide them with each conversion. These are provided by passing a variable named auth to the convert method as shown below.

auth: ['username', 'password']
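Hard-coding credentials in source files is best avoided. One possible approach, sketched below, is to read them from environment variables. The variable names JPEDAL_USERNAME and JPEDAL_PASSWORD and the credentials_from helper are hypothetical, and the sketch assumes the client accepts the username and password as a two-element array.

```ruby
# Hypothetical helper: returns [username, password] read from the given
# environment (ENV by default), or nil when either value is missing.
def credentials_from(env = ENV)
  username = env['JPEDAL_USERNAME']
  password = env['JPEDAL_PASSWORD']
  return nil if username.nil? || password.nil?

  [username, password]
end

auth = credentials_from
puts(auth ? 'Credentials found' : 'No credentials set')
```

The returned pair could then be supplied as the auth value, with the nil case handled before attempting a conversion.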

Further details

IDRCloudClient on GitHub
IDRCloudClient on RUBY
JPedal Microservice API
JPedal Microservice Use
