Extract Text from PDF using C#

Table of contents

  1. Introduction
  2. Prerequisites
  3. Code Example
  4. Configuration Options
  5. Upload by URL
  6. Using Authentication
  7. Further details


Introduction

The following tutorial shows you how to extract text from PDFs using a hosted JPedal cloud API, such as the IDRsolutions trial or cloud subscription service, or your own self-hosted JPedal Microservice.

Whilst these services can be accessed with plain HTTP requests, this tutorial uses our open source C# IDRCloudClient, which provides a simple C# wrapper around the REST API.


Prerequisites

Using NuGet, install the idrsolutions-csharp-client package with the following command:

nuget install idrsolutions-csharp-client

Code Example

Here is a basic code example to extract text from PDFs. Configuration options and advanced features can be found below.

using System;
using System.Collections.Generic;
using idrsolutions_csharp_client; // C# namespaces cannot contain hyphens

class ExampleUsage
{
    static void Main(string[] args)
    {
        // Prepend the base URL of your service to the endpoint name
        var client = new IDRCloudClient("" + IDRCloudClient.JPEDAL);

        try
        {
            Dictionary<string, string> parameters = new Dictionary<string, string>
            {
                //["token"] = "Token", //Required only when connecting to the IDRsolutions trial and cloud subscription service
                ["input"] = IDRCloudClient.UPLOAD,
                ["file"] = "path/to/input.pdf",
                ["settings"] = "{\"mode\":\"extractText\",\"type\":\"plainText\"}"
            };

            // Upload the file and wait for the extraction to finish
            Dictionary<string, string> results = client.Convert(parameters);

            String outputUrl = results.GetValueOrDefault("downloadUrl", "No download URL provided");
            client.DownloadResult(results, "path/to/output/dir");

            Console.WriteLine("Converted: " + outputUrl);
        }
        catch (Exception e)
        {
            Console.WriteLine("Extraction failed: " + e.Message);
        }
    }
}

Configuration Options

The JPedal API accepts a stringified JSON object containing key-value pair configuration options to customise your extraction. Add the settings to the parameters dictionary. A full list of the configuration options to extract text from PDFs can be found here.

["settings"] = "{\"key\":\"value\",\"key\":\"value\"}"
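Hand-escaping the quotes in the settings string is easy to get wrong, so it can help to build it from a dictionary. The following is a minimal sketch rather than part of IDRCloudClient: the ExtractionSettings class and its ToJson method are hypothetical names introduced here, and it assumes every option value is a string, as in the example above. It relies only on System.Text.Json, which ships with .NET Core 3.0 and later (on .NET Framework it is available as a separate NuGet package).

```csharp
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical helper (not part of IDRCloudClient): builds the stringified
// JSON object expected by the "settings" parameter.
static class ExtractionSettings
{
    // Serialises string key-value options into a JSON object string,
    // so quotes and special characters are escaped by the serialiser.
    public static string ToJson(Dictionary<string, string> options)
    {
        return JsonSerializer.Serialize(options);
    }
}
```

The result can then be assigned in the parameters dictionary, for example ["settings"] = ExtractionSettings.ToJson(new Dictionary<string, string> { ["mode"] = "extractText", ["type"] = "plainText" }).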

Upload by URL

As well as uploading a local file, you can provide a URL from which the JPedal Microservice will download the file before performing the extraction. To do this, replace the input and file values in the parameters variable with the following.

["input"] = IDRCloudClient.DOWNLOAD,
["url"] = "http://exampleURL/exampleFile.pdf"

Using Authentication

If you have deployed your own JPedal Microservice that requires a username and password to extract text from PDFs, you will need to provide them with each conversion. To do so, pass the username and password to the IDRCloudClient constructor as shown below.

var client = new IDRCloudClient("" + IDRCloudClient.JPEDAL, "username", "password");

Further details

IDRCloudClient on GitHub
IDRCloudClient on Nuget
JPedal Microservice API
JPedal Microservice Use

Have more questions? Ask us on Discord