
How to alter the file encoding for Search and Extraction?

Java can read and write text in many different character encodings. In some cases this means the text you see on the page does not match the text you get once it has been extracted. The most common cause is that the content is read as one encoding but is treated as another somewhere else.
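As a rough illustration of that mismatch, the sketch below encodes a string as UTF-8 and then decodes the same bytes as ISO-8859-1. The class and method names are hypothetical and it uses only the standard Java library, not the JPedal API.

```java
import java.nio.charset.StandardCharsets;

public class EncodingMismatchDemo {

    // Encodes the text as UTF-8, then (incorrectly) decodes those bytes as ISO-8859-1.
    public static String writeUtf8ReadLatin1(String text) {
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // "caf\u00e9" is "cafe" with an acute accent on the e; it is escaped here
        // so the example does not depend on the encoding of the source file.
        String original = "caf\u00e9";
        String garbled = writeUtf8ReadLatin1(original);

        // The accented letter occupies two bytes in UTF-8, so the wrong decoder
        // turns it into two unrelated characters and the string grows by one.
        System.out.println("original length: " + original.length());
        System.out.println("garbled length:  " + garbled.length());
    }
}
```

The bytes themselves are never damaged in this case. Only the interpretation is wrong, which is why fixing the encoding on both sides usually resolves the problem.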

One common result of this is characters not being recognised and being returned as ????.
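One way the question marks can appear, sketched with the standard library only: when text is encoded with a charset that cannot represent a character, `String.getBytes` substitutes that charset's replacement byte, which for US-ASCII is `?`. The class and method names here are hypothetical, and US-ASCII is chosen purely for the demonstration.

```java
import java.nio.charset.StandardCharsets;

public class QuestionMarkDemo {

    // Round-trips the text through US-ASCII; any character ASCII cannot
    // represent is replaced with '?' during encoding.
    public static String throughAscii(String text) {
        byte[] asciiBytes = text.getBytes(StandardCharsets.US_ASCII);
        return new String(asciiBytes, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        // Three Japanese characters, escaped so the source file encoding does not matter.
        String japanese = "\u65e5\u672c\u8a9e";
        System.out.println(throughAscii(japanese));   // each character becomes '?'
        System.out.println(throughAscii("plain"));    // ASCII text is unchanged
    }
}
```

Unlike a decoding mismatch, this substitution loses information: the original characters cannot be recovered from the question marks.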

If you use the search or extraction features, we recommend setting the following VM argument.

-Dfile.encoding=UTF-8

As of Java 18 this flag is no longer required, as UTF-8 is the default encoding.
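If you are unsure which encoding a running JVM is actually using, a small check like the following can help. It is a minimal sketch with hypothetical class and method names; it only reads standard system properties and the default charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class DefaultCharsetCheck {

    // True when the JVM's default charset is UTF-8.
    public static boolean defaultIsUtf8() {
        return StandardCharsets.UTF_8.equals(Charset.defaultCharset());
    }

    // Summarises the Java version and the encoding settings in effect.
    public static String describe() {
        return "java.version=" + System.getProperty("java.version")
                + ", default charset=" + Charset.defaultCharset().name()
                + ", file.encoding=" + System.getProperty("file.encoding");
    }

    public static void main(String[] args) {
        System.out.println(describe());
        if (!defaultIsUtf8()) {
            System.out.println("Default charset is not UTF-8; consider -Dfile.encoding=UTF-8");
        }
    }
}
```

The VM argument has to be passed when the JVM is launched, for example `java -Dfile.encoding=UTF-8 -cp . DefaultCharsetCheck`. Setting the property from code after startup is not a reliable substitute.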

A list of supported encodings can be found here.
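The encodings available on a particular JVM can also be queried at runtime. The sketch below uses the standard `Charset.availableCharsets()` and `Charset.isSupported(String)` calls; the class and method names are hypothetical.

```java
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.List;

public class SupportedCharsets {

    // Canonical names of every charset this JVM can use, in sorted order.
    public static List<String> names() {
        return new ArrayList<>(Charset.availableCharsets().keySet());
    }

    // True when the named charset (canonical name or alias) is available.
    public static boolean isSupported(String charsetName) {
        return Charset.isSupported(charsetName);
    }

    public static void main(String[] args) {
        for (String name : names()) {
            System.out.println(name);
        }
        System.out.println("UTF-8 supported: " + isSupported("UTF-8"));
    }
}
```

The exact list varies between Java versions and vendors, so it is worth running this on the same JVM that performs the extraction.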

