Converting Office Documents

Although BuildVu itself only converts PDF files to HTML or SVG, it can also handle Office documents such as Word, PowerPoint and Excel files by using LibreOffice to pre-convert them to PDF.

You will need LibreOffice installed in order to do this. If you are running LibreOffice on Linux you may find that some Office documents do not convert correctly if they use fonts that are not available on Linux. We recommend installing Google Noto Fonts to increase the likelihood that missing fonts will be substituted with a fallback.

BuildVu can convert Office documents using LibreOffice via the command line or from a Java class.

Command Line

Once LibreOffice is installed, simply pass in the absolute path to the LibreOffice executable as a system property to enable conversion of Office documents from the command line.

java -Dorg.jpedal.pdf2html.libreOfficeExecutablePath="/path/to/soffice" -jar buildvu.jar /inputDir/ /outputDir/
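
If BuildVu is launched from another Java program rather than from a shell, the same command can be assembled with ProcessBuilder. The following is a minimal sketch and not part of BuildVu: the class name BuildVuCommand and its build method are hypothetical, and it assumes that java is on the PATH and that buildvu.jar is in the working directory. It reuses only the system property shown above and rejects a relative LibreOffice path up front.

```java
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not a BuildVu class.
final class BuildVuCommand {

    /** Assembles the command line shown above; rejects a relative soffice path. */
    static List<String> build(String sofficePath, String inputDir, String outputDir) {
        if (!Paths.get(sofficePath).isAbsolute()) {
            throw new IllegalArgumentException(
                    "LibreOffice path must be absolute: " + sofficePath);
        }
        return Arrays.asList(
                "java",
                "-Dorg.jpedal.pdf2html.libreOfficeExecutablePath=" + sofficePath,
                "-jar", "buildvu.jar", // assumed to be in the working directory
                inputDir, outputDir);
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 3) {
            System.err.println("Usage: BuildVuCommand <soffice> <inputDir> <outputDir>");
            return;
        }
        // ProcessBuilder passes each argument as-is, so no shell quoting is needed.
        Process process = new ProcessBuilder(build(args[0], args[1], args[2]))
                .inheritIO()
                .start();
        System.exit(process.waitFor());
    }
}
```

Because each argument is passed to the process separately, paths containing spaces do not need the quotation marks used in the shell form.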

Conversion will fail if a PDF file with the same filename as the Office document already exists. To allow the existing PDF to be overwritten instead, add the following system property: -Dorg.jpedal.pdf2html.allowLibreOfficeOverwrite=true
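
To find such clashes before starting a conversion, the input directory can be scanned first. The sketch below is illustrative only and not part of BuildVu: the class PdfClashCheck is hypothetical, the extension list is an assumed sample rather than BuildVu's actual list of supported types, and it assumes that "same filename" means a PDF with the same base name in the same directory.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical pre-flight check for illustration; not a BuildVu class.
final class PdfClashCheck {

    // Assumed sample of Office extensions; the real supported list may differ.
    private static final Set<String> OFFICE_EXTENSIONS = new HashSet<>(
            Arrays.asList("doc", "docx", "ppt", "pptx", "xls", "xlsx"));

    /** Returns the Office files in inputDir that have a sibling PDF with the same base name. */
    static List<Path> findClashes(Path inputDir) throws IOException {
        List<Path> sorted;
        try (Stream<Path> files = Files.list(inputDir)) {
            sorted = files.sorted().collect(Collectors.toList());
        }
        List<Path> clashes = new ArrayList<>();
        for (Path file : sorted) {
            String name = file.getFileName().toString();
            int dot = name.lastIndexOf('.');
            if (dot <= 0 || !Files.isRegularFile(file)) {
                continue; // no extension, hidden file, or not a regular file
            }
            String extension = name.substring(dot + 1).toLowerCase(Locale.ROOT);
            if (!OFFICE_EXTENSIONS.contains(extension)) {
                continue;
            }
            Path pdf = file.resolveSibling(name.substring(0, dot) + ".pdf");
            if (Files.exists(pdf)) {
                clashes.add(file);
            }
        }
        return clashes;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("Usage: PdfClashCheck <inputDir>");
            return;
        }
        for (Path clash : findClashes(Paths.get(args[0]))) {
            System.out.println("PDF already exists for: " + clash);
        }
    }
}
```

Any file it reports can then be renamed or moved, or the overwrite property above can be set deliberately.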

Java Class

Once LibreOffice is installed, the DocumentToPDFConverter class can be used to invoke LibreOffice from your own code and pre-convert Office documents to PDF.

if (DocumentToPDFConverter.hasConvertibleFileType(document)) {
    try {
        DocumentToPDFConverter.convert(document, libreOfficeExecutablePath);

        // Code to convert generated PDF file to HTML5 goes here

    } catch (IOException e) {
        // Problem occurred - see Javadoc for reasons
    } catch (InterruptedException e) {
        // Process was interrupted
    }
} else {
    // File type not supported
}

We recommend reading the Javadoc for the DocumentToPDFConverter class. Its source code is also available to view online.