Idiomatic Rust bindings for Pdfium

pdfium-render provides an idiomatic high-level Rust interface around the low-level bindings to Pdfium exposed by the excellent pdfium-sys crate.

```
// Renders each page in the given test PDF file to a separate JPEG file.

use pdfium_render::prelude::*;

// Bind to the system-provided Pdfium library.

let pdfium = Pdfium::new(Pdfium::bind_to_system_library().unwrap());

// Load a PDF file with no password protection.

let document = pdfium.load_pdf_from_file("test.pdf", None).unwrap();

// Set bitmap rendering options that will apply to all pages.

let bitmap_render_config = PdfBitmapConfig::new()
    .set_target_width(2000)
    .set_maximum_height(2000)
    .rotate_if_landscape(PdfBitmapRotation::Degrees90, true);

// Render each page to a bitmap image, then export each image to a JPEG file.

document.pages().iter().for_each(|page| {
    page.get_bitmap_with_config(&bitmap_render_config).unwrap()
        .as_image() // Renders this page to an Image::DynamicImage
        .as_rgba8().unwrap()
        .save_with_format(
          format!("test-page-{}.jpg", page.index()),
          image::ImageFormat::Jpeg
        ).unwrap();
});

```

In addition to providing a more natural interface to Pdfium, pdfium-render differs from pdfium-sys in several other important ways:

Examples demonstrating page rendering, text extraction, page object introspection, and compiling to WASM are available at https://github.com/ajrcarey/pdfium-render/tree/master/examples.

What's new

Versions 0.5.8, 0.5.9, and 0.6.0 are bug fix releases.

Version 0.5.7 added the ability to bind to a build of Pdfium that has been statically linked into the final Rust executable.

Version 0.5.6 added the pdfium_render::prelude, added bindings to Pdfium's FPDFAnnot_*() and FPDFPage_*Annot*() functions, and added the PdfPageAnnotations collection and PdfPageAnnotation enum to the pdfium-render high-level interface. Not all annotation-related functionality is currently available through the high-level interface; this will be added gradually over time.

Porting existing Pdfium code from other languages

The high-level idiomatic Rust interface provided by the Pdfium struct is entirely optional. The Pdfium struct wraps the raw FFI bindings defined in the PdfiumLibraryBindings trait, and you can use those FFI bindings directly instead of the high-level interface. This makes porting existing code that calls FPDF_* functions straightforward, while still gaining the benefits of late binding and WASM compatibility. For instance, the following code snippet (taken from a C++ sample):

```
string test_doc = "test.pdf";

FPDF_InitLibrary();
FPDF_DOCUMENT doc = FPDF_LoadDocument(test_doc.c_str(), NULL);
// ... do something with doc
FPDF_CloseDocument(doc);
FPDF_DestroyLibrary();

```

would translate to the following Rust code:

```
let bindings = Pdfium::bind_to_system_library().unwrap();

let test_doc = "test.pdf";

bindings.FPDF_InitLibrary();
let doc = bindings.FPDF_LoadDocument(test_doc, None);
// ... do something with doc
bindings.FPDF_CloseDocument(doc);
bindings.FPDF_DestroyLibrary();

```

Pdfium's API uses three different string types: classic C-style null-terminated char arrays, UTF-8 byte arrays, and a UTF-16LE byte array type named FPDF_WIDESTRING. For functions that take a C-style string or a UTF-8 byte array, pdfium-render's binding will take the standard Rust &str type. For functions that take an FPDF_WIDESTRING, pdfium-render exposes two functions: the vanilla FPDF_*() function that takes an FPDF_WIDESTRING, and an additional FPDF_*_str() helper function that takes a standard Rust &str and converts it internally to an FPDF_WIDESTRING before calling Pdfium. Examples of functions with additional _str() helpers include FPDFBookmark_Find(), FPDFAnnot_SetStringValue(), and FPDFText_SetText().

The PdfiumLibraryBindings::get_pdfium_utf16le_bytes_from_str() and PdfiumLibraryBindings::get_string_from_pdfium_utf16le_bytes() utility functions are provided for converting to and from FPDF_WIDESTRING in your own code.
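To make the FPDF_WIDESTRING conversion concrete, the following is a minimal standard-library sketch of the kind of round trip involved. The function names `utf16le_bytes_from_str` and `string_from_utf16le_bytes` are hypothetical stand-ins chosen for illustration, and the sketch assumes a two-byte null terminator; it is not the crate's own implementation.

```rust
// Illustrative only: hypothetical helpers, not pdfium-render's implementation.

/// Encodes a Rust string as UTF-16LE bytes, followed by a two-byte null
/// terminator (an assumption made for this sketch).
fn utf16le_bytes_from_str(s: &str) -> Vec<u8> {
    let mut bytes: Vec<u8> = s
        .encode_utf16()
        .flat_map(|unit| unit.to_le_bytes())
        .collect();
    bytes.extend_from_slice(&[0, 0]);
    bytes
}

/// Decodes UTF-16LE bytes back into a String, stopping at the first null
/// code unit. Returns None if the bytes are not valid UTF-16.
fn string_from_utf16le_bytes(bytes: &[u8]) -> Option<String> {
    let units: Vec<u16> = bytes
        .chunks_exact(2)
        .map(|pair| u16::from_le_bytes([pair[0], pair[1]]))
        .take_while(|&unit| unit != 0)
        .collect();
    String::from_utf16(&units).ok()
}

fn main() {
    let encoded = utf16le_bytes_from_str("Hi");
    println!("{:?}", encoded); // prints [72, 0, 105, 0, 0, 0]

    let decoded = string_from_utf16le_bytes(&encoded);
    println!("{:?}", decoded); // prints Some("Hi")
}
```

In real code, prefer the utility functions provided by PdfiumLibraryBindings, since they reflect whatever conventions the crate actually uses.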

Note that the FPDF_LoadDocument() function is not available when compiling to WASM. Either embed the target PDF document directly using Rust's include_bytes!() macro, or use JavaScript's fetch() API to retrieve the bytes of the target document over the network, then load those bytes into Pdfium using the FPDF_LoadMemDocument() function.

Binding to Pdfium

pdfium-render does not include Pdfium itself. You have several options:

When compiling to WASM, packaging an external build of Pdfium as a separate WASM module is essential.

Dynamic linking

Binding to a dynamically-built Pdfium library is the simplest option. On Android, a system-provided libpdfium.so is packaged as part of the operating system; alternatively, you can package a pre-built dynamic library appropriate for your operating system alongside your Rust executable.
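If you package a dynamic library alongside your executable, you need to know which file name to look for on each platform. The sketch below uses only the standard library to build that name and check for it next to the running executable. The helper names are hypothetical, and the sketch assumes the library's base name is `pdfium` (giving libpdfium.so on Linux, for instance); it does not call into pdfium-render at all.

```rust
use std::env::consts::{DLL_PREFIX, DLL_SUFFIX};
use std::path::{Path, PathBuf};

/// Builds the platform-specific file name for a dynamic Pdfium library,
/// assuming the base name "pdfium" (for example, libpdfium.so on Linux).
fn pdfium_library_file_name() -> String {
    format!("{}pdfium{}", DLL_PREFIX, DLL_SUFFIX)
}

/// Returns the path the library would have if it were placed in `dir`.
fn pdfium_library_path_in(dir: &Path) -> PathBuf {
    dir.join(pdfium_library_file_name())
}

fn main() {
    // Use the directory containing the running executable, falling back to
    // the current directory if it cannot be determined.
    let exe_dir = std::env::current_exe()
        .ok()
        .and_then(|exe| exe.parent().map(Path::to_path_buf))
        .unwrap_or_else(|| PathBuf::from("."));

    let candidate = pdfium_library_path_in(&exe_dir);

    if candidate.exists() {
        println!("Found a bundled library at {}", candidate.display());
    } else {
        println!(
            "No bundled library at {}; a system library would be needed",
            candidate.display()
        );
    }
}
```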

At the time of writing, the WASM builds at https://github.com/bblanchon/pdfium-binaries/releases are compiled with a non-growable WASM heap memory allocator. This means that attempting to open a PDF document longer than just a few pages will result in an unrecoverable out-of-memory error. The WASM builds at https://github.com/paulocoutinhox/pdfium-lib/releases are recommended as they do not have this problem.

Static linking

If you prefer to link Pdfium directly into your executable at compile time, use the optional static crate feature. This enables the Pdfium::bind_to_statically_linked_library() function, which binds directly to the Pdfium functions included in your executable:

```
use pdfium_render::prelude::*;

let pdfium = Pdfium::new(Pdfium::bind_to_statically_linked_library().unwrap());

```

As a convenience, pdfium-render can instruct cargo to link a statically-built Pdfium library for you. Set the path to the directory containing your pre-built library using the PDFIUM_STATIC_LIB_PATH environment variable when you run cargo build, like so:

PDFIUM_STATIC_LIB_PATH="/path/containing/your/static/pdfium/library" cargo build

pdfium-render will pass the following flags to cargo:

cargo:rustc-link-lib=static=pdfium
cargo:rustc-link-search=native=$PDFIUM_STATIC_LIB_PATH

This saves you writing a custom build.rs yourself. If you have your own build pipeline that links Pdfium statically into your executable, simply leave the PDFIUM_STATIC_LIB_PATH environment variable unset.
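For readers who do maintain their own build pipeline, a custom build.rs with equivalent behaviour might look roughly like the sketch below. It is an illustration under stated assumptions, not the build script that ships with pdfium-render: the `link_directives` helper is hypothetical, and the `rerun-if-env-changed` line is an addition that the text above does not mention.

```rust
// build.rs (hypothetical sketch, not pdfium-render's actual build script)
use std::env;

/// Returns the cargo directives to emit for an optional static library
/// directory. An unset or empty directory yields no directives.
fn link_directives(lib_dir: Option<&str>) -> Vec<String> {
    match lib_dir {
        Some(dir) if !dir.is_empty() => vec![
            "cargo:rustc-link-lib=static=pdfium".to_string(),
            format!("cargo:rustc-link-search=native={}", dir),
        ],
        _ => Vec::new(),
    }
}

fn main() {
    // Ask cargo to re-run this script when the variable changes
    // (an extra convenience assumed for this sketch).
    println!("cargo:rerun-if-env-changed=PDFIUM_STATIC_LIB_PATH");

    let lib_dir = env::var("PDFIUM_STATIC_LIB_PATH").ok();

    // Cargo reads these directives from the build script's standard output.
    for directive in link_directives(lib_dir.as_deref()) {
        println!("{}", directive);
    }
}
```

Keeping the directive logic in a small pure function makes it easy to check both the set and unset cases without running a full build.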

Note that the path you set in PDFIUM_STATIC_LIB_PATH should not include the filename of the library itself; it should just be the path of the containing directory. You must make sure your statically-built library is named in the appropriate way for your target platform (libpdfium.a on Linux and macOS, for example) in order for the Rust compiler to locate it.
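A build script could also check up front that the directory contains a file with the expected name, which gives a clearer message than a linker error. The sketch below is one hypothetical way to do that. It assumes the common naming conventions (libpdfium.a in general, pdfium.lib for MSVC targets) and reads the `CARGO_CFG_TARGET_ENV` variable that cargo sets for build scripts; the helper names are invented for illustration.

```rust
use std::path::{Path, PathBuf};

/// Returns the file name a linker would conventionally look for when asked
/// to link a static library called "pdfium" (assumed conventions).
fn static_library_file_name(target_env: &str) -> &'static str {
    if target_env == "msvc" {
        "pdfium.lib"
    } else {
        "libpdfium.a"
    }
}

/// Returns the full path at which the static library is expected inside `dir`.
fn expected_static_library(dir: &Path, target_env: &str) -> PathBuf {
    dir.join(static_library_file_name(target_env))
}

fn main() {
    // Cargo sets CARGO_CFG_TARGET_ENV for build scripts; default to an empty
    // string when running outside cargo.
    let target_env = std::env::var("CARGO_CFG_TARGET_ENV").unwrap_or_default();

    if let Ok(dir) = std::env::var("PDFIUM_STATIC_LIB_PATH") {
        let expected = expected_static_library(Path::new(&dir), &target_env);
        if !expected.is_file() {
            eprintln!(
                "warning: expected a static Pdfium library at {}",
                expected.display()
            );
        }
    }
}
```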

pdfium-render will not build Pdfium for you; you must build Pdfium yourself, or source a pre-built static archive from elsewhere.

Compiling to WASM

See https://github.com/ajrcarey/pdfium-render/tree/master/examples for a full example that shows how to bundle a Rust application using pdfium-render alongside a pre-built Pdfium WASM module for inspection and rendering of PDF files in a web browser.

Optional features

This crate provides the following optional features:

Neither feature is enabled by default.

Development status

The initial focus of this crate has been on rendering pages in a PDF file; consequently, FPDF_* functions related to bitmaps and rendering have been prioritised. By 1.0, the functionality of all FPDF_* functions exported by all Pdfium modules will be available, with the exception of certain functions specific to interactive scripting, user interaction, and printing.

There are 368 FPDF_* functions in the Pdfium API. As of version 0.6.0, 187 (51%) have bindings available in pdfium-render, with the functionality of roughly two-thirds of these available via the high-level interface.

If you need a binding to a Pdfium function that is not currently available, just raise an issue.

Version history