Idiomatic Rust bindings for Pdfium

pdfium-render provides an idiomatic high-level Rust interface around the low-level bindings to Pdfium exposed by the excellent pdfium-sys crate.

```
// Renders each page in the given test PDF file to a separate JPEG file.

use pdfium_render::prelude::*;

// Bind to the system-provided Pdfium library.

let pdfium = Pdfium::new(Pdfium::bind_to_system_library().unwrap());

// Load a PDF file.

let document = pdfium.load_pdf_from_file("test.pdf", None).unwrap();

// Set our desired bitmap rendering options.

let bitmap_render_config = PdfBitmapConfig::new()
    .set_target_width(2000)
    .set_maximum_height(2000)
    .rotate_if_landscape(PdfBitmapRotation::Degrees90, true);

// Render each page to a bitmap image, then export each image to a JPEG file.

document.pages().iter().for_each(|page| {
    page.get_bitmap_with_config(&bitmap_render_config).unwrap()
        .as_image() // Renders this page to an image::DynamicImage
        .as_rgba8().unwrap()
        .save_with_format(
          format!("test-page-{}.jpg", page.index()),
          image::ImageFormat::Jpeg
        ).unwrap();
});

```

In addition to providing a more natural interface to Pdfium, pdfium-render differs from pdfium-sys in several other important ways:

Examples demonstrating page rendering, text extraction, page object introspection, and compiling to WASM are available at https://github.com/ajrcarey/pdfium-render/tree/master/examples.

What's new

Version 0.5.6 adds the pdfium_render::prelude, adds bindings to Pdfium's FPDFAnnot_*() and FPDFPage_*Annot*() functions, and adds the PdfPageAnnotations collection and PdfPageAnnotation enum to the pdfium-render high-level interface. Not all annotation-related functionality is currently available through the high-level interface; this will be added gradually over time.

Porting existing Pdfium code from other languages

The high-level idiomatic Rust interface provided by the Pdfium struct is entirely optional. The Pdfium struct wraps raw FFI bindings defined in the PdfiumLibraryBindings trait, and you can use those FFI bindings directly instead of the high-level interface. This makes porting existing code that calls FPDF_* functions trivial, while still gaining the benefits of late binding and WASM compatibility. For instance, the following code snippet (taken from a C++ sample):

```
string test_doc = "test.pdf";

FPDF_InitLibrary();
FPDF_DOCUMENT doc = FPDF_LoadDocument(test_doc, NULL);
// ... do something with doc
FPDF_CloseDocument(doc);
FPDF_DestroyLibrary();

```

would translate to the following Rust code:

```
let bindings = Pdfium::bind_to_system_library().unwrap();

let test_doc = "test.pdf";

bindings.FPDF_InitLibrary();
let doc = bindings.FPDF_LoadDocument(test_doc, None);
// ... do something with doc
bindings.FPDF_CloseDocument(doc);
bindings.FPDF_DestroyLibrary();

```

Pdfium's API uses three different string types: classic C-style null-terminated char arrays, UTF-8 byte arrays, and a UTF-16LE byte array type named FPDF_WIDESTRING. For functions that take a C-style string or a UTF-8 byte array, pdfium-render's binding will take the standard Rust &str type. For functions that take an FPDF_WIDESTRING, pdfium-render exposes two functions: the vanilla FPDF_*() function that takes an FPDF_WIDESTRING, and an additional FPDF_*_str() helper function that takes a standard Rust &str and converts it internally to an FPDF_WIDESTRING before calling Pdfium. Examples of functions with additional _str() helpers include FPDFBookmark_Find(), FPDFAnnot_SetStringValue(), and FPDFText_SetText().

The PdfiumLibraryBindings::get_pdfium_utf16le_bytes_from_str() and PdfiumLibraryBindings::get_string_from_pdfium_utf16le_bytes() functions are provided for converting to and from UTF-16LE in your own code.
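
To make the conversion concrete, here is a minimal standalone sketch of what encoding to and decoding from UTF-16LE involves, using only the Rust standard library. The helper names `utf16le_bytes_from_str` and `string_from_utf16le_bytes` are hypothetical and are not part of pdfium-render. The two-byte NUL terminator is an assumption about what a wide-string consumer expects; the crate's own functions may differ in detail.

```rust
/// Hypothetical helper (not a pdfium-render function): encodes a &str as
/// UTF-16LE bytes. A two-byte NUL terminator is appended on the assumption
/// that the consumer expects a terminated wide string.
fn utf16le_bytes_from_str(s: &str) -> Vec<u8> {
    let mut bytes: Vec<u8> = s
        .encode_utf16()
        .flat_map(|unit| unit.to_le_bytes())
        .collect();
    bytes.extend_from_slice(&[0, 0]);
    bytes
}

/// Hypothetical helper (not a pdfium-render function): decodes UTF-16LE
/// bytes back into a String, stopping at the first NUL code unit.
/// Returns None if the bytes are not valid UTF-16.
fn string_from_utf16le_bytes(bytes: &[u8]) -> Option<String> {
    let units: Vec<u16> = bytes
        .chunks_exact(2)
        .map(|pair| u16::from_le_bytes([pair[0], pair[1]]))
        .take_while(|&unit| unit != 0)
        .collect();
    String::from_utf16(&units).ok()
}

fn main() {
    let bytes = utf16le_bytes_from_str("Hi");
    // 'H' = 0x0048 and 'i' = 0x0069, each stored low byte first.
    assert_eq!(bytes, vec![0x48, 0x00, 0x69, 0x00, 0x00, 0x00]);
    assert_eq!(string_from_utf16le_bytes(&bytes).as_deref(), Some("Hi"));
}
```

In real code, prefer the conversion functions on PdfiumLibraryBindings named above, so that the byte layout matches whatever the bindings expect.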

Note that the FPDF_LoadDocument() function is not available when compiling to WASM. Either embed the target PDF document directly using Rust's include_bytes!() macro, or use JavaScript's fetch() API to retrieve the bytes of the target document over the network, then load those bytes into Pdfium using the FPDF_LoadMemDocument() function.

External Pdfium builds

pdfium-render does not include Pdfium itself. You can either bind to a system-provided library or package an external build of Pdfium alongside your Rust application. When compiling to WASM, packaging an external build of Pdfium as a separate WASM module is essential.

Compiling to WASM

See https://github.com/ajrcarey/pdfium-render/tree/master/examples for a full example that shows how to bundle a Rust application using pdfium-render alongside a pre-built Pdfium WASM module for inspection and rendering of PDF files in a web browser.

Development status

The initial focus of this crate has been on rendering pages in a PDF file; consequently, FPDF_* functions related to bitmaps and rendering have been prioritised. By 1.0, the functionality of all FPDF_* functions exported by all Pdfium modules will be available, with the exception of certain functions specific to interactive scripting, user interaction, and printing.

There are 368 FPDF_* functions in the Pdfium API. As of version 0.5.6, 187 (51%) have bindings available in pdfium-render, with the functionality of roughly two-thirds of these available via the high-level interface.

If you need a binding to a Pdfium function that is not currently available, just raise an issue.

Version history