`pdfium-render` provides an idiomatic high-level Rust interface around the low-level bindings to Pdfium exposed by the excellent `pdfium-sys` crate.
```rust
// Renders each page in the given test PDF file to a separate JPEG file.

use pdfium_render::prelude::*;

// Bind to the system-provided Pdfium library.
let pdfium = Pdfium::new(Pdfium::bind_to_system_library().unwrap());

// Load a PDF file.
let document = pdfium.load_pdf_from_file("test.pdf", None).unwrap();

// Set our desired bitmap rendering options.
let bitmap_render_config = PdfBitmapConfig::new()
    .set_target_width(2000)
    .set_maximum_height(2000)
    .rotate_if_landscape(PdfBitmapRotation::Degrees90, true);

// Render each page to a bitmap image, then export each image to a JPEG file.
document.pages().iter().for_each(|page| {
    page.get_bitmap_with_config(&bitmap_render_config).unwrap()
        .as_image() // Renders this page to an image::DynamicImage
        .as_rgba8().unwrap()
        .save_with_format(
            format!("test-page-{}.jpg", page.index()),
            image::ImageFormat::Jpeg
        ).unwrap();
});
```
In addition to providing a more natural interface to Pdfium, `pdfium-render` differs from `pdfium-sys` in several other important ways:

- `pdfium-render` uses `libloading` to late bind to a Pdfium library at run-time, whereas `pdfium-sys` binds to a library at compile-time. By binding to Pdfium at run-time instead of compile-time, `pdfium-render` can dynamically switch between bundled libraries and system libraries and provide idiomatic Rust error handling in situations where a Pdfium library is not available.
- `pdfium-render` can be compiled to WASM for running in a browser; this is not possible with `pdfium-sys`.
- `pdfium-sys` only provides bindings for the subset of functions exposed by Pdfium's view module; `pdfium-render` aims to ultimately provide bindings to all non-interactive functions exposed by all Pdfium modules, including document creation and editing functions. This is a work in progress.
- Rendered pages can be exported as `image::DynamicImage` instances for easy, idiomatic post-processing.

Examples demonstrating page rendering, text extraction, page object introspection, and compiling to WASM are available at https://github.com/ajrcarey/pdfium-render/tree/master/examples.
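
The run-time fallback between a bundled library and a system library mentioned above can be pictured with ordinary `Result` combinators. The following is only a minimal sketch using the standard library: `BindError`, `BoundLibrary`, `bind_at()`, and `bind_bundled_or_system()` are hypothetical names invented for illustration and are not part of the `pdfium-render` API, and the "binding" step is simulated by looking up a name in a list.

```rust
/// Hypothetical error describing why a library could not be bound.
#[derive(Debug, PartialEq)]
enum BindError {
    NotFound(String),
}

/// Hypothetical handle to a successfully bound library.
#[derive(Debug, PartialEq)]
struct BoundLibrary {
    location: String,
}

/// Pretends to bind to the library at `location`. A real implementation would
/// ask the operating system to load it; here `installed` lists what "exists".
fn bind_at(location: &str, installed: &[&str]) -> Result<BoundLibrary, BindError> {
    if installed.iter().any(|candidate| *candidate == location) {
        Ok(BoundLibrary {
            location: location.to_string(),
        })
    } else {
        Err(BindError::NotFound(location.to_string()))
    }
}

/// Prefers a bundled library and falls back to the system-provided one.
/// If both attempts fail, the error from the last attempt is returned.
fn bind_bundled_or_system(installed: &[&str]) -> Result<BoundLibrary, BindError> {
    bind_at("bundled", installed).or_else(|_| bind_at("system", installed))
}
```

Because the failure is an ordinary `Err` value rather than a link-time error, the caller can decide whether to retry elsewhere, report the problem, or abort.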

Version 0.5.6 adds the `pdfium_render::prelude`, adds bindings to Pdfium's `FPDFAnnot_*()` and `FPDFPage_*Annot*()` functions, and adds the `PdfPageAnnotations` collection and `PdfPageAnnotation` enum to the `pdfium-render` high-level interface. Not all annotation-related functionality is currently available through the high-level interface; this will be added gradually over time.

The high-level idiomatic Rust interface provided by the `Pdfium` struct is entirely optional; the `Pdfium` struct wraps around raw FFI bindings defined in the `PdfiumLibraryBindings` trait, and it is completely feasible to simply use the FFI bindings directly instead of the high-level interface. This makes porting existing code that calls `FPDF_*` functions trivial, while still gaining the benefits of late binding and WASM compatibility.
For instance, the following code snippet (taken from a C++ sample):
```cpp
string test_doc = "test.pdf";

FPDF_InitLibrary();
// FPDF_LoadDocument() expects a C-style string, hence c_str().
FPDF_DOCUMENT doc = FPDF_LoadDocument(test_doc.c_str(), NULL);
// ... do something with doc
FPDF_CloseDocument(doc);
FPDF_DestroyLibrary();
```
would translate to the following Rust code:
```rust
let bindings = Pdfium::bind_to_system_library().unwrap();

let test_doc = "test.pdf";

bindings.FPDF_InitLibrary();
let doc = bindings.FPDF_LoadDocument(test_doc, None);
// ... do something with doc
bindings.FPDF_CloseDocument(doc);
bindings.FPDF_DestroyLibrary();
```
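
The layering described above, a high-level struct wrapping a trait of raw bindings, can be illustrated with a toy model. This is a sketch under stated assumptions, not the crate's actual design: `RawBindings`, `RecordingBindings`, `Library`, `raw()`, and `with_document()` are all hypothetical names, document handles are plain integers, and the "bindings" only record which calls were made.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical stand-in for a raw bindings trait: one method per C function.
trait RawBindings {
    fn load_document(&self, path: &str) -> u32;
    fn close_document(&self, handle: u32);
}

/// Fake bindings that log each call instead of calling into a real library.
struct RecordingBindings {
    calls: Rc<RefCell<Vec<String>>>,
}

impl RawBindings for RecordingBindings {
    fn load_document(&self, path: &str) -> u32 {
        self.calls.borrow_mut().push(format!("load {}", path));
        1 // Pretend document handle.
    }

    fn close_document(&self, handle: u32) {
        self.calls.borrow_mut().push(format!("close {}", handle));
    }
}

/// Hypothetical high-level wrapper that owns a set of raw bindings.
struct Library {
    bindings: Box<dyn RawBindings>,
}

impl Library {
    fn new(bindings: Box<dyn RawBindings>) -> Self {
        Library { bindings }
    }

    /// Low-level access: callers may use the raw bindings directly.
    fn raw(&self) -> &dyn RawBindings {
        self.bindings.as_ref()
    }

    /// High-level convenience: loads a document, runs `f`, then closes it.
    fn with_document<T>(&self, path: &str, f: impl FnOnce(u32) -> T) -> T {
        let handle = self.bindings.load_document(path);
        let result = f(handle);
        self.bindings.close_document(handle);
        result
    }
}
```

In this model, code ported from C can keep its explicit load and close calls through `raw()`, while new code can use `with_document()` and never handle the close call itself.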

Pdfium's API uses three different string types: classic C-style null-terminated char arrays, UTF-8 byte arrays, and a UTF-16LE byte array type named `FPDF_WIDESTRING`. For functions that take a C-style string or a UTF-8 byte array, `pdfium-render`'s binding will take the standard Rust `&str` type. For functions that take an `FPDF_WIDESTRING`, `pdfium-render` exposes two functions: the vanilla `FPDF_*()` function that takes an `FPDF_WIDESTRING`, and an additional `FPDF_*_str()` helper function that takes a standard Rust `&str` and converts it internally to an `FPDF_WIDESTRING` before calling Pdfium. Examples of functions with additional `_str()` helpers include `FPDFBookmark_Find()`, `FPDFAnnot_SetStringValue()`, and `FPDFText_SetText()`.

The `PdfiumLibraryBindings::get_pdfium_utf16le_bytes_from_str()` and `PdfiumLibraryBindings::get_string_from_pdfium_utf16le_bytes()` functions are provided for converting to and from UTF-16LE in your own code.
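
To make the UTF-16LE round trip concrete, here is a minimal standard-library sketch of what such a conversion involves. It is not the crate's implementation: the function names `utf16le_bytes_from_str()` and `string_from_utf16le_bytes()` are hypothetical, and the sketch assumes that the byte array should end with a two-byte NUL terminator.

```rust
/// Encodes a Rust string as UTF-16LE bytes followed by a two-byte NUL terminator.
fn utf16le_bytes_from_str(s: &str) -> Vec<u8> {
    let mut bytes = Vec::with_capacity((s.len() + 1) * 2);
    for unit in s.encode_utf16() {
        // Each 16-bit code unit is written low byte first (little-endian).
        bytes.extend_from_slice(&unit.to_le_bytes());
    }
    bytes.extend_from_slice(&[0, 0]);
    bytes
}

/// Decodes UTF-16LE bytes into a String, stopping at the first NUL code unit.
/// Returns None for an odd number of bytes or for invalid UTF-16.
fn string_from_utf16le_bytes(bytes: &[u8]) -> Option<String> {
    if bytes.len() % 2 != 0 {
        return None;
    }
    let units: Vec<u16> = bytes
        .chunks_exact(2)
        .map(|pair| u16::from_le_bytes([pair[0], pair[1]]))
        .take_while(|&unit| unit != 0)
        .collect();
    String::from_utf16(&units).ok()
}
```

For example, `"Hi"` encodes to the six bytes `48 00 69 00 00 00`, and decoding those bytes yields `"Hi"` again. Characters outside the Basic Multilingual Plane occupy two code units (four bytes) each.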

Note that the `FPDF_LoadDocument()` function is not available when compiling to WASM. Either embed the target PDF document directly using Rust's `include_bytes!()` macro, or use JavaScript's `fetch()` API to retrieve the bytes of the target document over the network, then load those bytes into Pdfium using the `FPDF_LoadMemDocument()` function.

`pdfium-render` does not include Pdfium itself. You can either bind to a system-provided library or package an external build of Pdfium alongside your Rust application. When compiling to WASM, packaging an external build of Pdfium as a separate WASM module is essential.

Pre-built Pdfium libraries that can be used with `pdfium-render`: https://github.com/paulo-coutinho/pdfium-lib/releases

See https://github.com/ajrcarey/pdfium-render/tree/master/examples for a full example that shows how to bundle a Rust application using `pdfium-render` alongside a pre-built Pdfium WASM module for inspection and rendering of PDF files in a web browser.

The initial focus of this crate has been on rendering pages in a PDF file; consequently, `FPDF_*` functions related to bitmaps and rendering have been prioritised. By 1.0, the functionality of all `FPDF_*` functions exported by all Pdfium modules will be available, with the exception of certain functions specific to interactive scripting, user interaction, and printing.

There are 368 `FPDF_*` functions in the Pdfium API. As of version 0.5.6, 187 (51%) have bindings available in `pdfium-render`, with the functionality of roughly two-thirds of these available via the high-level interface.

If you need a binding to a Pdfium function that is not currently available, just raise an issue.

- 0.5.6: adds `pdfium_render::prelude`, adds bindings for `FPDFAnnot_*()` and `FPDFPage_*Annot*()` functions, adds the `PdfPageAnnotations` collection and `PdfPageAnnotation` enum to the high-level interface.
- Set `PdfBitmapConfig::set_reverse_byte_order()` to `true` to switch from Pdfium's default BGRA8 pixel format to RGBA8. This is necessary since the `image` crate dropped support for BGRA8 in version 0.24. See https://github.com/ajrcarey/pdfium-render/issues/9 for more information.
- Adds bindings for `FPDFBookmark_*()`, `FPDFPageObj_*()`, `FPDFText_*()`, and `FPDFFont_*()` functions, adds `PdfPageObjects`, `PdfPageText`, and `PdfBookmarks` collections to the high-level interface.
- Adds bindings for `FPDF_GetPageBoundingBox()`, `FPDFDoc_GetPageMode()`, `FPDFPage_Get*Box()`, and `FPDFPage_Set*Box()` functions, adds `PdfPageBoundaries` collection to the high-level interface.
- Adds bindings for `FPDFPage_GetRotation()` and `FPDFPage_SetRotation()` functions, adds `PdfMetadata` collection to the high-level interface.
- `PdfBitmapConfig`
implementation