High-level Rust bindings for Tesseract and Leptonica.
Low-level C API bindings are auto-generated using bindgen.
Make sure you have Leptonica and Tesseract installed.
For Ubuntu users:
```bash
sudo apt-get install libleptonica-dev libtesseract-dev
```
You will also need to install tesseract language data based on your OCR needs:
```bash
sudo apt-get install tesseract-ocr-eng
```
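To confirm that the language data is visible to Tesseract, a quick check along these lines may help. This is only a sketch: it assumes the `tesseract` command-line tool is on your `PATH` (on Ubuntu it ships in the separate `tesseract-ocr` package, which the commands above do not install), and the `langs_report` variable name is just for illustration.

```bash
# Assumption: the `tesseract` CLI is installed; the bindings themselves
# only need the development libraries, so the CLI may be absent.
if command -v tesseract >/dev/null 2>&1; then
    # --list-langs prints the languages with installed trained data.
    langs_report=$(tesseract --list-langs 2>&1) \
        || langs_report="tesseract could not list its language data"
else
    langs_report="tesseract CLI not found; skipping language check"
fi
echo "$langs_report"
```

If `eng` does not appear in the output, the `tesseract-ocr-eng` package is probably missing or installed where Tesseract does not look for it.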
Minimal example:
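```rust
use std::path::Path;

// Create a Tesseract API handle using the English language data.
let mut api = tesseract::TessApi::new(None, "eng");
// Load the page image through Leptonica.
let mut pix = leptonica::pix_read(Path::new("path/page.bmp")).unwrap();
api.set_image(&pix);

// Run OCR and print the recognized text.
println!("{}", api.get_utf8_text().unwrap());

// Release the underlying C resources.
api.destroy();
pix.destroy();
```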
For more examples, see the `examples` directory.
Regenerate the `capi` bindings:
```bash
make gen
```
To run the tests, you will need Tesseract 4.x to match the trained data in `tests/tessdata/eng.traineddata`. See the CircleCI config for how to replicate the setup.
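Before running the tests, it can be useful to check which Tesseract version the build will pick up. The sketch below assumes that `pkg-config` is installed and that the Tesseract development package registers itself under the module name `tesseract`. The `version` and `status` variables are illustrative names, not part of this project.

```bash
# Assumption: libtesseract-dev provides a pkg-config module named "tesseract".
# If pkg-config or the module is missing, `version` is left empty.
version=$(pkg-config --modversion tesseract 2>/dev/null || true)

case "$version" in
    4.*) status="ok: Tesseract $version matches the 4.x test data" ;;
    "")  status="unknown: pkg-config could not find tesseract" ;;
    *)   status="mismatch: found Tesseract $version, tests expect 4.x" ;;
esac
echo "$status"
```

If the status is `ok`, running the test suite with `cargo test` should use a compatible library. Otherwise, the CircleCI config is the reference for the expected setup.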