
Attempts to detect the character encoding of raw text using the uchardet library.

To add it to your project, add the following lines to your Cargo.toml file:

    [dependencies.uchardet]
    git = "git://github.com/emk/rust-uchardet"

To run it:

```rust
// At the top of the file.
extern crate uchardet;
use uchardet::EncodingDetector;

// Inside a function.
assert_eq!(Some("UTF-8".to_string()),
           EncodingDetector::detect("français".as_bytes()).unwrap());
```
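The example above calls `unwrap()`, which panics if detection fails. Outside of tests you may prefer to handle each outcome explicitly. The following is a minimal sketch, assuming that `EncodingDetector::detect` returns a `Result<Option<String>, E>` whose error type implements `Debug` (inferred from the example above, not confirmed against the API documentation). The helper `describe_detection` is hypothetical and not part of this library.

```rust
use std::fmt::Debug;

/// Hypothetical helper, not part of rust-uchardet: turns a detection
/// result into a human-readable message without panicking.
///
/// Assumed shape of the result (inferred from the README example):
///   Ok(Some(name)) - an encoding was detected
///   Ok(None)       - detection ran but found no candidate
///   Err(e)         - detection itself failed
fn describe_detection<E: Debug>(result: Result<Option<String>, E>) -> String {
    match result {
        Ok(Some(name)) => format!("detected {}", name),
        Ok(None) => "encoding unknown".to_string(),
        Err(e) => format!("detection failed: {:?}", e),
    }
}
```

Under that assumption, you would call it as `describe_detection(EncodingDetector::detect(bytes))` and log or display the returned string.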

API documentation is available.

Are you looking for a Rust wrapper for cld2 for detecting languages? I'm currently working on one and hope to publish it shortly.

Getting uchardet (usually optional)

If you wish, you may install uchardet using your system package manager. For example, under Ubuntu, you can run:

    sudo apt-get install libuchardet-dev

If you skip this step, Cargo will attempt to compile uchardet from the bundled source code instead. This will probably work only on Linux machines with CMake installed, but pull requests to improve this are eagerly welcomed.

License

New code in the rust-uchardet library is released into the public domain, as described in the UNLICENSE file. However, several pre-existing pieces have their own licenses: