unicode-bom


Unicode byte-order mark detection for Rust projects.

What does it do?

unicode-bom will read the first few bytes from an array or a file on disk, then determine whether a byte-order mark is present.
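As a rough illustration of what detection involves, here is a minimal standard-library sketch that compares the start of a byte slice against the well-known UTF signatures. It is not the crate's implementation: `DetectedBom` and `detect_bom` are hypothetical names used only for this example, and the sketch covers only UTF-8, UTF-16, and UTF-32.

```rust
/// Hypothetical stand-in for the crate's `Bom` enum, limited to the UTF encodings.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum DetectedBom {
    Absent,
    Utf8,
    Utf16Be,
    Utf16Le,
    Utf32Be,
    Utf32Le,
}

/// Compare the start of `bytes` against well-known BOM signatures.
pub fn detect_bom(bytes: &[u8]) -> DetectedBom {
    // The UTF-32 signatures are tested first: FF FE 00 00 (UTF-32LE)
    // also begins with FF FE, the UTF-16LE signature.
    if bytes.starts_with(&[0x00, 0x00, 0xFE, 0xFF]) {
        DetectedBom::Utf32Be
    } else if bytes.starts_with(&[0xFF, 0xFE, 0x00, 0x00]) {
        DetectedBom::Utf32Le
    } else if bytes.starts_with(&[0xEF, 0xBB, 0xBF]) {
        DetectedBom::Utf8
    } else if bytes.starts_with(&[0xFE, 0xFF]) {
        DetectedBom::Utf16Be
    } else if bytes.starts_with(&[0xFF, 0xFE]) {
        DetectedBom::Utf16Le
    } else {
        DetectedBom::Absent
    }
}
```

The order of the checks matters in this sketch: testing the two-byte UTF-16LE signature before the four-byte UTF-32LE one would misreport UTF-32LE input.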

What doesn't it do?

It won't check the rest of the data to determine whether it's actually valid according to the indicated encoding.
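To make the distinction concrete, the following standard-library sketch shows a buffer that carries a UTF-8 BOM but is not valid UTF-8. The helper names `has_utf8_bom` and `is_valid_utf8` are hypothetical and are not part of unicode-bom; validation here relies on `std::str::from_utf8`.

```rust
/// Returns true if `bytes` starts with the UTF-8 BOM (EF BB BF).
pub fn has_utf8_bom(bytes: &[u8]) -> bool {
    bytes.starts_with(&[0xEF, 0xBB, 0xBF])
}

/// Returns true only if the whole buffer is well-formed UTF-8.
pub fn is_valid_utf8(bytes: &[u8]) -> bool {
    std::str::from_utf8(bytes).is_ok()
}

/// A UTF-8 BOM followed by 0xFF, a byte that never occurs in valid UTF-8.
pub const BOM_THEN_GARBAGE: [u8; 4] = [0xEF, 0xBB, 0xBF, 0xFF];
```

For `BOM_THEN_GARBAGE`, `has_utf8_bom` returns `true` while `is_valid_utf8` returns `false`, so a detected BOM should be treated as a hint about the encoding rather than a guarantee.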

How do I install it?

Add it to your dependencies in Cargo.toml:

```toml
[dependencies]
unicode-bom = "1"
```

How do I use it?

For details, see the API docs. The gist is as follows:

```rust
use unicode_bom::Bom;

// The BOM can be parsed from a file on disk via the FromStr trait...
let bom: Bom = "foo.txt".parse().unwrap();
match bom {
    Bom::Null => {
        // No BOM was detected
    }
    Bom::Bocu1 => {
        // BOCU-1 BOM was detected
    }
    Bom::Gb18030 => {
        // GB 18030 BOM was detected
    }
    Bom::Scsu => {
        // SCSU BOM was detected
    }
    Bom::UtfEbcdic => {
        // UTF-EBCDIC BOM was detected
    }
    Bom::Utf1 => {
        // UTF-1 BOM was detected
    }
    Bom::Utf7 => {
        // UTF-7 BOM was detected
    }
    Bom::Utf8 => {
        // UTF-8 BOM was detected
    }
    Bom::Utf16Be => {
        // UTF-16 (big-endian) BOM was detected
    }
    Bom::Utf16Le => {
        // UTF-16 (little-endian) BOM was detected
    }
    Bom::Utf32Be => {
        // UTF-32 (big-endian) BOM was detected
    }
    Bom::Utf32Le => {
        // UTF-32 (little-endian) BOM was detected
    }
}

// ...or you can detect the BOM in a byte array
let bytes = [0u8, 0u8, 0xfeu8, 0xffu8];
assert_eq!(Bom::from(&bytes[0..]), Bom::Utf32Be);
```
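If you would rather read the leading bytes yourself and handle I/O errors explicitly instead of calling `unwrap()`, something along these lines may help. This is a standard-library sketch, not the crate's code: `read_head` is a hypothetical helper, and the four-byte limit is an assumption that covers the UTF-8, UTF-16, and UTF-32 signatures.

```rust
use std::fs::File;
use std::io::{self, Read};
use std::path::Path;

/// Read at most the first four bytes of the file at `path`.
/// Returns an `io::Error` if the file cannot be opened or read.
pub fn read_head<P: AsRef<Path>>(path: P) -> io::Result<Vec<u8>> {
    let file = File::open(path)?;
    let mut head = Vec::with_capacity(4);
    // `take(4)` caps the read, so a large file is never loaded in full.
    file.take(4).read_to_end(&mut head)?;
    Ok(head)
}
```

The returned bytes can then be passed to BOM detection as a slice, in the same way as the byte array in the example above.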

How do I set up the build environment?

If you don't already have Rust installed, get that first using rustup:

curl https://sh.rustup.rs -sSf | sh

Then you can build the project:

cargo b

And run the tests:

cargo t

Is there API documentation?

Yes.

Is there a change log?

Yes.

What license is it published under?

Apache-2.0.