Skip the optional encoding BOM at the start of an I/O stream if it exists.
The `SkipEncodingBom` data structure does not make any dynamic allocations and supports progressive stream reads.
As of now, only the UTF-8 BOM is supported.
```rust
use skip_bom::{BomType, SkipEncodingBom};
use std::io::{Cursor, Read};

// Read a stream after checking that it starts with the BOM
const BOM_BYTES: &'static [u8] = b"\xEF\xBB\xBFThis stream starts with a UTF-8 BOM.";
let mut reader = SkipEncodingBom::new(Cursor::new(BOM_BYTES));
assert_eq!(Some(BomType::UTF8), reader.read_bom().unwrap());
let mut string = Default::default();
let _ = reader.read_to_string(&mut string).unwrap();
assert_eq!("This stream starts with a UTF-8 BOM.", &string);

// Read a stream without a starting BOM
const NO_BOM_BYTES: &'static [u8] = b"This stream does not start with the UTF-8 BOM: \xEF\xBB\xBF.";
let mut reader = SkipEncodingBom::new(Cursor::new(NO_BOM_BYTES));
assert_eq!(None, reader.read_bom().unwrap());
let mut buf = Default::default();
let _ = reader.read_to_end(&mut buf).unwrap();
assert_eq!(b"This stream does not start with the UTF-8 BOM: \xEF\xBB\xBF.", buf.as_slice());

// Read a stream and disregard the starting BOM completely
let mut reader = SkipEncodingBom::new(Cursor::new(BOM_BYTES));
let mut buf = Default::default();
let _ = reader.read_to_end(&mut buf).unwrap();
assert_eq!(b"This stream starts with a UTF-8 BOM.", buf.as_slice());
// Check the BOM after the read is over.
assert_eq!(Some(Some(BomType::UTF8)), reader.bom_found());
```
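
For comparison, when the whole input is already in memory, skipping the BOM comes down to dropping a three-byte prefix (`EF BB BF`, the UTF-8 encoding of U+FEFF). The sketch below uses only the standard library; `strip_utf8_bom` is a hypothetical helper for illustration and is not part of this crate.

```rust
/// The UTF-8 encoding of U+FEFF, the byte order mark.
const UTF8_BOM: &[u8] = b"\xEF\xBB\xBF";

/// Hypothetical helper (not part of skip_bom): returns `bytes` without a
/// leading UTF-8 BOM, or `bytes` unchanged when it does not start with one.
fn strip_utf8_bom(bytes: &[u8]) -> &[u8] {
    bytes.strip_prefix(UTF8_BOM).unwrap_or(bytes)
}
```

Only a leading BOM is removed; the same three bytes later in the input are left alone, which matches the second case in the example above. A slice-based helper like this cannot deal with input that arrives in pieces, which is the case a stream wrapper has to handle.
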
This crate supports I/O streams that are incomplete at first and receive data later, even for the initial BOM. Example:
```rust
use skip_bom::{BomType, SkipEncodingBom};
use std::io::{Cursor, Read};

let mut reader = SkipEncodingBom::new(Cursor::new(b"\xEF\xBB".to_vec()));
let mut buf = Default::default();
let _ = reader.read_to_end(&mut buf).unwrap();
// The stream is incomplete: there are only the first two bytes of the BOM yet
assert_eq!(0, buf.len(), "{:?}", buf.as_slice());
assert_eq!(None, reader.bom_found());
// Add the next bytes and check that the UTF-8 BOM is accounted for
reader.get_mut().get_mut().extend_from_slice(b"\xBFThis stream has a BOM.");
let _ = reader.read_to_end(&mut buf).unwrap();
assert_eq!(b"This stream has a BOM.", buf.as_slice());
assert_eq!(Some(BomType::UTF8), reader.bom_found().unwrap());
```
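
One way to picture how a reader can behave like this without allocating is a small state machine that holds back at most three bytes in a fixed-size array until it knows whether they form the BOM. The following is a minimal sketch under that assumption, using only the standard library. It is not this crate's implementation, and `SketchBomSkipper`, `decision`, and `inner_mut` are hypothetical names.

```rust
use std::io::{self, Read};

const UTF8_BOM: [u8; 3] = [0xEF, 0xBB, 0xBF];

/// Hypothetical illustration type, not part of `skip_bom`.
pub struct SketchBomSkipper<R> {
    inner: R,
    /// First bytes of the stream, held back until the BOM decision is made.
    pending: [u8; 3],
    pending_len: usize,
    /// How many held-back bytes were already handed to the caller.
    replayed: usize,
    /// `None` while undecided, then `Some(true)` if a BOM was found.
    decision: Option<bool>,
}

impl<R: Read> SketchBomSkipper<R> {
    pub fn new(inner: R) -> Self {
        Self { inner, pending: [0; 3], pending_len: 0, replayed: 0, decision: None }
    }

    pub fn decision(&self) -> Option<bool> {
        self.decision
    }

    pub fn inner_mut(&mut self) -> &mut R {
        &mut self.inner
    }
}

impl<R: Read> Read for SketchBomSkipper<R> {
    fn read(&mut self, out: &mut [u8]) -> io::Result<usize> {
        // Phase 1: pull bytes into the fixed buffer until the decision is made.
        while self.decision.is_none() {
            let n = self.inner.read(&mut self.pending[self.pending_len..])?;
            if n == 0 {
                // No more data for now: report nothing, but keep the progress.
                return Ok(0);
            }
            self.pending_len += n;
            if !UTF8_BOM.starts_with(&self.pending[..self.pending_len]) {
                self.decision = Some(false);
            } else if self.pending_len == UTF8_BOM.len() {
                self.decision = Some(true);
            }
        }
        // Phase 2: without a BOM, the held-back bytes are real data; replay them.
        if self.decision == Some(false) && self.replayed < self.pending_len {
            let rest = &self.pending[self.replayed..self.pending_len];
            let n = rest.len().min(out.len());
            out[..n].copy_from_slice(&rest[..n]);
            self.replayed += n;
            return Ok(n);
        }
        // Phase 3: pass everything else through untouched.
        self.inner.read(out)
    }
}
```

Wrapping a `Cursor<Vec<u8>>` that initially holds only `\xEF\xBB` yields no bytes and leaves `decision()` at `None`. Once `\xBF` and more data are appended through `inner_mut().get_mut()`, the next read skips the completed BOM and returns only the payload. One limitation of this sketch is that a stream which ends for good after a partial BOM prefix never delivers those held-back bytes.
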
Module documentation with examples.
This project is licensed under either of
at your option.