Fast file type detection. Read more here: documentation

```rust
use std::fs::OpenOptions;
use std::io::BufReader;
use std::io::ErrorKind;

use bindet;
use bindet::types::FileType;
use bindet::FileTypeMatch;
use bindet::FileTypeMatches;

fn example() {
    // Open the file read-only and wrap it in a buffered reader.
    let file = OpenOptions::new().read(true).open("files/test.tar").unwrap();
    let buf = BufReader::new(file);

    // Map the I/O error to its kind so the result can be compared with `assert_eq!`.
    let detect = bindet::detect(buf).map_err(|e| e.kind());
    let expected: Result<Option<FileTypeMatches>, ErrorKind> = Ok(Some(FileTypeMatches::new(
        vec![FileType::Tar],
        vec![FileTypeMatch::new(FileType::Tar, true)],
    )));

    assert_eq!(detect, expected);
}
```

Some file types' magic numbers consist of human-readable characters. For example, FLAC uses `fLaC` (`0x66 0x4C 0x61 0x43`) and PDF uses `%PDF-` (`0x25 0x50 0x44 0x46 0x2D`). Because of this, a text file that starts with one of these sequences can be detected as a binary file.

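
To illustrate why such false positives occur, here is a minimal, std-only sketch of a prefix check against the two signatures quoted above. It is not how `bindet` is implemented; the names `starts_with_magic` and `prefix_matches` are hypothetical helpers introduced only for this example.

```rust
/// Magic numbers quoted in the text above.
const FLAC_MAGIC: &[u8] = &[0x66, 0x4C, 0x61, 0x43]; // "fLaC"
const PDF_MAGIC: &[u8] = &[0x25, 0x50, 0x44, 0x46, 0x2D]; // "%PDF-"

/// Returns true when `data` begins with the byte signature `magic`.
fn starts_with_magic(data: &[u8], magic: &[u8]) -> bool {
    data.starts_with(magic)
}

/// Hypothetical helper: names of the signatures that the start of `data` matches.
fn prefix_matches(data: &[u8]) -> Vec<&'static str> {
    let mut hits = Vec::new();
    if starts_with_magic(data, FLAC_MAGIC) {
        hits.push("FLAC");
    }
    if starts_with_magic(data, PDF_MAGIC) {
        hits.push("PDF");
    }
    hits
}
```

A plain-text buffer such as `b"%PDF- is how every PDF begins"` passes the PDF check, because a prefix comparison alone cannot tell it apart from a real PDF.
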
`bindet` reports those file types with `FileTypeMatch::full_match = false`. A second step can take these types and validate the prediction by applying a stricter specification match; at the moment, however, this only happens for `Zip` files.

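
One way a caller might act on this flag is to separate confirmed matches from those that still need validation. The sketch below uses a stand-in `Candidate` struct so that it compiles without the crate; it is an assumption made for illustration and not `bindet`'s own type, and `split_by_confidence` is likewise a hypothetical helper.

```rust
/// Stand-in for a detection result. Hypothetical: NOT bindet's own type.
#[derive(Debug, Clone, PartialEq)]
struct Candidate {
    file_type: &'static str,
    full_match: bool,
}

/// Splits candidates into (confirmed, needs_validation) based on `full_match`.
fn split_by_confidence(candidates: &[Candidate]) -> (Vec<Candidate>, Vec<Candidate>) {
    candidates.iter().cloned().partition(|c| c.full_match)
}
```

Candidates in the second group are the ones worth passing to a stricter, format-specific check before trusting the detected type.
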
You can use crates like `encoding_rs` to determine whether a file is really binary or text.
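
As a rough illustration of such a follow-up check, the sketch below uses only the standard library rather than `encoding_rs`: it treats a buffer as text when it is valid UTF-8 and contains no NUL bytes. The name `looks_like_text` is hypothetical, and the heuristic is deliberately simple; it would reject text in other encodings and buffers cut off in the middle of a multi-byte character.

```rust
/// Rough heuristic (an assumption for this example, not part of bindet):
/// a buffer is considered text if it has no NUL bytes and is valid UTF-8.
fn looks_like_text(data: &[u8]) -> bool {
    !data.contains(&0) && std::str::from_utf8(data).is_ok()
}
```

Applied to the first bytes of a file that was reported with `full_match = false`, a check like this can help decide whether the detection was a false positive caused by readable magic bytes.
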