bindet (binary file type detection)

Crates Pipeline MIT License

Fast file type detection. Read more here: documentation

Supported file types

Example:

```rust
use std::fs::OpenOptions;
use std::io::BufReader;
use std::io::ErrorKind;

use bindet;
use bindet::types::FileType;
use bindet::FileTypeMatch;
use bindet::FileTypeMatches;

fn example() {
    // Open the file read-only and wrap it in a buffered reader.
    let file = OpenOptions::new().read(true).open("files/test.tar").unwrap();
    let buf = BufReader::new(file);

    // Map the I/O error to its kind so the result can be compared with assert_eq!.
    let detect = bindet::detect(buf).map_err(|e| e.kind());
    let expected: Result<Option<FileTypeMatches>, ErrorKind> = Ok(Some(FileTypeMatches::new(
        vec![FileType::Tar],
        vec![FileTypeMatch::new(FileType::Tar, true)]
    )));

    assert_eq!(detect, expected);
}
```

False Positives

The magic numbers of some file types consist of human-readable characters. For example, FLAC uses fLaC (0x66 0x4C 0x61 0x43) and PDF uses %PDF- (0x25 0x50 0x44 0x46 0x2D). Because of this, a text file that starts with one of these sequences can be detected as a binary file.
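The effect can be reproduced with a few lines of standard-library Rust. The sketch below is not how bindet is implemented; `starts_with_magic` and `naive_detect` are hypothetical helpers written only to show why a prefix-only check cannot tell a real FLAC or PDF file from a text file that happens to begin with the same characters.

```rust
/// Magic numbers quoted above, written as byte strings.
const FLAC_MAGIC: &[u8] = b"fLaC";
const PDF_MAGIC: &[u8] = b"%PDF-";

/// Returns true when `data` begins with `magic`.
fn starts_with_magic(data: &[u8], magic: &[u8]) -> bool {
    data.starts_with(magic)
}

/// Hypothetical helper (not part of bindet): names the first magic number
/// that prefixes `data`, looking at nothing but the leading bytes.
fn naive_detect(data: &[u8]) -> Option<&'static str> {
    if starts_with_magic(data, FLAC_MAGIC) {
        Some("FLAC")
    } else if starts_with_magic(data, PDF_MAGIC) {
        Some("PDF")
    } else {
        None
    }
}
```

Calling `naive_detect(b"fLaC is not an audio file")` reports FLAC even though the input is plain text, which is exactly the kind of false positive described above.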

bindet reports those file types with FileTypeMatch::full_match = false. A second step can take these types and validate the prediction with a stricter, format-specific check. At the moment, however, this only happens for Zip files.

You can use crates like encoding_rs to determine whether a file is really binary or text.
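As a rough illustration of such a follow-up check, here is a minimal sketch that uses only the standard library instead of encoding_rs. It is a simpler, swapped-in heuristic: `looks_like_text` is a hypothetical helper that assumes text means valid UTF-8 without NUL bytes, so it will misjudge other encodings such as UTF-16.

```rust
/// Heuristic only (hypothetical helper, not part of bindet or encoding_rs):
/// treats `data` as text when it contains no NUL bytes and is valid UTF-8.
fn looks_like_text(data: &[u8]) -> bool {
    // NUL bytes are rare in text files but common in binary formats.
    if data.contains(&0) {
        return false;
    }
    match std::str::from_utf8(data) {
        Ok(_) => true,
        // `error_len()` is `None` when the input ended in the middle of a
        // character, which is expected if only a prefix of the file was read.
        Err(e) => e.error_len().is_none(),
    }
}
```

A caller could run this on the first few kilobytes of a file whose match was reported with full_match = false and discard the match when the function returns true.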