Crates.io | bindet |
lib.rs | bindet |
version | 0.3.2 |
source | src |
created_at | 2021-09-23 23:13:10.187699 |
updated_at | 2022-07-21 19:46:29.132497 |
description | Fast file type detection |
homepage | https://gitlab.com/Kores/bindet |
repository | https://gitlab.com/Kores/bindet |
max_upload_size | |
id | 455647 |
size | 11,368,373 |
Fast file type detection. Read more here: documentation
use std::fs::OpenOptions;
use std::io::{BufReader, ErrorKind};

use bindet::types::FileType;
use bindet::{FileTypeMatch, FileTypeMatches};

fn example() {
    // Open the sample archive read-only and wrap it in a buffered reader.
    let file = OpenOptions::new().read(true).open("files/test.tar").unwrap();
    let buf = BufReader::new(file);

    // std::io::Error does not implement PartialEq, so compare by ErrorKind instead.
    let detect = bindet::detect(buf).map_err(|e| e.kind());

    let expected: Result<Option<FileTypeMatches>, ErrorKind> = Ok(Some(FileTypeMatches::new(
        vec![FileType::Tar],
        vec![FileTypeMatch::new(FileType::Tar, true)],
    )));

    assert_eq!(detect, expected);
}
Some file types have magic numbers made up of human-readable characters. For example, FLAC uses fLaC (0x66 0x4C 0x61 0x43) and PDF uses %PDF- (0x25 0x50 0x44 0x46 0x2D). Because of this, a text file that starts with one of these sequences can be detected as a binary file.
bindet reports those file types with FileTypeMatch::full_match = false. A second step can take these types and validate the prediction by applying a stricter specification match; at the moment, however, this only happens for Zip files.
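To see why a signature made of printable characters is ambiguous, here is a minimal sketch of prefix-only matching on the FLAC and PDF signatures quoted above. It uses only the standard library; the helper guess_by_prefix is hypothetical and is not part of bindet, whose actual matching logic may differ.

```rust
// Signatures as quoted in the text above.
const FLAC_MAGIC: &[u8] = b"fLaC"; // 0x66 0x4C 0x61 0x43
const PDF_MAGIC: &[u8] = b"%PDF-"; // 0x25 0x50 0x44 0x46 0x2D

/// Hypothetical helper (not bindet's API): returns the name of the first
/// signature that `data` starts with, looking at the prefix only.
fn guess_by_prefix(data: &[u8]) -> Option<&'static str> {
    if data.starts_with(FLAC_MAGIC) {
        Some("FLAC")
    } else if data.starts_with(PDF_MAGIC) {
        Some("PDF")
    } else {
        None
    }
}
```

With this helper, the plain-text input b"fLaC is a lossless audio codec." is reported as FLAC, which is exactly the kind of prefix-only result that a stricter second validation step would need to reject.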
You can use crates like encoding_rs to determine whether a file is really binary or text.
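As a rough illustration of such a text check, the sketch below uses only the standard library rather than encoding_rs, so it is a stand-in and not that crate's API. It assumes, as a simplification, that "text" means valid UTF-8 with no control characters other than newline, carriage return, and tab; the function name looks_like_text is hypothetical.

```rust
/// Hypothetical heuristic (an assumption, not bindet or encoding_rs code):
/// treat a sample as text if it is valid UTF-8 and contains no control
/// characters other than '\n', '\r', and '\t'.
fn looks_like_text(sample: &[u8]) -> bool {
    match std::str::from_utf8(sample) {
        // Any other control character (for example NUL) suggests binary data.
        Ok(s) => !s
            .chars()
            .any(|c| c.is_control() && !matches!(c, '\n' | '\r' | '\t')),
        // Invalid UTF-8 is treated as binary under this simplification.
        Err(_) => false,
    }
}
```

A file whose first bytes spell fLaC but whose sample passes looks_like_text is more likely a text file than a real FLAC stream. The heuristic has clear limits: it rejects text in other encodings, and a sample cut in the middle of a multi-byte UTF-8 character is also rejected.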