bindet

Crates.io: bindet
lib.rs: bindet
version: 0.3.2
source: src
created_at: 2021-09-23 23:13:10.187699
updated_at: 2022-07-21 19:46:29.132497
description: Fast file type detection
homepage: https://gitlab.com/Kores/bindet
repository: https://gitlab.com/Kores/bindet
max_upload_size:
id: 455647
size: 11,368,373
Jonathan (JonathanxD)

documentation

https://docs.rs/bindet/

README

bindet (binary file type detection)

License: MIT

Fast file type detection. Read more in the documentation: https://docs.rs/bindet/

Supported file types

  • Zip
  • Rar (Rar 4 and 5)
  • Tar
  • Png
  • Jpg
  • 7-zip
  • Opus
  • Vorbis
  • Mp3
  • Webp
  • Flac
  • Matroska (mkv, mka, mks, mk3d, webm)
  • Wasm
  • Java Class
  • Mach-O
  • Elf (Executable and Linkable Format)
  • Wav
  • Avi
  • Aiff
  • Tiff
  • Sqlite3 (.db)
  • Ico
  • Dalvik
  • Pdf
  • Gif
  • Xcf
  • Scala Tasty
  • Bmp
  • more are on the way

Example:

use std::fs::File;
use std::io::{BufReader, ErrorKind};

use bindet::types::FileType;
use bindet::{FileTypeMatch, FileTypeMatches};

fn example() {
    // Open the sample archive read-only and wrap it in a buffered reader.
    let file = File::open("files/test.tar").unwrap();
    let buf = BufReader::new(file);

    // io::Error does not implement PartialEq, so compare on its ErrorKind instead.
    let detect = bindet::detect(buf).map_err(|e| e.kind());
    let expected: Result<Option<FileTypeMatches>, ErrorKind> = Ok(Some(FileTypeMatches::new(
        vec![FileType::Tar],
        vec![FileTypeMatch::new(FileType::Tar, true)],
    )));

    assert_eq!(detect, expected);
}

False Positives

The magic numbers of some file types consist of human-readable characters. For example, FLAC uses fLaC (0x66 0x4C 0x61 0x43) and PDF uses %PDF- (0x25 0x50 0x44 0x46 0x2D). Because of this, a text file that starts with one of these sequences can be detected as a binary file.
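
To make the problem concrete, here is a minimal sketch of a prefix-only magic number check. It uses only the Rust standard library and is not how bindet is implemented; the helper starts_with_magic and the two constants are hypothetical names introduced for illustration.

```rust
/// Magic numbers quoted in the paragraph above.
const FLAC_MAGIC: &[u8] = b"fLaC"; // 0x66 0x4C 0x61 0x43
const PDF_MAGIC: &[u8] = b"%PDF-"; // 0x25 0x50 0x44 0x46 0x2D

/// Hypothetical helper (not part of bindet): a naive check that only
/// compares the first bytes of `data` against `magic`.
fn starts_with_magic(data: &[u8], magic: &[u8]) -> bool {
    data.starts_with(magic)
}
```

With this check, a plain-text note whose first four characters happen to be fLaC passes as FLAC just as a real FLAC stream would, which is exactly the false positive described above.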

bindet reports those file types with FileTypeMatch::full_match = false. A second step can take these types and validate the prediction by matching against a stricter specification; at the moment, however, this only happens for Zip files.

You can use crates like encoding_rs to determine whether a file is really binary or text.
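
As a rough illustration of such a text-versus-binary check, the sketch below uses only the Rust standard library as a stand-in for encoding_rs. The function looks_like_text is a hypothetical helper, not part of bindet or encoding_rs, and it assumes that "text" means valid UTF-8 without control characters other than newline, carriage return, and tab.

```rust
/// Hypothetical helper (not part of bindet or encoding_rs): returns true when
/// `sample` is valid UTF-8 and contains no control characters other than
/// newline, carriage return, and tab.
fn looks_like_text(sample: &[u8]) -> bool {
    match std::str::from_utf8(sample) {
        // Valid UTF-8: reject only if an unexpected control character appears.
        Ok(text) => text
            .chars()
            .all(|c| !c.is_control() || matches!(c, '\n' | '\r' | '\t')),
        // Invalid UTF-8 is treated as binary under this sketch's assumption.
        Err(_) => false,
    }
}
```

A caller could run this on the first bytes of a file whose match was not a full match and treat a true result as a hint that the file is text. Note that a sample cut in the middle of a multi-byte character would be rejected, and text in other encodings is not recognized, which is where a crate like encoding_rs would be the better tool.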
