bytecmp

Crates.io	bytecmp
lib.rs	bytecmp
version	0.5.1
created_at	2024-03-19 16:23:25.002848+00
updated_at	2024-03-19 16:58:08.506449+00
description	bytecmp offers fast binary data comparison algorithms to enumerate common substrings, unique substrings or determine a patch set
homepage	https://github.com/yourlogarithm/bytecmp
repository	https://github.com/yourlogarithm/bytecmp
max_upload_size
id	1179423
size	47,000

Vlad-Sebastian Cretu (yourlogarithm)

documentation

https://docs.rs/bytecmp

README

bytecmp

bytecmp is a small crate offering data comparison mechanisms that go beyond simple equality. It operates only on byte slices, hence its name, and relies on efficiently finding common substrings between two blobs of data. The implementation offers two linear-time algorithms: HashMatch, which is based on a HashMap, and TreeMatch, which uses a suffix tree built with Ukkonen's algorithm.

bytecmp is a fork of haxelion/bcmp.
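
To make the HashMap-based idea more concrete, here is a minimal, self-contained sketch of one way such a matcher could work: index every window of `min_len` bytes in the first slice, then slide over the second slice and extend each hit forward. This is an illustration under assumptions, not bytecmp's actual implementation; the names `Found` and `hash_matches` are hypothetical, and the sketch makes no attempt to reach the linear running time mentioned above.

```rust
use std::collections::HashMap;

/// Hypothetical match record (not a bytecmp type):
/// `first[first_pos..first_pos + length] == second[second_pos..second_pos + length]`.
#[derive(Debug, PartialEq, Eq)]
struct Found {
    first_pos: usize,
    second_pos: usize,
    length: usize,
}

/// Illustrative hash-based matcher returning maximal common substrings
/// of at least `min_len` bytes.
fn hash_matches(first: &[u8], second: &[u8], min_len: usize) -> Vec<Found> {
    let mut found = Vec::new();
    if min_len == 0 || first.len() < min_len || second.len() < min_len {
        return found;
    }

    // Index every window of `min_len` bytes in `first` by its content.
    let mut index: HashMap<&[u8], Vec<usize>> = HashMap::new();
    for (pos, window) in first.windows(min_len).enumerate() {
        index.entry(window).or_default().push(pos);
    }

    // Slide over `second`, look up each window, and extend hits forward.
    for (j, window) in second.windows(min_len).enumerate() {
        if let Some(starts) = index.get(window) {
            for &i in starts {
                // Skip hits that only continue a match starting one byte earlier.
                if i > 0 && j > 0 && first[i - 1] == second[j - 1] {
                    continue;
                }
                let mut len = min_len;
                while i + len < first.len()
                    && j + len < second.len()
                    && first[i + len] == second[j + len]
                {
                    len += 1;
                }
                found.push(Found { first_pos: i, second_pos: j, length: len });
            }
        }
    }
    found
}
```

Calling `hash_matches(b"abcdefg", b"012abc34cdef56efg78abcdefg", 2)` reports the common runs "abc", "cdef", "efg" and "abcdefg", in the order they appear in the second slice.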

Examples

Iterate over the matches between two strings using HashMatch with a minimum match length of 2 bytes:

extern crate bytecmp;

use bytecmp::{AlgoSpec, MatchIterator};

fn main() {
    let a = "abcdefg";
    let b = "012abc34cdef56efg78abcdefg";
    let match_iter = MatchIterator::new(a.as_bytes(), b.as_bytes(), AlgoSpec::HashMatch(2));
    for m in match_iter {
        // Positions are byte offsets; slicing the &str is safe here because the input is ASCII.
        println!("Match: {}", &a[m.first_pos..m.first_end()]);
    }
}

Construct a patch set to build the file b from the file a using TreeMatch with a minimum match length of 4 bytes:

extern crate bytecmp;

use std::fs::File;
use std::io::Read;

use bytecmp::{AlgoSpec, patch_set};

fn main() {
    let mut a = Vec::<u8>::new();
    let mut b = Vec::<u8>::new();
    // read_to_end returns a Result that must be handled; unwrap it like File::open.
    File::open("a").unwrap().read_to_end(&mut a).unwrap();
    File::open("b").unwrap().read_to_end(&mut b).unwrap();

    let ps = patch_set(&a, &b, AlgoSpec::TreeMatch(4));
    for patch in ps {
        println!("b[0x{:x}..0x{:x}] == a[0x{:x}..0x{:x}]", patch.second_pos, patch.second_end(), patch.first_pos, patch.first_end());
    }
}
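
The ranges printed above say which parts of b can be copied from a. As a rough sketch of how such ranges could be turned into a rebuild script, the code below copies the covered ranges from the source and stores everything else as literal bytes. It assumes the patches are sorted by their position in the target and do not overlap there. The `Patch` struct, the `Op` enum and both functions are hypothetical names for this illustration, not part of bytecmp's API.

```rust
/// Hypothetical patch record (not a bytecmp type): `length` bytes at
/// `second_pos` in the target equal `length` bytes at `first_pos` in the source.
#[derive(Debug, Clone, Copy)]
struct Patch {
    first_pos: usize,
    second_pos: usize,
    length: usize,
}

/// One step of a rebuild script.
#[derive(Debug, PartialEq, Eq)]
enum Op {
    /// Copy `len` bytes starting at `pos` in the source.
    Copy { pos: usize, len: usize },
    /// Emit bytes that have no counterpart in the source.
    Literal(Vec<u8>),
}

/// Turn patches into a script. Assumes `patches` is sorted by `second_pos`
/// and the ranges do not overlap in the target.
fn build_script(target: &[u8], patches: &[Patch]) -> Vec<Op> {
    let mut script = Vec::new();
    let mut cursor = 0;
    for p in patches {
        // Bytes between the previous patch and this one must be stored literally.
        if p.second_pos > cursor {
            script.push(Op::Literal(target[cursor..p.second_pos].to_vec()));
        }
        script.push(Op::Copy { pos: p.first_pos, len: p.length });
        cursor = p.second_pos + p.length;
    }
    // Trailing bytes after the last patch.
    if cursor < target.len() {
        script.push(Op::Literal(target[cursor..].to_vec()));
    }
    script
}

/// Rebuild the target from the source and a script.
fn apply_script(source: &[u8], script: &[Op]) -> Vec<u8> {
    let mut out = Vec::new();
    for op in script {
        match op {
            Op::Copy { pos, len } => out.extend_from_slice(&source[*pos..*pos + *len]),
            Op::Literal(bytes) => out.extend_from_slice(bytes),
        }
    }
    out
}
```

In this form only the `Literal` bytes and the `Copy` offsets need to be stored to recreate the target from the source, which is the usual reason for computing a patch set in the first place.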

Commit count: 27
