strobemers

version: 0.1.1
created_at: 2025-07-02 18:43:31.134246+00
updated_at: 2025-11-15 16:31:30.846882+00
description: A toolkit for generating strobemers
repository: https://codeberg.org/mttmartin/strobemers
id: 1735460
size: 24,984
Matthew Martin (mttmartin)


Strobemers

A Rust crate for generating strobemers. Strobemers are a type of fuzzy seed, originally designed for bioinformatics use cases, that remain robust to substitutions and especially to insertions and deletions. For more information, see the paper: Kristoffer Sahlin, Effective sequence similarity detection with strobemers, Genome Res. November 2021 31: 2080-2094.

This crate aims to provide a toolkit for reproducing existing strobemer implementations while making it easy to swap out individual components (e.g. the hash function, window generator, or strobe selector). The StrobeHasher implements the hash function used when generating strobemers, the WindowGenerator creates the windows in which strobes are selected, and the StrobeSelector selects the strobe within each window.
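To make the division of labour concrete, here is a minimal, dependency-free sketch of how three such stages could compose into order-2 strobemers. The traits HashStage, WindowStage, and SelectStage, and the types Fnv1a, OffsetWindow, and MinHash, are hypothetical names invented for this illustration; they are not the crate's StrobeHasher, WindowGenerator, or StrobeSelector definitions, and the handling of windows at the end of the sequence is a simplifying assumption.

```rust
// Hypothetical stand-ins for illustration; NOT the crate's trait definitions.

/// Stage 1: produce one hash per k-mer start position.
trait HashStage {
    fn hash(&self, seq: &[u8], k: usize) -> Vec<u64>;
}

/// Stage 2: inclusive range of candidate positions for the next strobe.
trait WindowStage {
    fn window(&self, prev: usize, n_kmers: usize) -> Option<(usize, usize)>;
}

/// Stage 3: choose one position inside the window.
trait SelectStage {
    fn select(&self, hashes: &[u64], prev: usize, lo: usize, hi: usize) -> usize;
}

/// FNV-1a over each k-mer (chosen only because it needs no dependencies).
struct Fnv1a;
impl HashStage for Fnv1a {
    fn hash(&self, seq: &[u8], k: usize) -> Vec<u64> {
        if k == 0 {
            return Vec::new();
        }
        seq.windows(k)
            .map(|kmer| {
                kmer.iter().fold(0xcbf2_9ce4_8422_2325u64, |h, &b| {
                    (h ^ u64::from(b)).wrapping_mul(0x0000_0100_0000_01b3)
                })
            })
            .collect()
    }
}

/// Window at fixed offsets from the previous strobe, clipped to the sequence.
struct OffsetWindow {
    w_min: usize,
    w_max: usize,
}
impl WindowStage for OffsetWindow {
    fn window(&self, prev: usize, n_kmers: usize) -> Option<(usize, usize)> {
        let lo = prev + self.w_min;
        let hi = (prev + self.w_max).min(n_kmers.checked_sub(1)?);
        // Positions whose window falls off the end are skipped in this sketch.
        if lo <= hi { Some((lo, hi)) } else { None }
    }
}

/// Minstrobe-style choice: smallest hash in the window, ignoring `prev`.
struct MinHash;
impl SelectStage for MinHash {
    fn select(&self, hashes: &[u64], _prev: usize, lo: usize, hi: usize) -> usize {
        (lo..=hi).min_by_key(|&j| hashes[j]).expect("window is non-empty")
    }
}

/// Compose the stages into order-2 strobemers as (first, second) start positions.
fn order2(
    seq: &[u8],
    k: usize,
    hasher: &impl HashStage,
    windows: &impl WindowStage,
    selector: &impl SelectStage,
) -> Vec<(usize, usize)> {
    let hashes = hasher.hash(seq, k);
    (0..hashes.len())
        .filter_map(|i| {
            let (lo, hi) = windows.window(i, hashes.len())?;
            Some((i, selector.select(&hashes, i, lo, hi)))
        })
        .collect()
}
```

Calling order2(seq, 15, &Fnv1a, &OffsetWindow { w_min: 16, w_max: 30 }, &MinHash) yields one (first, second) pair per usable start position. Because each stage sits behind its own trait, replacing MinHash with a selector that also consults hashes[prev] changes the seed type without touching the hashing or window code.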

Currently, the only pre-made implementations are those intended to reproduce the strobemers generated by the original C++ implementation: RandstrobeSahlin2021 for randstrobes and MinstrobeSahlin2021 for minstrobes.
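The difference between the two seed types lies in how the next strobe is picked inside its window. A minstrobe takes the k-mer with the smallest hash in the window, independently of earlier strobes, whereas a randstrobe makes the choice depend on the previous strobe. The sketch below shows both rules over a precomputed hash vector. The function names are hypothetical, and the modular-sum link function (h_prev + h_j) mod p is an assumption chosen for illustration; the crate's RandstrobeSahlin2021 may use a different one.

```rust
/// Minstrobe-style rule: position of the smallest hash in [lo, hi].
/// Ties resolve to the leftmost position.
fn select_min(hashes: &[u64], lo: usize, hi: usize) -> usize {
    (lo..=hi)
        .min_by_key(|&j| hashes[j])
        .expect("window is non-empty")
}

/// Randstrobe-style rule: the winner depends on the previous strobe's hash.
/// The link function (h_prev + h_j) mod p is an illustrative assumption.
fn select_linked(hashes: &[u64], prev: usize, lo: usize, hi: usize, p: u64) -> usize {
    (lo..=hi)
        .min_by_key(|&j| hashes[prev].wrapping_add(hashes[j]) % p)
        .expect("window is non-empty")
}
```

With hashes [5, 90, 40, 10, 70, 20], window 2..=5, and p = 11, select_min always returns position 3, while select_linked returns position 2 when the previous strobe is at 0 and position 5 when it is at 1. The same window therefore yields different second strobes for different first strobes.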

Example using RandstrobeSahlin2021

use strobemers::StrobemerBuilder;
use strobemers::implementations::RandstrobeSahlin2021;
let reference = b"ACGCGTACGAATCACGCCGGGTGTGTGTGATCG";
let n: usize = 2;
let k: usize = 15;
let w_min: usize = 16;
let w_max: usize = 30;

let randstrobe_iter = StrobemerBuilder::from_implementation(RandstrobeSahlin2021)
    .reference(reference)
    .n(n)
    .k(k)
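    .w_min(w_min)
    // Assumption: a `w_max` setter mirroring `w_min`; the original example declares w_max but never passes it.
    .w_max(w_max)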
    .build()
    .unwrap();
for strobe in randstrobe_iter {
    println!("randstrobe start positions: {:?}", strobe);
}

Example starting with RandstrobeSahlin2021 and replacing the hash function

use strobemers::StrobemerBuilder;
use strobemers::implementations::{RandstrobeSahlin2021, StrobeHasher};
use wyhash::wyhash;

let reference = b"ACGCGTACGAATCACGCCGGGTGTGTGTGATCG";
let n: usize = 2;
let k: usize = 15;
let w_min: usize = 16;
let w_max: usize = 30;

struct WyHasher;
impl StrobeHasher for WyHasher {
    fn hash(&self, input: &[u8], k: usize) -> Vec<u64> {
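        // One hash per k-mer, including the final one (assumes k > 0).
        // windows() yields nothing when the input is shorter than k, which
        // avoids the underflow in `input.len() - k`.
        input.windows(k).map(|kmer| wyhash(kmer, 42)).collect()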
    }
}

let randstrobe_iter = StrobemerBuilder::from_implementation(RandstrobeSahlin2021)
    .reference(reference)
    .n(n)
    .k(k)
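    .w_min(w_min)
    // Assumption: a `w_max` setter mirroring `w_min`; the original example declares w_max but never passes it.
    .w_max(w_max)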
    .hasher(Box::new(WyHasher))
    .build()
    .unwrap();
for strobe in randstrobe_iter {
    println!("randstrobe start positions: {:?}", strobe);
}