Crates.io | minimizer-iter |
lib.rs | minimizer-iter |
version | 1.2.1 |
source | src |
created_at | 2024-06-11 08:36:59.441062 |
updated_at | 2024-06-26 14:24:36.067113 |
description | Iterate over minimizers of a DNA sequence |
homepage | https://crates.io/crates/minimizer-iter |
repository | https://github.com/rust-seq/minimizer-iter |
id | 1267964 |
size | 63,585 |
Iterate over minimizers of a DNA sequence.
If you'd like to use the underlying data structure manually, have a look at the minimizer-queue crate.
    use minimizer_iter::MinimizerBuilder;

    // Build an iterator over minimizers
    // of size 21 with a window of size 11
    // for the sequence "TGATTGCACAATC"
    let min_iter = MinimizerBuilder::<u64>::new()
        .minimizer_size(21)
        .width(11)
        .iter(b"TGATTGCACAATC");

    for (minimizer, position) in min_iter {
        // ...
    }
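For intuition about what such an iterator yields, here is a minimal, naive sketch of minimizer selection written against the standard library only. It is not the crate's implementation. The function name `naive_minimizers` is hypothetical, and it assumes that k-mers are ordered lexicographically, that ties go to the leftmost k-mer, and that the width counts the consecutive k-mers in a window; the crate's ordering and window convention may differ.

```rust
/// Naive minimizer selection, for illustration only.
/// Assumptions (not taken from the crate): k-mers are compared
/// lexicographically, ties go to the leftmost k-mer, and `w` counts
/// the consecutive k-mers in each window.
fn naive_minimizers(seq: &[u8], k: usize, w: usize) -> Vec<(&[u8], usize)> {
    let mut out: Vec<(&[u8], usize)> = Vec::new();
    if k == 0 || w == 0 || seq.len() < k + w - 1 {
        return out; // the sequence cannot hold a single full window
    }
    let num_kmers = seq.len() - k + 1;
    for start in 0..=(num_kmers - w) {
        // Find the position of the smallest k-mer in this window.
        let mut pos = start;
        for i in (start + 1)..(start + w) {
            if &seq[i..i + k] < &seq[pos..pos + k] {
                pos = i;
            }
        }
        // Consecutive windows often share a minimizer: report it once.
        if out.last().map(|&(_, p)| p) != Some(pos) {
            out.push((&seq[pos..pos + k], pos));
        }
    }
    out
}
```

With these assumptions, `naive_minimizers(b"TGATTGCACAATC", 3, 4)` selects ATT at position 2, CAC at 6, ACA at 7 and AAT at 9. A window of 11 k-mers of size 21 spans 31 bases, so this sketch returns nothing for a 13-base sequence.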
If you'd like to use mod-minimizers instead, just change new() to new_mod() (note the extra inferred type parameter in MinimizerBuilder::<u64, _>):
    use minimizer_iter::MinimizerBuilder;

    // Build an iterator over mod-minimizers
    // of size 21 with a window of size 11
    // for the sequence "TGATTGCACAATC"
    let min_iter = MinimizerBuilder::<u64, _>::new_mod()
        .minimizer_size(21)
        .width(11)
        .iter(b"TGATTGCACAATC");

    for (minimizer, position) in min_iter {
        // ...
    }
Additionally, the iterator can produce canonical minimizers so that a sequence and its reverse complement will select the same minimizers.
To do so, just add .canonical() to the builder:
    MinimizerBuilder::<u64>::new()
        .canonical()
        .minimizer_size(...)
        .width(...)
        .iter(...)
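To illustrate why canonical selection is strand-independent, the sketch below computes a reverse complement and a canonical k-mer with the standard library only. The helper names are hypothetical, and the convention used here (keep the lexicographically smaller of a k-mer and its reverse complement) is a common one but an assumption; the crate may define canonical minimizers differently.

```rust
/// Reverse complement of an ASCII DNA sequence (A<->T, C<->G).
/// Bytes other than A, C, G, T are left unchanged.
fn reverse_complement(seq: &[u8]) -> Vec<u8> {
    seq.iter()
        .rev()
        .map(|&b| match b {
            b'A' => b'T',
            b'T' => b'A',
            b'C' => b'G',
            b'G' => b'C',
            other => other,
        })
        .collect()
}

/// Canonical form of a k-mer under one common convention (an assumption
/// here): the lexicographically smaller of the k-mer and its reverse
/// complement.
fn canonical(kmer: &[u8]) -> Vec<u8> {
    let rc = reverse_complement(kmer);
    if rc.as_slice() < kmer {
        rc
    } else {
        kmer.to_vec()
    }
}
```

For example, TGA and its reverse complement TCA both map to TCA, so either strand contributes the same candidate k-mer.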
If you need longer minimizers (> 32 bases), you can specify a bigger integer type such as u128:
    MinimizerBuilder::<u128>::new()
        .minimizer_size(...)
        .width(...)
        .iter(...)
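The 32-base limit is consistent with storing each base in 2 bits, which fits 32 bases in a u64 and 64 in a u128. The sketch below shows that arithmetic with the standard library only. The function names and the code values (A=0, C=1, G=2, T=3) are assumptions for illustration, not the crate's documented encoding.

```rust
/// Maximum number of bases that fit in an integer of `bits` bits
/// when each base takes 2 bits.
fn max_bases(bits: u32) -> u32 {
    bits / 2
}

/// Pack an ASCII k-mer into a u128 using 2 bits per base.
/// The code values (A=0, C=1, G=2, T=3) are an assumption.
/// Returns None for more than 64 bases or for a non-ACGT byte.
fn pack_kmer(kmer: &[u8]) -> Option<u128> {
    if kmer.len() > max_bases(u128::BITS) as usize {
        return None;
    }
    let mut packed: u128 = 0;
    for &b in kmer {
        let code: u128 = match b {
            b'A' => 0,
            b'C' => 1,
            b'G' => 2,
            b'T' => 3,
            _ => return None,
        };
        // Shift the previous bases left and append the new one.
        packed = (packed << 2) | code;
    }
    Some(packed)
}
```

Under this encoding, a k-mer of 32 T bases packs to exactly u64::MAX, and a 33rd base no longer fits in 64 bits.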
See the documentation for more details.
To run benchmarks against other implementations of minimizers, clone this repository and run:
    cargo bench