| Crates.io | minimizer-iter |
| lib.rs | minimizer-iter |
| version | 1.2.1 |
| created_at | 2024-06-11 08:36:59.441062+00 |
| updated_at | 2024-06-26 14:24:36.067113+00 |
| description | Iterate over minimizers of a DNA sequence |
| homepage | https://crates.io/crates/minimizer-iter |
| repository | https://github.com/rust-seq/minimizer-iter |
| id | 1267964 |
| size | 63,585 |
Iterate over minimizers of a DNA sequence.
If you'd like to use the underlying data structure manually, have a look at the minimizer-queue crate.

    use minimizer_iter::MinimizerBuilder;

    // Build an iterator over minimizers
    // of size 21 with a window of size 11
    // for the sequence "TGATTGCACAATC"
    let min_iter = MinimizerBuilder::<u64>::new()
        .minimizer_size(21)
        .width(11)
        .iter(b"TGATTGCACAATC");

    for (minimizer, position) in min_iter {
        // ...
    }

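To make the terms "minimizer size" and "window" concrete, here is a minimal reference sketch that does not use the crate at all. It assumes a plain lexicographic order on k-mers and a leftmost tie-break; the crate's own ordering and output type may differ, and `naive_minimizers` is a hypothetical helper name used only for illustration.

```rust
/// Naive O(n * w) reference sketch (NOT the crate's implementation).
/// Assumption: k-mers are compared lexicographically and ties go to the
/// leftmost k-mer. For every window of `w` consecutive k-mers of size `k`,
/// the smallest k-mer is selected; a minimizer shared by consecutive
/// windows is reported once.
fn naive_minimizers(seq: &[u8], k: usize, w: usize) -> Vec<(&[u8], usize)> {
    let mut out: Vec<(&[u8], usize)> = Vec::new();
    // A full window spans w + k - 1 bases; shorter inputs yield nothing.
    if k == 0 || w == 0 || seq.len() < k + w - 1 {
        return out;
    }
    let num_kmers = seq.len() - k + 1;
    for start in 0..=(num_kmers - w) {
        // Find the position of the smallest k-mer in this window.
        let mut best = start;
        for pos in start + 1..start + w {
            if seq[pos..pos + k] < seq[best..best + k] {
                best = pos;
            }
        }
        // Skip the entry if the previous window selected the same position.
        if out.last().map(|&(_, p)| p) != Some(best) {
            out.push((&seq[best..best + k], best));
        }
    }
    out
}
```

With `k = 3` and `w = 4`, calling `naive_minimizers(b"TGATTGCACAATC", 3, 4)` selects `ATT` at position 2, `CAC` at 6, `ACA` at 7 and `AAT` at 9. The sketch returns an empty list whenever the sequence is shorter than `w + k - 1` bases, which is the case for this 13-base sequence with `k = 21` and `w = 11`.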
If you'd like to use mod-minimizers instead, change new() to new_mod() and add a placeholder for the second type parameter (<u64, _>):

    use minimizer_iter::MinimizerBuilder;

    // Build an iterator over mod-minimizers
    // of size 21 with a window of size 11
    // for the sequence "TGATTGCACAATC"
    let min_iter = MinimizerBuilder::<u64, _>::new_mod()
        .minimizer_size(21)
        .width(11)
        .iter(b"TGATTGCACAATC");

    for (minimizer, position) in min_iter {
        // ...
    }

Additionally, the iterator can produce canonical minimizers so that a sequence and its reverse complement will select the same minimizers.
To do so, just add .canonical() to the builder:

    MinimizerBuilder::<u64>::new()
        .canonical()
        .minimizer_size(...)
        .width(...)
        .iter(...)

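As a rough illustration of what "canonical" means, the sketch below uses one common definition: the canonical form of a k-mer is the lexicographically smaller of the k-mer and its reverse complement. Whether the crate uses exactly this rule is an assumption, and both function names are hypothetical helpers rather than part of its API.

```rust
/// Reverse complement of a DNA sequence (uppercase A/C/G/T).
/// Any other byte is passed through unchanged.
fn reverse_complement(seq: &[u8]) -> Vec<u8> {
    seq.iter()
        .rev()
        .map(|&base| match base {
            b'A' => b'T',
            b'C' => b'G',
            b'G' => b'C',
            b'T' => b'A',
            other => other,
        })
        .collect()
}

/// Assumed definition for illustration: the canonical k-mer is the
/// lexicographically smaller of the k-mer and its reverse complement.
fn canonical(kmer: &[u8]) -> Vec<u8> {
    let rc = reverse_complement(kmer);
    if rc.as_slice() < kmer {
        rc
    } else {
        kmer.to_vec()
    }
}
```

Under this definition `canonical(b"TGA")` and `canonical(b"TCA")` both return `TCA`, which is why reading a sequence from either strand leads to the same selected k-mers.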
If you need longer minimizers (> 32 bases), you can specify a bigger integer type such as u128:

    MinimizerBuilder::<u128>::new()
        .minimizer_size(...)
        .width(...)
        .iter(...)

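The 32-base limit for u64 follows from packing each base into 2 bits, so an integer with n bits holds at most n / 2 bases. The sketch below illustrates this with an assumed encoding (A = 0, C = 1, G = 2, T = 3); the crate's actual encoding may differ, and `encode_kmer` and `max_bases` are hypothetical names.

```rust
/// Maximum number of bases that fit in an integer of `bits` bits
/// when each base takes 2 bits.
fn max_bases(bits: u32) -> u32 {
    bits / 2
}

/// Pack a k-mer into a u128, 2 bits per base.
/// Assumed encoding for illustration: A = 0, C = 1, G = 2, T = 3.
/// Returns None for non-ACGT bytes or for k-mers longer than 64 bases.
fn encode_kmer(kmer: &[u8]) -> Option<u128> {
    if kmer.len() > max_bases(u128::BITS) as usize {
        return None;
    }
    let mut code: u128 = 0;
    for &base in kmer {
        let bits = match base {
            b'A' | b'a' => 0,
            b'C' | b'c' => 1,
            b'G' | b'g' => 2,
            b'T' | b't' => 3,
            _ => return None,
        };
        // Shift previous bases left and append the new one.
        code = (code << 2) | bits;
    }
    Some(code)
}
```

Here `max_bases(u64::BITS)` is 32 and `max_bases(u128::BITS)` is 64, so under a 2-bit encoding u128 covers minimizers of up to 64 bases.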
See the documentation for more details.
To run benchmarks against other implementations of minimizers, clone this repository and run:

    cargo bench