Crates.io | minimap2-temp |
lib.rs | minimap2-temp |
version | 0.1.32+minimap2.2.28 |
source | src |
created_at | 2024-08-26 18:03:55.62985 |
updated_at | 2024-11-13 18:07:29.046586 |
description | Bindings to libminimap2 |
homepage | |
repository | https://github.com/rob-p/minimap2-rs |
max_upload_size | |
id | 1352483 |
size | 166,598 |
A Rust FFI library for minimap2. In development! Feedback appreciated!
minimap2-sys provides the raw FFI bindings to minimap2; the minimap2 crate is the more idiomatic Rust wrapper around them.
minimap2 = "0.1.20+minimap2.2.28"
Also see Features
Tested with rustc 1.64.0 and nightly, so it is probably a good idea to upgrade before building. If you run into pain points with older versions, let me know and I will try to fix them!
minimap2-rs | minimap2 |
---|---|
0.1.20 | 2.28 |
0.1.19 | 2.28 |
0.1.18 | 2.28 |
0.1.17 | 2.27 |
0.1.16 | 2.26 |
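The version strings used here appear to follow a `<crate version>+minimap2.<minimap2 version>` pattern (for example `0.1.20+minimap2.2.28`). As a minimal sketch, assuming that pattern holds, the two parts can be split apart with the standard library alone. The `split_version` helper is hypothetical and not part of the crate:

```rust
/// Split a version string such as "0.1.20+minimap2.2.28" into
/// (crate version, bundled minimap2 version).
/// Returns None if the "+minimap2." build metadata is missing.
fn split_version(version: &str) -> Option<(&str, &str)> {
    let (crate_version, build) = version.split_once('+')?;
    let minimap2_version = build.strip_prefix("minimap2.")?;
    Some((crate_version, minimap2_version))
}

fn main() {
    match split_version("0.1.20+minimap2.2.28") {
        Some((krate, mm2)) => println!("minimap2-rs {} bundles minimap2 {}", krate, mm2),
        None => println!("no minimap2 build metadata found"),
    }
}
```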
Create an Aligner
let mut aligner = Aligner::builder()
.map_ont()
.with_threads(8)
.with_cigar()
.with_index("ReferenceFile.fasta", None)
.expect("Unable to build index");
Align a sequence:
let seq: Vec<u8> = b"ACTGACTCACATCGACTACGACTACTAGACACTAGACTATCGACTACTGACATCGA".to_vec();
let alignment = aligner
.map(&seq, false, false, None, None)
.expect("Unable to align");
All minimap2 presets should be available (see functions section):
let aligner = map_ont();
let aligner = asm20();
MapOpt and IdxOpt can be customized with Rust's struct update syntax, on top of a preset's mapping settings. Inspired by bevy.
Aligner {
mapopt: MapOpt {
seed: 42,
best_n: 1,
..Default::default()
},
idxopt: IdxOpt {
k: 21,
..Default::default()
},
..map_ont()
}
There is a binary called "fakeminimap2" that I am using to test for memory leaks. You can follow its source code for an example. It also shows some helper functions for identifying compression types and FASTA vs FASTQ files. I used my own parsers as they are well fuzzed, but I am open to removing them or putting them behind a feature flag.
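For readers who want a feel for what such helpers do, here is a minimal sketch that guesses the compression type from well-known magic bytes and FASTA vs FASTQ from the first record marker. This is not the code from fakeminimap2; the `Compression` and `SeqFormat` enums and both functions are hypothetical names made up for illustration:

```rust
/// Compression formats recognised by this sketch (hypothetical, not the crate's type).
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum Compression {
    Gzip,
    Bzip2,
    Xz,
    Zstd,
    None,
}

/// Sequence file formats recognised by this sketch (hypothetical, not the crate's type).
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum SeqFormat {
    Fasta,
    Fastq,
    Unknown,
}

/// Guess the compression type from the leading magic bytes of a file.
fn detect_compression(head: &[u8]) -> Compression {
    if head.starts_with(&[0x1f, 0x8b]) {
        Compression::Gzip
    } else if head.starts_with(b"BZh") {
        Compression::Bzip2
    } else if head.starts_with(&[0xfd, b'7', b'z', b'X', b'Z', 0x00]) {
        Compression::Xz
    } else if head.starts_with(&[0x28, 0xb5, 0x2f, 0xfd]) {
        Compression::Zstd
    } else {
        Compression::None
    }
}

/// Guess FASTA vs FASTQ from the first non-whitespace byte of (decompressed) data.
fn detect_format(head: &[u8]) -> SeqFormat {
    match head.iter().copied().find(|b| !b.is_ascii_whitespace()) {
        Some(b'>') => SeqFormat::Fasta,
        Some(b'@') => SeqFormat::Fastq,
        _ => SeqFormat::Unknown,
    }
}

fn main() {
    println!("{:?}", detect_compression(&[0x1f, 0x8b, 0x08]));
    println!("{:?}", detect_format(b">chr1\nACGT\n"));
}
```

In practice the format check has to run on decompressed bytes, so the compression check comes first.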
Alignment functions return a Mapping struct. The Alignment struct is only returned when the Aligner is created using .with_cigar().
A very simple example would be:
use std::io::BufReader;
let file = std::fs::File::open(query_file).expect("Unable to open query file");
let mut reader = BufReader::new(file);
let fasta = Fasta::from_buffer(&mut reader);
for seq in fasta {
let seq = seq.unwrap();
let alignment: Vec<Mapping> = aligner
.map(&seq.sequence.unwrap(), false, false, None, None)
.expect("Unable to align");
println!("{:?}", alignment);
}
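The loop above streams records one at a time instead of loading the whole file. To make that idea concrete without depending on any particular parser, here is a minimal std-only sketch of a lazy FASTA iterator. `Record` and `FastaRecords` are hypothetical types written for this example, not the parser used by the crate:

```rust
use std::io::{BufRead, Cursor};

/// One FASTA record (hypothetical type for this sketch).
#[derive(Debug, PartialEq, Eq)]
struct Record {
    id: String,
    sequence: Vec<u8>,
}

/// Lazily yields records from any buffered reader, one record in memory at a time.
struct FastaRecords<R: BufRead> {
    lines: std::io::Lines<R>,
    pending_header: Option<String>,
}

impl<R: BufRead> FastaRecords<R> {
    fn new(reader: R) -> Self {
        FastaRecords { lines: reader.lines(), pending_header: None }
    }
}

impl<R: BufRead> Iterator for FastaRecords<R> {
    type Item = std::io::Result<Record>;

    fn next(&mut self) -> Option<Self::Item> {
        // Start from the header saved by the previous call, if any.
        let mut id = self.pending_header.take();
        let mut sequence = Vec::new();
        loop {
            match self.lines.next() {
                Some(Err(e)) => return Some(Err(e)),
                Some(Ok(line)) => {
                    let line = line.trim_end();
                    if let Some(header) = line.strip_prefix('>') {
                        if id.is_some() {
                            // The next record starts here: save its header, emit the current one.
                            self.pending_header = Some(header.to_string());
                            break;
                        }
                        id = Some(header.to_string());
                    } else if id.is_some() {
                        // Sequence lines may be wrapped, so append each one.
                        sequence.extend_from_slice(line.as_bytes());
                    }
                }
                None => break,
            }
        }
        id.map(|id| Ok(Record { id, sequence }))
    }
}

fn main() {
    let data = Cursor::new(">read1 first\nACGT\nTTGA\n>read2\nGGCC\n");
    for record in FastaRecords::new(data) {
        let record = record.expect("Unable to read record");
        // With the real crate, this is where a call to `aligner.map(...)` would go.
        println!("{}\t{} bp", record.id, record.sequence.len());
    }
}
```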
There is a map_file function that works on an entire file, but it is not lazy (all mappings are collected in memory) and thus not suitable for large files. It may be removed in the future or moved to a separate lib.
let mappings: Result<Vec<Mapping>> = aligner.map_file("query.fa", false, false);
Multithreading is supported; for an implementation example, see fakeminimap2. Minimap2 also supports threading itself, and will use a minimum of 3 cores for building the index. Multithreading for mapping is left to the end-user.
let mut aligner = Aligner::builder()
.map_ont()
.with_index_threads(8);
Mapping in parallel with Rayon appears to work:
use rayon::prelude::*;
let results = sequences.par_iter().map(|seq| {
aligner.map(seq.as_bytes(), false, false, None, None).unwrap()
}).collect::<Vec<_>>();
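Since threading for mapping is left to the end-user, here is a rough sketch of the same fan-out using only standard-library scoped threads. The `map_one` function is a hypothetical stand-in for the real `aligner.map(...)` call (it only counts G/C bases), and the sketch assumes, as the Rayon snippet does, that the aligner can be shared by reference across threads:

```rust
use std::thread;

/// Hypothetical stand-in for `aligner.map(...)`: here it only counts G/C bases.
fn map_one(seq: &[u8]) -> usize {
    seq.iter().filter(|&&b| b == b'G' || b == b'C').count()
}

/// Split `sequences` into roughly equal chunks and process each chunk on its own thread.
/// Results come back in the same order as the input.
fn map_parallel(sequences: &[Vec<u8>], n_threads: usize) -> Vec<usize> {
    let n_threads = n_threads.max(1);
    // Ceiling division so every sequence lands in a chunk; at least 1 so `chunks` is valid.
    let chunk_size = ((sequences.len() + n_threads - 1) / n_threads).max(1);
    thread::scope(|scope| {
        // Spawn all workers first, then join them in order.
        let handles: Vec<_> = sequences
            .chunks(chunk_size)
            .map(|chunk| scope.spawn(move || chunk.iter().map(|s| map_one(s)).collect::<Vec<_>>()))
            .collect();
        handles
            .into_iter()
            .flat_map(|h| h.join().expect("worker thread panicked"))
            .collect()
    })
}

fn main() {
    let sequences = vec![b"ACGT".to_vec(), b"GGCC".to_vec(), b"ATAT".to_vec()];
    let results = map_parallel(&sequences, 2);
    println!("{:?}", results);
}
```

Chunking keeps the number of threads bounded and preserves input order, at the cost of uneven load when read lengths vary a lot. A work-stealing pool such as Rayon handles that case better.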
The following crate features are available:
- mm2-fast - Replace minimap2 with mm2-fast. This is likely not portable.
- htslib - Support output of bam/sam files using htslib.
- simde - Compile minimap2 / mm2-fast with simd-everywhere support.
- map-file - Default - Convenience function for mapping an entire file. Caution, this is single-threaded.
- sse2only - Compiles for SSE2 support only (Default is to try to compile for SSE4.1, SSE2 only is default on aarch64)

Map-file is a default feature and enabled unless otherwise specified.
Potentially more, but I'm using this to keep track. I'd expect those to get implemented over time, but if you have an urgent need, open a pull request or an issue! Thanks!
Follow these instructions.
In brief, using bash shell:
docker pull messense/rust-musl-cross:x86_64-musl
alias rust-musl-builder='docker run --rm -it -v "$(pwd)":/home/rust/src messense/rust-musl-cross:x86_64-musl'
rust-musl-builder cargo build --release
Please note minimap2 is only tested for x86_64. Other platforms may work; please open an issue if minimap2 compiles but minimap2-rs does not.
- mm2-fast - Fail
- htslib - Success
- simde - Success

You should cite the minimap2 papers if you use this in your work.
Li, H. (2018). Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics, 34:3094-3100. doi:10.1093/bioinformatics/bty191
and/or:
Li, H. (2021). New strategies to improve minimap2 alignment accuracy. Bioinformatics, 37:4572-4574. doi:10.1093/bioinformatics/btab705
&[Vec<u8>]
instead of &Vec<Vec<u8>>
@jguhlin