Crates.io | faiquery |
lib.rs | faiquery |
version | 0.1.3 |
source | src |
created_at | 2023-08-22 18:50:06.195504 |
updated_at | 2023-08-23 18:39:27.415406 |
description | Queryable indexed fasta using a mmapped file |
homepage | |
repository | https://github.com/noamteyssier/faiquery |
max_upload_size | |
id | 951253 |
size | 30,852 |
perform interval queries on an indexed fasta file
This is a simple utility library for performing interval queries on an indexed fasta file.
The index is assumed to be in samtools faidx format.
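For orientation, a faidx index for a fasta file is a tab-separated text file with one line per sequence: name, length in bases, byte offset of the first base, bases per line, and bytes per line (including the line terminator). The sketch below parses one such line using only the standard library. `FaiRecord` and `parse_fai_line` are hypothetical names chosen for illustration and are not part of the faiquery API.

```rust
/// One record of a samtools faidx index for a fasta file.
/// Hypothetical illustration type, not faiquery's own.
#[derive(Debug, PartialEq)]
struct FaiRecord {
    /// Sequence name (e.g. "chr1").
    name: String,
    /// Total number of bases in the sequence.
    length: u64,
    /// Byte offset of the first base within the fasta file.
    offset: u64,
    /// Number of bases on each full line.
    line_bases: u64,
    /// Number of bytes on each full line, including the line terminator.
    line_width: u64,
}

/// Parses a single tab-separated index line.
/// Returns `None` if a field is missing or is not a valid number.
fn parse_fai_line(line: &str) -> Option<FaiRecord> {
    let mut fields = line.trim_end().split('\t');
    let name = fields.next()?.to_string();
    let length = fields.next()?.parse().ok()?;
    let offset = fields.next()?.parse().ok()?;
    let line_bases = fields.next()?.parse().ok()?;
    let line_width = fields.next()?.parse().ok()?;
    Some(FaiRecord { name, length, offset, line_bases, line_width })
}
```

For example, `parse_fai_line("chr1\t100\t6\t50\t51")` describes a 100-base sequence whose first base sits at byte 6 and whose lines hold 50 bases followed by a single newline.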
The fasta file is memory-mapped, and the indexed fasta reader keeps a single mutable buffer. Each query copies the requested bases into this buffer and returns a slice reference to it, which allows the reader to return sequences that are non-contiguous in the file (i.e. a query spanning several lines is returned without the newlines).
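The buffer-based approach can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not faiquery's actual implementation: `query_into` is a hypothetical function, `data` is a plain byte slice standing in for the memory-mapped file, and `offset`, `line_bases`, and `line_width` are the values a faidx index stores for the sequence. No bounds checking is done, and `line_bases` is assumed to be non-zero.

```rust
/// Copies bases `[start, end)` of one sequence into `buf`, skipping line
/// terminators, and returns a slice of the reused buffer.
/// Hypothetical sketch: `data` stands in for the memory-mapped fasta file.
fn query_into<'a>(
    data: &[u8],
    buf: &'a mut Vec<u8>,
    offset: usize,     // byte offset of the first base of the sequence
    line_bases: usize, // bases per line (assumed non-zero)
    line_width: usize, // bytes per line, including the line terminator
    start: usize,
    end: usize,
) -> &'a [u8] {
    // Reuse the allocation from previous queries.
    buf.clear();
    let mut pos = start;
    while pos < end {
        let line = pos / line_bases;
        let col = pos % line_bases;
        // Take the rest of the current line, or stop at `end` if that comes first.
        let take = (line_bases - col).min(end - pos);
        let byte = offset + line * line_width + col;
        buf.extend_from_slice(&data[byte..byte + take]);
        pos += take;
    }
    buf.as_slice()
}
```

Because the returned slice borrows the buffer mutably handed in, the borrow checker prevents a second query until the first result is no longer in use, which is the same constraint a reader with one internal buffer places on its callers.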
Note that because a mutable buffer is kept, this is not the best approach for concurrent operations.
However, for single-threaded applications, this performs very well with low memory overhead and runtime.
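To illustrate why a single mutable buffer fits concurrent use poorly, here is a small sketch with a hypothetical stand-in type (`BufferedReader` is not faiquery's `IndexedFasta`). Since each query overwrites the buffer, the query method needs `&mut self`, so threads sharing one reader have to take turns behind a lock and copy their result out before releasing it.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical stand-in for a reader that owns one reusable buffer.
struct BufferedReader {
    data: Vec<u8>,
    buf: Vec<u8>,
}

impl BufferedReader {
    /// Requires `&mut self` because every call overwrites the shared buffer.
    fn query(&mut self, start: usize, end: usize) -> &[u8] {
        self.buf.clear();
        self.buf.extend_from_slice(&self.data[start..end]);
        &self.buf
    }
}

/// Runs one query per thread. The mutex serializes access to the buffer,
/// so the queries do not actually execute in parallel.
fn query_from_threads(reader: BufferedReader, ranges: Vec<(usize, usize)>) -> Vec<Vec<u8>> {
    let shared = Arc::new(Mutex::new(reader));
    let handles: Vec<_> = ranges
        .into_iter()
        .map(|(start, end)| {
            let shared = Arc::clone(&shared);
            thread::spawn(move || {
                let mut guard = shared.lock().unwrap();
                // Copy the result out before the lock is released,
                // since the next query will overwrite the buffer.
                guard.query(start, end).to_vec()
            })
        })
        .collect();
    // Results come back in the order the ranges were given.
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}
```

The lock and the extra copy are the overhead the note above refers to; a reader designed for concurrency would avoid the shared buffer altogether.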
use anyhow::Result;
use faiquery::{FastaIndex, IndexedFasta};

fn main() -> Result<()> {
    let index = FastaIndex::from_filepath("example_data/example.fa.fai")?;
    let mut faidx = IndexedFasta::new(index, "example_data/example.fa")?;

    // Query the first 10 bases of chr1
    let seq = faidx.query("chr1", 0, 10)?;
    assert_eq!(seq, b"ACCTACGATC");

    // Query the first 10 bases of chr2
    let seq = faidx.query("chr2", 0, 10)?;
    assert_eq!(seq, b"TTTTGATCGA");

    Ok(())
}
Other indexed fasta readers are available.
If you are looking for a powerful concurrent reader, I recommend the faimm crate.