faiquery

Crates.io: faiquery
lib.rs: faiquery
version: 0.1.3
source: src
created_at: 2023-08-22 18:50:06.195504
updated_at: 2023-08-23 18:39:27.415406
description: Queryable indexed fasta using a mmapped file
repository: https://github.com/noamteyssier/faiquery
id: 951253
size: 30,852
Noam Teyssier (noamteyssier)

documentation

https://docs.rs/faiquery

README

faiquery

MIT licensed

Perform interval queries on an indexed FASTA file.

Description

This is a simple utility library to request interval queries on an indexed fasta file.

The index is assumed to have been generated by samtools faidx.
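For orientation, a samtools faidx index for a FASTA file is a tab-separated text file with one record per sequence: name, length in bases, byte offset of the first base, bases per line, and bytes per line (including the line terminator). The following is a minimal sketch of parsing one such record; the `FaiRecord` type and `parse_fai_line` function are hypothetical names used for illustration and are not part of the faiquery API.

```rust
/// One record of a `.fai` index for a FASTA file.
/// Hypothetical type for illustration; not the faiquery API.
#[derive(Debug, PartialEq)]
struct FaiRecord {
    name: String,    // sequence name
    length: u64,     // total number of bases in the sequence
    offset: u64,     // byte offset of the first base in the FASTA file
    line_bases: u64, // bases per line
    line_width: u64, // bytes per line, including the line terminator
}

/// Parse one tab-separated index line; returns `None` if a field is
/// missing or a numeric field does not parse.
fn parse_fai_line(line: &str) -> Option<FaiRecord> {
    let mut fields = line.trim_end().split('\t');
    let name = fields.next()?.to_string();
    let length: u64 = fields.next()?.parse().ok()?;
    let offset: u64 = fields.next()?.parse().ok()?;
    let line_bases: u64 = fields.next()?.parse().ok()?;
    let line_width: u64 = fields.next()?.parse().ok()?;
    Some(FaiRecord { name, length, offset, line_bases, line_width })
}
```

A line such as `chr1<TAB>100<TAB>6<TAB>50<TAB>51` would describe a 100-base sequence whose first base sits at byte 6, stored as 50 bases per 51-byte line.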

The FASTA file is memory-mapped, and the indexed FASTA reader keeps a single mutable buffer. Each query returns a slice reference into this buffer rather than into the file itself, which allows the requested sequence to be non-contiguous in the file (i.e. a sequence that spans several lines is returned without newlines).
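The idea of returning a newline-free sequence through a reusable buffer can be sketched as follows. This is an illustration under stated assumptions, not the crate's actual implementation: `extract_interval` is a hypothetical helper, `data` stands in for the memory-mapped file, the interval is 0-based and half-open, `line_bases` is nonzero, and no bounds checking is done.

```rust
/// Copy bases `start..end` (0-based, half-open) of one sequence into `buf`,
/// skipping the line terminators between FASTA lines.
/// Hypothetical helper for illustration; not the faiquery API.
fn extract_interval<'a>(
    data: &[u8],       // full file contents (e.g. a memory map)
    offset: usize,     // byte offset of the sequence's first base
    line_bases: usize, // bases per line (assumed nonzero)
    line_width: usize, // bytes per line, including the terminator
    start: usize,
    end: usize,
    buf: &'a mut Vec<u8>,
) -> &'a [u8] {
    buf.clear();
    let mut pos = start;
    while pos < end {
        let line = pos / line_bases; // which line of the sequence
        let col = pos % line_bases;  // position within that line
        // Take bases up to the end of this line or the end of the query.
        let take = (line_bases - col).min(end - pos);
        let byte = offset + line * line_width + col;
        buf.extend_from_slice(&data[byte..byte + take]);
        pos += take;
    }
    buf.as_slice()
}
```

Because the result borrows from `buf`, the previous result must be released before the next call, which mirrors the trade-off described above.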

Note that because a mutable buffer is kept, this is not the best approach for concurrent operations.

However, for single-threaded applications, this performs very well with low memory overhead and runtime.
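To make the concurrency limitation concrete, here is a generic sketch, assuming a reader whose query method needs mutable access because the result lives in an internal buffer. `BufferedReader` and `query_in_parallel` are hypothetical stand-ins and do not reflect the faiquery API. Sharing one such reader across threads would require a lock that serializes queries; the sketch instead gives each thread its own reader and buffer over shared read-only data.

```rust
use std::sync::Arc;
use std::thread;

/// Hypothetical stand-in for a reader that owns a reusable output buffer.
struct BufferedReader {
    data: Arc<Vec<u8>>, // shared, read-only sequence data
    buf: Vec<u8>,       // per-reader scratch buffer, overwritten on each query
}

impl BufferedReader {
    fn new(data: Arc<Vec<u8>>) -> Self {
        Self { data, buf: Vec::new() }
    }

    /// Needs `&mut self` because the result lives in the internal buffer.
    fn query(&mut self, start: usize, end: usize) -> &[u8] {
        self.buf.clear();
        self.buf.extend_from_slice(&self.data[start..end]);
        &self.buf
    }
}

/// One reader per thread: the data is shared, each thread owns its buffer.
/// Results are returned in the same order as `intervals`.
fn query_in_parallel(data: Arc<Vec<u8>>, intervals: Vec<(usize, usize)>) -> Vec<Vec<u8>> {
    let handles: Vec<_> = intervals
        .into_iter()
        .map(|(start, end)| {
            let data = Arc::clone(&data);
            thread::spawn(move || {
                let mut reader = BufferedReader::new(data);
                // Copy out of the buffer so the result can leave the thread.
                reader.query(start, end).to_vec()
            })
        })
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}
```

The copy out of the buffer in each thread is the cost of this design; a single-threaded caller can use the borrowed slice directly and avoid it.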

Usage

use anyhow::Result;
use faiquery::{FastaIndex, IndexedFasta};

fn main() -> Result<()> {
    let index = FastaIndex::from_filepath("example_data/example.fa.fai")?;
    let mut faidx = IndexedFasta::new(index, "example_data/example.fa")?;

    // Query the first 10 bases of chr1
    let seq = faidx.query("chr1", 0, 10)?;
    assert_eq!(seq, b"ACCTACGATC");

    // Query the first 10 bases of chr2
    let seq = faidx.query("chr2", 0, 10)?;
    assert_eq!(seq, b"TTTTGATCGA");

    Ok(())
}

Similar Approaches

Other indexed FASTA readers are available; here is a non-exhaustive list:

If you are looking for a powerful concurrent reader, I recommend using the faimm crate.

Commit count: 15
