Crates.io | msbwt2 |
lib.rs | msbwt2 |
version | 0.3.2 |
source | src |
created_at | 2021-09-09 16:10:39.912231 |
updated_at | 2023-02-01 13:53:23.252288 |
description | msbwt2 - multi-string BWT query library |
homepage | https://github.com/HudsonAlpha/rust-msbwt |
repository | https://github.com/HudsonAlpha/rust-msbwt |
id | 448936 |
size | 198,752 |
The intent of this crate is to provide Rust functionality for querying a Multi-String BWT (MSBWT). It is mostly based on the same methodology used by the original msbwt.
NOTE: This is very much a work in progress and is currently only updated as a side project during spare time. If you have any feature requests, feel free to submit a new issue on GitHub. Here is a current list of planned additions:
fmlrc2
msbwt2-build
All installation options assume you have installed Rust along with the cargo crate manager for Rust.
To install from crates.io with cargo:
cargo install msbwt2
msbwt2-convert -h
To build from the GitHub source instead:
git clone https://github.com/HudsonAlpha/rust-msbwt.git
cd rust-msbwt
# running the tests is optional
cargo test --release
cargo build --release
./target/release/msbwt2-convert -h
The Multi-String Burrows-Wheeler Transform (MSBWT or BWT) must be built prior to performing any queries. Currently, there are two ways to build the BWT, and both produce identical results:
The first is the msbwt2-build tool. This approach accepts any combination of FASTQ or FASTA files, which may be gzip-compressed. It only requires msbwt2 to be installed:
msbwt2-build \
    -o comp_msbwt.npy \
    reads.fq.gz [reads2.fq.gz ...]
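For intuition about what a build step produces, here is a naive sketch of a multi-string BWT over a handful of short strings. It is not the algorithm used by msbwt2-build and it does not write the .npy format. The function name naive_msbwt is hypothetical, and the sketch assumes the common definition in which each string gets a `$` terminator, all rotations of all strings are sorted together, and the last character of each sorted rotation is emitted.

```rust
/// Naive multi-string BWT (illustrative only, NOT msbwt2's implementation).
/// Assumes the input strings contain no '$' character.
pub fn naive_msbwt(strings: &[&str]) -> String {
    let mut rotations: Vec<Vec<u8>> = Vec::new();
    for s in strings {
        // Terminate each string with '$', which sorts before A, C, G, N, T in ASCII.
        let mut terminated = s.as_bytes().to_vec();
        terminated.push(b'$');
        // Collect every cyclic rotation of the terminated string.
        for i in 0..terminated.len() {
            let mut rotation = terminated[i..].to_vec();
            rotation.extend_from_slice(&terminated[..i]);
            rotations.push(rotation);
        }
    }
    // Sort the rotations of all strings together in byte order.
    rotations.sort();
    // The BWT is the last symbol of each sorted rotation.
    rotations
        .iter()
        .map(|r| *r.last().expect("rotations are never empty") as char)
        .collect()
}
```

For example, `naive_msbwt(&["ACGT", "CCGG"])` yields `TG$$CAGCCG`. Materializing every rotation makes this suitable only for toy inputs.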
The second is to pipe into msbwt2-convert. This approach currently tends to be faster. However, the following command is more complex, is less flexible about file types (it requires FASTQ in this example), and requires the ropebwt2 executable (or a similar tool) to be installed:
gunzip -c reads.fq.gz [reads2.fq.gz ...] | \
    awk 'NR % 4 == 2' | \
    sort | \
    tr NT TN | \
    ropebwt2 -LR | \
    tr NT TN | \
    msbwt2-convert comp_msbwt.npy
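The text-processing stages of this pipeline can be mirrored in Rust to make their roles explicit. This is a sketch and not part of msbwt2: fastq_sequences and swap_n_t are hypothetical helper names, and the input is assumed to be well-formed four-line FASTQ records. The stated reason for the N/T swap (that ropebwt2 and msbwt presumably order N and T differently) is an assumption, not something documented here.

```rust
/// Keep the 2nd line of every 4-line FASTQ record,
/// mirroring `awk 'NR % 4 == 2'` (hypothetical helper).
pub fn fastq_sequences(fastq: &str) -> Vec<String> {
    fastq
        .lines()
        .enumerate()
        // enumerate() is 0-based, so the 2nd line of each record has index % 4 == 1.
        .filter(|(i, _)| i % 4 == 1)
        .map(|(_, line)| line.to_string())
        .collect()
}

/// Swap N and T, mirroring `tr NT TN` (hypothetical helper).
/// Assumption: the swap is applied before and after ropebwt2 so that the final
/// BWT reflects an ordering in which N sorts before T.
pub fn swap_n_t(seq: &str) -> String {
    seq.chars()
        .map(|c| match c {
            'N' => 'T',
            'T' => 'N',
            other => other,
        })
        .collect()
}

/// Extract, sort, and swap: the stages that run before ropebwt2 in the pipeline.
pub fn prepare_reads(fastq: &str) -> Vec<String> {
    let mut reads = fastq_sequences(fastq);
    reads.sort();
    reads.iter().map(|r| swap_n_t(r)).collect()
}
```

Applying swap_n_t twice returns the original sequence, which is why the pipeline can undo the swap on the ropebwt2 output with the same `tr` call.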
The general use case of the library is k-mer queries, which can be performed as follows:
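// the BWT trait provides the loading and query methods used below
use msbwt2::msbwt_core::BWT;
use msbwt2::rle_bwt::RleBWT;
use msbwt2::string_util;
// create an empty run-length encoded BWT, then load a previously built BWT file
let mut bwt = RleBWT::new();
let filename: String = "test_data/two_string.npy".to_string();
bwt.load_numpy_file(&filename);
// convert the k-mer to the integer encoding used by the library, then count its occurrences
assert_eq!(bwt.count_kmer(&string_util::convert_stoi(&"ACGT")), 1);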
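To illustrate how a k-mer count can be answered from a BWT alone, the following sketch uses textbook FM-index backward search over a plain byte-string BWT. It is an assumption that msbwt2 works along these lines internally. The function count_kmer_naive is hypothetical and is not part of the msbwt2 API. It uses linear scans for rank queries, whereas a real implementation would use precomputed or sampled counts.

```rust
/// Count occurrences of `kmer` using backward search on a plain BWT
/// (illustrative only, NOT the msbwt2 API). Symbols are ordered by byte value,
/// so '$' < 'A' < 'C' < 'G' < 'N' < 'T'.
pub fn count_kmer_naive(bwt: &[u8], kmer: &[u8]) -> usize {
    // totals[c]: how many times symbol c occurs in the BWT.
    let mut totals = [0usize; 256];
    for &b in bwt {
        totals[b as usize] += 1;
    }
    // offsets[c]: number of symbols strictly smaller than c, i.e. the row
    // where the block of rotations starting with c begins.
    let mut offsets = [0usize; 256];
    let mut sum = 0;
    for c in 0..256 {
        offsets[c] = sum;
        sum += totals[c];
    }
    // rank(c, i): occurrences of c in bwt[..i], computed by a linear scan.
    let rank = |c: u8, i: usize| bwt[..i].iter().filter(|&&b| b == c).count();

    // Start with the full range of rows and narrow it one symbol at a time,
    // processing the k-mer from its last symbol to its first.
    let (mut lo, mut hi) = (0usize, bwt.len());
    for &c in kmer.iter().rev() {
        lo = offsets[c as usize] + rank(c, lo);
        hi = offsets[c as usize] + rank(c, hi);
        if lo >= hi {
            return 0; // no rotation starts with the current suffix of the k-mer
        }
    }
    hi - lo
}
```

With the BWT `TG$$CAGCCG` (a toy BWT of the two strings ACGT and CCGG), `count_kmer_naive(b"TG$$CAGCCG", b"CG")` returns 2 and `count_kmer_naive(b"TG$$CAGCCG", b"ACGT")` returns 1.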
msbwt2 does not currently have a pre-print or paper. If you use msbwt2, please cite one of the msbwt papers.
Licensed under either of the Apache License, Version 2.0, or the MIT license, at your option.
Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.