small-bwt

Crates.iosmall-bwt
lib.rssmall-bwt
version0.2.0
sourcesrc
created_at2023-07-07 13:45:43.269605
updated_at2023-07-08 09:25:48.923431
descriptionBWT construction in small space
homepagehttps://github.com/kampersanda/small-bwt
repositoryhttps://github.com/kampersanda/small-bwt
max_upload_size
id910765
size2,720,709
Shunsuke Kanda (kampersanda)

documentation

https://docs.rs/small-bwt

README

BWT construction in small space

Documentation Crates.io

This is a Rust implementation of the BWT construction algorithm in small space, described in Algorithm 11.8 of the book: Compact Data Structures - A Practical Approach, Gonzalo Navarro, 2016.

Given a typical text, it runs in $O(n \log n \log \log n)$ time and $O(n)$ additional bits of space, where $n$ is the length of the input string and the alphabet size is much smaller than $n$. See the book for more details.

Documentation

https://docs.rs/small-bwt/

Command line tool

tools provides a command line tool to construct the BWT of a file.

$ cargo run --release -p tools -- -i input.txt -o output.bwt -t

With a desktop PC (Intel i7, 16 GB), the DNA text of size 385 MiB from Pizza&Chili Corpus was transformed in 6.8 minutes using maximum resident set size of 727 MiB.

Benchmarks

benches provides benchmarks on the time performance for English texts extracted from Pizza&Chili Corpus.

$ cargo bench

Licensing

This library is licensed under either of

at your option.

benches/english.10MB.zst is extracted from Pizza&Chili Corpus and follows LGPL License.

Commit count: 11

cargo fmt