| Crates.io | parallel_bzip2_decoder |
| lib.rs | parallel_bzip2_decoder |
| version | 0.2.0 |
| created_at | 2025-12-06 16:38:44.671504+00 |
| updated_at | 2025-12-11 19:39:07.068942+00 |
| description | High-performance parallel bzip2 decompression library |
| homepage | https://github.com/parallel-bz2/parallel-bz2 |
| repository | https://github.com/parallel-bz2/parallel-bz2 |
| id | 1970399 |
| size | 108,318 |
A high-performance, parallel bzip2 decoder for Rust.
This crate provides a Bz2Decoder that implements std::io::Read, allowing you to decompress bzip2 files in parallel using multiple CPU cores. It is designed to work efficiently with both single-stream (standard) and multi-stream (e.g., pbzip2) bzip2 files by scanning for block boundaries and decompressing them concurrently.
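The boundary scan mentioned above can be illustrated with a minimal sketch. It relies on two facts about the bzip2 format: every compressed block starts with the 48-bit magic number 0x314159265359, and blocks are not byte-aligned, so the search has to move one bit at a time. The function name `find_block_starts` is hypothetical and this is not the crate's actual scanner, only an illustration of the idea using the standard library.

```rust
/// 48-bit magic number that starts every bzip2 compressed block.
const BLOCK_MAGIC: u64 = 0x3141_5926_5359;
/// Mask that keeps only the low 48 bits of the sliding window.
const MASK_48: u64 = (1 << 48) - 1;

/// Returns the bit offsets (counted from the start of `data`) at which the
/// block magic begins. Hypothetical helper, not part of the crate's API.
pub fn find_block_starts(data: &[u8]) -> Vec<u64> {
    let mut starts = Vec::new();
    let mut window: u64 = 0;
    for (byte_idx, &byte) in data.iter().enumerate() {
        // Feed the byte into the window most significant bit first.
        for bit in (0..8u32).rev() {
            window = ((window << 1) | u64::from((byte >> bit) & 1)) & MASK_48;
            // Number of bits consumed so far, including this one.
            let consumed = byte_idx as u64 * 8 + u64::from(7 - bit) + 1;
            if consumed >= 48 && window == BLOCK_MAGIC {
                starts.push(consumed - 48);
            }
        }
    }
    starts
}
```

Because the same 48-bit pattern can occur by chance inside compressed data, offsets found this way are best treated as candidates that still need to be validated by actually decoding the block.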
- Uses rayon to decompress blocks in parallel.
- Implements std::io::Read for easy integration.
- Input data is shared via reference counting (Arc).
- anyhow integration for error handling.

Add this to your Cargo.toml:
[dependencies]
parallel_bzip2_decoder = "0.2"
The easiest way to use parallel_bzip2_decoder is to call Bz2Decoder::open, which handles memory mapping internally:
use parallel_bzip2_decoder::Bz2Decoder;
use std::io::Read;

fn main() -> anyhow::Result<()> {
    let mut decoder = Bz2Decoder::open("input.bz2")?;
    let mut buffer = Vec::new();
    decoder.read_to_end(&mut buffer)?;
    println!("Decompressed {} bytes", buffer.len());
    Ok(())
}
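Since the decoder implements std::io::Read, it can also be passed to any code that is generic over `Read`, which avoids collecting the whole output in a single `Vec`. The sketch below shows such a consumer; `count_lines` is a hypothetical helper, not part of parallel_bzip2_decoder, and it uses only the standard library so it can be tried with any reader.

```rust
use std::io::{self, BufRead, BufReader, Read};

/// Counts the lines produced by any `Read` source while keeping only one
/// line in memory at a time. A final line without a trailing newline is
/// counted as well. Hypothetical helper for illustration.
pub fn count_lines<R: Read>(reader: R) -> io::Result<u64> {
    let mut reader = BufReader::new(reader);
    let mut line = Vec::new();
    let mut count = 0;
    loop {
        line.clear();
        // `read_until` returns Ok(0) once the input is exhausted.
        if reader.read_until(b'\n', &mut line)? == 0 {
            return Ok(count);
        }
        count += 1;
    }
}
```

Assuming the API shown above, a call such as `count_lines(Bz2Decoder::open("input.bz2")?)` inside a function returning `anyhow::Result` should work the same way as passing a file or an in-memory slice.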
If you already have the data in memory (e.g., an Arc<[u8]> or Arc<Mmap>), you can use Bz2Decoder::new:
use parallel_bzip2_decoder::Bz2Decoder;
use std::io::Read;
use std::sync::Arc;

fn main() -> anyhow::Result<()> {
    let data: Vec<u8> = vec![/* ... bzip2 data ... */];
    let data_arc = Arc::new(data);
    let mut decoder = Bz2Decoder::new(data_arc);
    let mut buffer = Vec::new();
    decoder.read_to_end(&mut buffer)?;
    Ok(())
}
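For readers less familiar with `Arc`, the following sketch shows why it suits in-memory input: a buffer is converted into an `Arc<[u8]>` once, and each worker then receives a cheap handle to the same bytes instead of a copy. The helpers `into_shared` and `checksum_in_halves` are hypothetical and use only the standard library; which exact types `Bz2Decoder::new` accepts depends on its signature, which is not shown here.

```rust
use std::sync::Arc;
use std::thread;

/// Converts an owned buffer into a shared, immutable byte slice.
/// The bytes are moved into the `Arc` allocation once; later clones
/// only bump a reference count.
pub fn into_shared(data: Vec<u8>) -> Arc<[u8]> {
    Arc::from(data)
}

/// Sums the bytes of a shared buffer, handling the first half on a worker
/// thread and the second half on the calling thread.
pub fn checksum_in_halves(data: Arc<[u8]>) -> u64 {
    let mid = data.len() / 2;
    // The clone is a new handle to the same allocation, not a copy.
    let left = Arc::clone(&data);
    let worker = thread::spawn(move || {
        left[..mid].iter().map(|&b| u64::from(b)).sum::<u64>()
    });
    let right: u64 = data[mid..].iter().map(|&b| u64::from(b)).sum();
    worker.join().expect("worker thread panicked") + right
}
```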
parallel_bzip2_decoder scales linearly with the number of available CPU cores. It is significantly faster than standard single-threaded decoders for large files.
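The speedup comes from working on independent blocks at the same time while still emitting the output in its original order. The sketch below illustrates that pattern with scoped threads from the standard library rather than rayon. `process_block` is only a placeholder for real block decompression (it uppercases ASCII), and `process_blocks_in_order` is a hypothetical name, so this is an illustration of the idea and not the crate's implementation.

```rust
use std::thread;

/// Placeholder for decompressing one block; here it only uppercases ASCII.
fn process_block(block: &[u8]) -> Vec<u8> {
    block.to_ascii_uppercase()
}

/// Processes every block on its own scoped thread and concatenates the
/// results in input order, regardless of which thread finishes first.
pub fn process_blocks_in_order(blocks: &[&[u8]]) -> Vec<u8> {
    thread::scope(|scope| {
        // Spawn all workers first so the blocks are processed concurrently.
        let handles: Vec<_> = blocks
            .iter()
            .map(|&block| scope.spawn(move || process_block(block)))
            .collect();
        // Join in spawn order, which keeps the output in input order.
        handles
            .into_iter()
            .flat_map(|handle| handle.join().expect("worker thread panicked"))
            .collect()
    })
}
```

Spawning one thread per block keeps the sketch short; a thread pool such as rayon caps the number of workers at roughly the number of cores, which matters for files with many blocks.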
This crate includes comprehensive benchmarks and profiling tools:
# Run all benchmarks
cargo bench
# Run specific benchmark suite
cargo bench --bench decode_benchmark
cargo bench --bench scanner_benchmark
cargo bench --bench e2e_benchmark
# CPU profiling with flamegraphs
cd ../scripts
./profile_cpu.sh
# Memory profiling with valgrind
./profile_memory.sh
For detailed instructions, see BENCHMARKING.md.
This crate follows semantic versioning. Breaking changes will only occur with major version updates.
License: MIT
See the main repository's CONTRIBUTING.md for details on how to contribute.
See CHANGELOG.md for a history of changes (when available).