Crates.io | embedded-huffman |
lib.rs | embedded-huffman |
version | 0.1.3 |
source | src |
created_at | 2024-11-05 20:23:32.76376 |
updated_at | 2024-11-07 19:15:33.362674 |
description | A paginated, streaming library for Huffman coding on embedded. |
homepage | https://github.com/qsib-cbie/embedded-heatshrink |
repository | https://github.com/qsib-cbie/embedded-heatshrink |
max_upload_size | |
id | 1437125 |
size | 73,006 |
A paginated, streaming library for Huffman coding designed for embedded systems. This library provides efficient compression and decompression with minimal memory overhead, making it suitable for resource-constrained environments.
alloc
required)Add this to your Cargo.toml
:
[dependencies]
embedded-huffman = "0.1.0"
Include extern crate alloc;
in your crate root.
Create an implementation for:
/// A function that takes a reference to the page and writes it to NAND
pub type WritePageFutureFn<E> =
Box<dyn for<'a> Fn(&'a [u8]) -> Pin<Box<dyn Future<Output = Result<(), E>> + 'a>>>;
/// A function that takes a mutable reference to the page and fills it with bytes from NAND
/// The future returns true if there are more pages that could be read
pub type ReadPageFutureFn<E> =
Box<dyn for<'a> Fn(&'a mut [u8]) -> Pin<Box<dyn Future<Output = Result<bool, E>> + 'a>>>;
The CLI tool provides simple compression and decompression capabilities:
# Compress a file
hfz < input.txt > compressed.hfz
# Decompress a file
hfz -d < compressed.hfz > decompressed.txt
# Customize page size and threshold
hfz -s 8192 -t 512 < input.txt > compressed.hfz
Basic compression example:
use embedded_huffman::{Encoder, BufferedPageWriter};
const PAGE_SIZE: usize = 2048;
const PAGE_THRESHOLD: usize = 4;
// Create encoder and writer
let mut encoder = Encoder::new(PAGE_SIZE, PAGE_THRESHOLD);
let mut writer = BufferedPageWriter::new(PAGE_SIZE, flush_fn);
// Compress data
for byte in data {
encoder.sink(byte, &mut writer).await?;
}
encoder.flush(&mut writer).await?;
Basic decompression example:
use embedded_huffman::{Decoder, BufferedPageReader};
// Create decoder and reader
let mut decoder = Decoder::new(PAGE_SIZE, PAGE_THRESHOLD);
let mut reader = BufferedPageReader::new(PAGE_SIZE, read_fn);
// Decompress data
while let Some(byte) = decoder.drain(&mut reader).await? {
// Process decompressed byte
}
The library implements a streaming Huffman coding algorithm with the following key features:
Paginated Output: Data is written in fixed-size pages suitable for NAND flash storage.
Adaptive Tables: The Huffman tree is rebuilt periodically based on symbol frequencies in the previous N pages.
Memory Efficient: Uses a minimal memory footprint with most operations being zero-copy.
Streaming Interface: Data can be processed incrementally without loading everything into memory.
The library includes benchmarks for compression and decompression performance. Run them with:
cargo bench --all-features
The codebase includes:
fuzz
directory)Run the test suite with:
cargo test
# Install
cargo install --path . --features std,cli
In this example with pipe viewer, you can see the compression and decompression rates:
jacobtrueb@Jacobs-Mac-Studio embedded-huffman % hfz < random.bin | pv > random.bin.hfz
1.00GiB 0:00:10 [96.6MiB/s] [ <=> ]
jacobtrueb@Jacobs-Mac-Studio embedded-huffman % hfz -d < random.bin.hfz | pv > random.bin.unhfz
1.00GiB 0:00:18 [56.4MiB/s] [ <=> ]
The page_threshold is approached with exponential backoff, so it may take a while to have pages spaced out as specified. To mitigate issues with non-representative frequency tables in the early data. The first Huffman table is built after 1 page, then after 2 pages, then 4, 8, 16, etc. until the threshold is reached.
///!
///! This is a simple CLI that reads bytes from stdin and writes Huffman-compressed data to stdout.
///!
///! The -d flag indicates if decompressing bytes emitted by this program.
///! The -s flag specifies the page size (must be power of 2).
///! The -t flag specifies the page threshold for rebuilding the Huffman table.
///!