fsst-rust

Crates.io: fsst-rust
lib.rs: fsst-rust
version: 0.1.1
source: src
created_at: 2024-08-30 11:11:12.264837
updated_at: 2024-08-30 11:31:49.104978
description: FSST-Rust is a pure Rust implementation of the Fast Static Symbol Table
repository: https://github.com/Morgan279/FSST-Rust
id: 1357579
size: 7,996,542
author: Wancheng Long (Morgan279)
documentation: https://docs.rs/fsst-rust

README

FSST-Rust

A pure Rust implementation of the Fast Static Symbol Table (FSST).

Quick start

Use cargo add:

cargo add fsst-rust

Or add fsst-rust to Cargo.toml:

[dependencies]
fsst-rust = "0.1.1"

Usage

Compress and decompress a single string

let str = "tumcwitumvldb";
// Build a symbol table from the input and encode the input with it.
let (symbol_table, encoding) = encode_string(str, false);
println!("built symbol table: {}", symbol_table.to_string()); // [b, t, w, tumc, witumvld]
assert_eq!(str, decode_string(&symbol_table, &encoding));

// Serialize the symbol table, then rebuild a decoder from those bytes.
let table_bytes = symbol_table.dump();
let decoder = Decoder::from_table_bytes(&table_bytes);
assert_eq!(str, decoder.decode(&encoding));

// Compression factor: original length divided by encoded length.
let compress_factor = str.len() as f64 / encoding.len() as f64;
println!("compression factor: {:.4}", compress_factor);
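To make the idea behind the symbol table more concrete, here is a minimal, self-contained sketch of symbol-table coding. It is not fsst-rust's implementation or API: `toy_encode`, `toy_decode`, and `ESCAPE` are hypothetical names, the encoder is a naive greedy longest-match scan, and the table is supplied by hand rather than learned from the data. It assumes at most 255 symbols and reserves code 255 as an escape marker for bytes that no symbol covers.

```rust
/// Escape code (an assumption of this sketch): the byte that follows it
/// in the output is a literal, not a symbol index.
const ESCAPE: u8 = 255;

/// Greedy toy encoder: at each position, emit the code of the longest
/// symbol that prefixes the remaining input, or ESCAPE plus the raw byte.
fn toy_encode(symbols: &[&[u8]], input: &[u8]) -> Vec<u8> {
    assert!(symbols.len() <= ESCAPE as usize, "at most 255 symbols");
    let mut out = Vec::new();
    let mut pos = 0;
    while pos < input.len() {
        let rest = &input[pos..];
        // Track the best match so far as (code, symbol length).
        let mut best: Option<(usize, usize)> = None;
        for (code, sym) in symbols.iter().enumerate() {
            let longer = best.map_or(true, |(_, len)| sym.len() > len);
            if !sym.is_empty() && longer && rest.starts_with(sym) {
                best = Some((code, sym.len()));
            }
        }
        match best {
            Some((code, len)) => {
                out.push(code as u8);
                pos += len;
            }
            None => {
                out.push(ESCAPE);
                out.push(input[pos]);
                pos += 1;
            }
        }
    }
    out
}

/// Toy decoder: expand each code to its symbol, copy escaped bytes as-is.
/// Panics on malformed input (unknown code or a trailing ESCAPE).
fn toy_decode(symbols: &[&[u8]], encoded: &[u8]) -> Vec<u8> {
    let mut out = Vec::new();
    let mut i = 0;
    while i < encoded.len() {
        if encoded[i] == ESCAPE {
            out.push(encoded[i + 1]);
            i += 2;
        } else {
            out.extend_from_slice(symbols[encoded[i] as usize]);
            i += 1;
        }
    }
    out
}
```

With the hand-written table `[b, t, w, tumc, witumvld]`, `toy_encode` turns the 13-byte input `tumcwitumvldb` into three codes, and `toy_decode` restores the original bytes. The crate itself may choose different codes and a different matching strategy.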

Compress and decompress strings in batch

let compress_file_path = "assets/test_data/ps_comment".to_string();
let strings = read_string_lines(compress_file_path).unwrap();

// Time the construction of one shared symbol table and the encoding of all strings.
let mut start_time = std::time::Instant::now();
let (symbol_table, encodings) = encode_all_strings(&strings);
let compress_time = start_time.elapsed();

// Time the decoding of all strings.
start_time = std::time::Instant::now();
let decode_strings = decode_all_strings(&symbol_table, &encodings);
let decompress_time = start_time.elapsed();

// The serialized symbol table counts toward the compressed size.
let mut encoding_size = symbol_table.dump().len();
let mut total_size = 0;
for i in 0..strings.len() {
    total_size += strings[i].len();
    encoding_size += encodings[i].len();
    assert_eq!(strings[i], decode_strings[i]);
}

let compress_factor = total_size as f64 / encoding_size as f64;
println!("compression factor: {:.4}", compress_factor);
println!("compression cost time: {}ms", compress_time.as_millis());
println!("decompression cost time: {}ms", decompress_time.as_millis());
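The size accounting in the batch example can be factored into a small helper. The sketch below is not part of fsst-rust: `compression_factor` is a hypothetical name, and it assumes each encoding is available as a `Vec<u8>` and that the serialized table length is passed in separately.

```rust
/// Ratio of total original size to total compressed size. The serialized
/// symbol table (`table_len` bytes) is counted as part of the compressed
/// output, since a decoder needs it to recover the strings.
/// Note: the result is not a finite number when the compressed size is zero.
fn compression_factor(originals: &[String], encodings: &[Vec<u8>], table_len: usize) -> f64 {
    let original_size: usize = originals.iter().map(|s| s.len()).sum();
    let encoded_size: usize = table_len + encodings.iter().map(|e| e.len()).sum::<usize>();
    original_size as f64 / encoded_size as f64
}
```

Counting the table matters most for small inputs: a 13-byte string encoded into 3 bytes has a factor of about 4.33 on its own, but only 1.0 once a 10-byte table is included.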

MicroBench

cargo bench

MicroBench environment:

Architecture:        x86_64
Model name:          Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz

Figure: compression on TPC-H data of ps_comment at string level

Figure: decompression on TPC-H data of l_comment at string level

License

FSST-Rust is under the Apache-2.0 license. See the LICENSE file for details.

Acknowledgement

Thanks to Peter Boncz, Thomas Neumann, and Viktor Leis, the authors of FSST. The original implementation of the algorithm is at https://github.com/cwida/fsst.
