Crates.io | rlz |
lib.rs | rlz |
version | 0.2.0 |
source | src |
created_at | 2022-09-12 03:14:20.494186 |
updated_at | 2022-09-12 04:39:02.240392 |
description | Relative Lempel-Ziv (RLZ): a LZ based compressor against a large static dictionary |
homepage | https://github.com/mpetri/rlz-rs |
repository | https://github.com/mpetri/rlz-rs |
max_upload_size | |
id | 663334 |
size | 58,091 |
A Relative Lempel-Ziv (RLZ) based LZ compressor that compresses against a large static dictionary.
This code implements the RLZ compressor, as described in:
@article{DBLP:journals/pvldb/HoobinPZ11,
author = {Christopher Hoobin and
Simon J. Puglisi and
Justin Zobel},
title = {Relative Lempel-Ziv Factorization for Efficient Storage
and Retrieval of Web Collections},
journal = {Proc. {VLDB} Endow.},
volume = {5},
number = {3},
pages = {265--273},
year = {2011},
}
@article{DBLP:journals/corr/PetriMNW16,
author = {Matthias Petri and
Alistair Moffat and
P. C. Nagesh and
Anthony Wirth},
title = {Access Time Tradeoffs in Archive Compression},
journal = {CoRR},
volume = {abs/1602.08829},
year = {2016},
url = {http://arxiv.org/abs/1602.08829},
eprinttype = {arXiv},
}
@inproceedings{DBLP:conf/www/LiaoPMW16,
author = {Kewen Liao and
Matthias Petri and
Alistair Moffat and
Anthony Wirth},
title = {Effective Construction of Relative Lempel-Ziv
Dictionaries},
booktitle = {Proceedings of {WWW}},
pages = {807--816},
publisher = {{ACM}},
year = {2016},
}
The Relative Lempel-Ziv (RLZ) scheme is a hybrid of several phrase-based compression mechanisms. Encoding is based on a fixed-text dictionary, with all substrings within the dictionary available for use as factors, in the style of LZ77. However, the dictionary is constructed in a semi-static manner, and hence must be representative of the entire text being coded if compression effectiveness is not to be compromised. Furthermore, because RLZ is intended for large web-based archives, it is infeasible to hold the whole input text in memory when constructing the dictionary.
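To make the idea of factoring a text against a fixed dictionary concrete, here is a minimal sketch of greedy RLZ-style factorization. It is not the crate's implementation: the names `Factor`, `factorize`, and `decode` are hypothetical, the longest-match search is a naive scan rather than an indexed lookup, and bytes absent from the dictionary are assumed to be emitted as literals.

```rust
/// One factor of the text. Hypothetical type for illustration only.
#[derive(Debug, Clone, PartialEq, Eq)]
enum Factor {
    /// Copy `len` bytes starting at `offset` in the dictionary.
    DictCopy { offset: usize, len: usize },
    /// A byte that has no match in the dictionary (assumed fallback).
    Literal(u8),
}

/// Greedy factorization: at each text position, take the longest prefix
/// that occurs as a substring of the dictionary. Naive quadratic scan,
/// chosen for readability; ties keep the smallest dictionary offset.
fn factorize(dict: &[u8], text: &[u8]) -> Vec<Factor> {
    let mut factors = Vec::new();
    let mut pos = 0;
    while pos < text.len() {
        let mut best_offset = 0;
        let mut best_len = 0;
        for start in 0..dict.len() {
            // Length of the common prefix of dict[start..] and text[pos..].
            let len = dict[start..]
                .iter()
                .zip(&text[pos..])
                .take_while(|(d, t)| d == t)
                .count();
            if len > best_len {
                best_offset = start;
                best_len = len;
            }
        }
        if best_len == 0 {
            factors.push(Factor::Literal(text[pos]));
            pos += 1;
        } else {
            factors.push(Factor::DictCopy { offset: best_offset, len: best_len });
            pos += best_len;
        }
    }
    factors
}

/// Rebuild the text by copying each factor out of the dictionary.
fn decode(dict: &[u8], factors: &[Factor]) -> Vec<u8> {
    let mut out = Vec::new();
    for factor in factors {
        match *factor {
            Factor::DictCopy { offset, len } => {
                out.extend_from_slice(&dict[offset..offset + len])
            }
            Factor::Literal(byte) => out.push(byte),
        }
    }
    out
}
```

With the dictionary `banana` and the text `banana$aba`, this sketch produces four factors: a copy of all six dictionary bytes, the literal `$`, a one-byte copy of `a`, and a two-byte copy of `ba`. Because every factor refers only to the static dictionary and never to previously decoded text, each document can be decoded independently, which is the property that makes a representative dictionary so important.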
use rlz::Dictionary;
use rlz::RlzCompressor;

// Build a compressor from a static dictionary.
let dict = Dictionary::from(&b"banana"[..]);
let rlz_compressor = RlzCompressor::builder().build_from_dict(dict);

// Encode the text against the dictionary.
let mut output = Vec::new();
let text = b"banana$aba";
let encoded_len = rlz_compressor.encode(&text[..], &mut output).unwrap();
assert_eq!(encoded_len, output.len());

// Store the compressor and load it back.
let mut stored_decoder = Vec::new();
rlz_compressor.store(&mut stored_decoder).unwrap();
let loaded_decoder = RlzCompressor::load(&stored_decoder[..]).unwrap();

// Decode with the loaded compressor and check the round trip.
let mut recovered = Vec::new();
loaded_decoder.decode(&output[..], &mut recovered).unwrap();
assert_eq!(recovered, text);
License: MIT