# fast-smaz

Pure Rust implementation of the smaz algorithm, very close to the original [antirez/smaz](https://github.com/antirez/smaz) C implementation, with some differences.

## Example

```rust
use fast_smaz::Smaz;

fn main() {
    let my_message = "Hello world!";
    let compressed = my_message.smaz_compress();
    let decompressed = compressed.smaz_decompress().unwrap();
    assert_eq!(my_message.as_bytes(), decompressed);
}
```

## Inspiration

This implementation was inspired by both the original [antirez/smaz](https://github.com/antirez/smaz) and [peterc/rsmaz](https://github.com/peterc/rsmaz), a Ruby implementation of smaz.

## Comparison

### fast-smaz and [antirez/smaz](https://github.com/antirez/smaz)

This implementation is very similar to the original ANSI C implementation of [SMAZ](https://github.com/antirez/smaz): it uses the exact same hash table and hashing algorithm.

I haven't measured how performance compares. The ANSI C implementation can make many assumptions, tied both to the ANSI C specification and to the correctness of the implementation, while this Rust implementation has some additional checks to ensure the safety of the code. These checks may affect performance, but the impact should be small.

### fast-smaz and [silentsokolov/rust-smaz](https://github.com/silentsokolov/rust-smaz)

[silentsokolov/rust-smaz](https://github.com/silentsokolov/rust-smaz) uses `lazy_static!` and creates the codebook hash table at first execution, instead of shipping a pre-generated hash table. This translates into a slower encoding process (benchmarks on my computer show 73% slower encoding), not only because of the overhead of `lazy_static!` initialization, but also because of the atomic operation for the initialization check and the hash distribution itself.

> I've chosen to keep antirez's original bucket size and distribution because it looked well distributed,
> and performance results show that it performs very well.

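
To make the difference between the two approaches concrete, here is a minimal sketch. It is not the actual code of either crate: the three-entry `WORDS` codebook, the `toy_hash` function, and the `BUCKETS` layout are all made up for illustration, and the lazy variant uses `std::sync::OnceLock` as a stand-in for `lazy_static!`.

```rust
use std::collections::HashMap;
use std::sync::OnceLock;

// Hypothetical codebook for illustration; the real smaz codebook is much larger.
const WORDS: [&str; 3] = [" ", "the", "e"];

// Approach A: the lookup table is built the first time it is needed.
// Every later call still goes through the "already initialized?" check.
fn lazy_table() -> &'static HashMap<&'static str, u8> {
    static TABLE: OnceLock<HashMap<&'static str, u8>> = OnceLock::new();
    TABLE.get_or_init(|| {
        WORDS
            .iter()
            .enumerate()
            .map(|(i, w)| (*w, i as u8))
            .collect()
    })
}

fn lookup_lazy(word: &str) -> Option<u8> {
    lazy_table().get(word).copied()
}

// Approach B: the table is written out ahead of time as a `const`.
// Each bucket holds the (word, code) pairs whose toy hash equals its index.
const BUCKETS: [&[(&str, u8)]; 4] = [
    &[],           // bucket 0: empty
    &[(" ", 0)],   // (32 + 1) % 4 == 1
    &[("e", 2)],   // (101 + 1) % 4 == 2
    &[("the", 1)], // (116 + 3) % 4 == 3
];

// Toy hash (an assumption, not the smaz hash): first byte plus length.
fn toy_hash(word: &str) -> usize {
    let first = word.as_bytes().first().copied().unwrap_or(0) as usize;
    (first + word.len()) % BUCKETS.len()
}

fn lookup_const(word: &str) -> Option<u8> {
    BUCKETS[toy_hash(word)]
        .iter()
        .find(|(w, _)| *w == word)
        .map(|&(_, code)| code)
}
```

Both functions return the same code for a given word. The lazy version has to build the map once and check its initialization state on each call, while the pre-generated table is plain read-only data with no setup step.
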

Note that changing `lazy_static!` to [`OnceLock`](https://doc.rust-lang.org/nightly/std/sync/struct.OnceLock.html) does not help, since both are very similar.

The performance improvement only applies to the encoding process. Decoding performance is similar, since both [silentsokolov/rust-smaz](https://github.com/silentsokolov/rust-smaz) and **fast-smaz** share similar code for it: a simple lookup in the reverse codebook.

Also, **fast-smaz** uses `const` for the codebook and reverse codebook, which allows the compiler to optimize aggressively. This lowers cache misses and RAM accesses, since `const` values can (and probably will) be inlined at the use site.

### Features

`custom-cookbook`: For development only. It was used as a testing codebook before the proper translation of the original C implementation's codebook. It is far less efficient and has a poor distribution. It was constructed using the `src/hashing.rs` algorithm; I didn't spend much time improving it or ensuring its correctness, since it was only used to generate a correct and functional codebook for testing. Don't use this.