| Crates.io | preflate-rs |
| lib.rs | preflate-rs |
| version | 0.7.1 |
| created_at | 2023-11-21 06:44:46.104184+00 |
| updated_at | 2025-05-21 15:51:06.140119+00 |
| description | Decompresses existing DEFLATE streams to allow for better compression (eg with ZStandard) while allowing the exact original binary DEFLATE stream to be recreated by detecting the parameters used during compression. |
| homepage | |
| repository | https://github.com/microsoft/preflate-rs |
| max_upload_size | |
| id | 1043770 |
| size | 432,837 |
Preflate-rs is a library initially based on a port of the C++ preflate library. Its purpose is to split deflate streams into uncompressed data plus reconstruction information, and to reconstruct the original deflate stream from those two.
Similar libraries include precomp, reflate, and grittibanzli, although this library is probably the most feature-rich and achieves low overhead for streams produced by a wider range of compression libraries.
IMPORTANT: This library is still in initial development, so breaking changes are likely to be made fairly frequently.
The resulting uncompressed content can then be recompressed by a more modern compression technique such as Zstd, lzma, etc. This library is designed to be used as part of a cloud storage system that requires exact binary storage of content, so the library needs to make sure that the DEFLATE content is recreated exactly as it was written. This is not trivial, since DEFLATE allows a large degree of freedom both in how the distance/length pairs are chosen and in how the Huffman trees are created.
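To make the "degree of freedom" point concrete, here is a minimal sketch of a toy LZ77 token expander. It is not part of preflate-rs: the names `Token`, `expand`, and `two_encodings` are hypothetical and exist only for illustration. It shows two different, equally valid token sequences that expand to the same bytes, which is why knowing only the uncompressed data is not enough to recreate the original stream.

```rust
/// A toy LZ77 token (hypothetical type, not the preflate-rs API).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Token {
    /// A single byte emitted as-is.
    Literal(u8),
    /// Copy `length` bytes starting `distance` bytes back in the output.
    Match { distance: usize, length: usize },
}

/// Expands a token sequence into plain bytes.
fn expand(tokens: &[Token]) -> Vec<u8> {
    let mut out = Vec::new();
    for &token in tokens {
        match token {
            Token::Literal(byte) => out.push(byte),
            Token::Match { distance, length } => {
                // Copy byte by byte so that overlapping matches
                // (length > distance) behave as they do in DEFLATE.
                for _ in 0..length {
                    let byte = out[out.len() - distance];
                    out.push(byte);
                }
            }
        }
    }
    out
}

/// Two different token sequences for the same input "abcabcabc".
fn two_encodings() -> (Vec<Token>, Vec<Token>) {
    let prefix = [Token::Literal(b'a'), Token::Literal(b'b'), Token::Literal(b'c')];

    // One long overlapping match.
    let mut first = prefix.to_vec();
    first.push(Token::Match { distance: 3, length: 6 });

    // Two shorter matches with different distances.
    let mut second = prefix.to_vec();
    second.push(Token::Match { distance: 3, length: 3 });
    second.push(Token::Match { distance: 6, length: 3 });

    (first, second)
}
```

Both sequences returned by `two_encodings` expand to the same nine bytes, yet a compressor that picked one would produce a different bit stream from a compressor that picked the other. The same kind of ambiguity exists in how the Huffman trees are built.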
The library tries to detect the following compressors to try to do a reasonable job:
The general approach is as follows:
The following differences are corrected:
Note that the data formats of the recompression information are different and incompatible to the original preflate implementation, as this library uses a different arithmetic encoder (shared from the Lepton JPEG compression library).
In order to faithfully recreate the exact deflate stream, the library stores a stream of corrections to its predictive model. Depending on how good the predictive model is, the corrections can take up more or less space. If you want to improve the library, it's probably worth targeting the lower compression levels that currently have significant overhead.
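The idea of storing only corrections to a predictive model can be sketched with a deliberately simple toy. This is not the library's model or data format: the predictor here just guesses that each value repeats the previous one, and `predict`, `encode_corrections`, and `apply_corrections` are hypothetical names. The point is that the better the predictor, the fewer corrections need to be stored.

```rust
/// Toy predictor (an assumption for illustration): guess that the next value
/// equals the previous one, or 0 at the start of the sequence.
fn predict(prev: Option<u32>) -> u32 {
    prev.unwrap_or(0)
}

/// Records only the positions where the prediction was wrong,
/// together with the value that actually occurred.
fn encode_corrections(actual: &[u32]) -> Vec<(usize, u32)> {
    let mut corrections = Vec::new();
    let mut prev = None;
    for (i, &value) in actual.iter().enumerate() {
        if predict(prev) != value {
            corrections.push((i, value));
        }
        prev = Some(value);
    }
    corrections
}

/// Rebuilds the original sequence from its length and the corrections
/// by replaying the same predictor and overriding it where needed.
fn apply_corrections(len: usize, corrections: &[(usize, u32)]) -> Vec<u32> {
    let mut out = Vec::with_capacity(len);
    let mut next = 0; // index of the next unused correction
    let mut prev = None;
    for i in 0..len {
        let value = if next < corrections.len() && corrections[next].0 == i {
            next += 1;
            corrections[next - 1].1
        } else {
            predict(prev)
        };
        out.push(value);
        prev = Some(value);
    }
    out
}
```

For the input `[5, 5, 5, 7, 7, 5]` this toy stores three corrections instead of six values, and `apply_corrections` returns the input exactly. A real model predicts compressor decisions rather than raw values, but the round-trip property is the same.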
The amount of overhead vs uncompressed data is approximately the following, depending on the compression level. If you want to benefit from using this library, whatever better compression algorithm you use needs to be at least that much better to make it worthwhile to recompress.
| Library | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|---|
| zlib | 0.01% | 0.01% | 0.01% | 0.01% | 0.01% | 0.01% | 0.01% | 0.08% | 0.03% | 0.01% |
| libngz | 0.01% | 0.01% | 0.01% | 0.01% | 0.97% | 1.07% | 0.90% | 0.01% | 0.01% | NoCompressionCandidates |
| libdeflate | 0.01% | 0.25% | 1.04% | 0.91% | 1.51% | 1.04% | 0.96% | 0.87% | 1.04% | 1.03% |
| miniz_oxide | 0.01% | 0.06% | 2.70% | 1.78% | 0.53% | 0.30% | 0.09% | 0.06% | 0.08% | 0.07% |
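As a rough worked example of the break-even condition described above, the sketch below estimates whether recompression pays off. It assumes that the percentages in the table mean "size of the reconstruction information as a fraction of the uncompressed size"; that reading, and the function names `estimated_total` and `is_worthwhile`, are assumptions for illustration and not part of the library. Overhead is passed in hundredths of a percent (so 1.04% is `104`) to keep the arithmetic in integers.

```rust
/// Estimated stored size after recompression: the recompressed payload plus
/// the reconstruction information.
///
/// `overhead_hundredths_pct` is the table value in hundredths of a percent,
/// e.g. 104 for 1.04%. The result is rounded up to a whole byte.
fn estimated_total(
    uncompressed_len: u64,
    recompressed_len: u64,
    overhead_hundredths_pct: u64,
) -> u64 {
    let corrections = (uncompressed_len * overhead_hundredths_pct + 9_999) / 10_000;
    recompressed_len + corrections
}

/// Recompression is only worthwhile if the estimated total is smaller
/// than the original DEFLATE stream.
fn is_worthwhile(
    original_deflate_len: u64,
    uncompressed_len: u64,
    recompressed_len: u64,
    overhead_hundredths_pct: u64,
) -> bool {
    estimated_total(uncompressed_len, recompressed_len, overhead_hundredths_pct)
        < original_deflate_len
}
```

For instance, with 1,000,000 bytes of uncompressed data, a recompressed size of 300,000 bytes, and 1.04% overhead, the estimated total is 310,400 bytes. That is a win over a 350,000-byte DEFLATE stream, but not over a 305,000-byte one.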
git clone https://github.com/microsoft/preflate-rs
cd preflate-rs
cargo build
cargo test
cargo build --release
A preflate_util.exe wrapper is built as part of the project and can be used to
test the library against DEFLATE-compressed content.
There are many ways in which you can participate in this project, for example:
This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.
Copyright (c) Microsoft Corporation. All rights reserved.
Licensed under the Apache 2.0 license.