# include_assets

Include compressed assets in your executable.

`include_assets` provides convenient ways to include assets (arbitrary files) in a Rust binary. Assets are compressed and can be looked up either by file name or by the variants of an enum.
## Named assets

This is probably the most straightforward approach. Include all files in a directory with the `include_dir!()` macro and load (decompress) the assets at runtime using `NamedArchive::load`. Once loaded, use the `NamedArchive` more or less as you would a `HashMap<&str, &[u8]>`.

For examples, see the docs and `examples/named/src/main.rs`.
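A minimal sketch of this approach (the exact form of the `include_dir!` invocation, including how the directory path and compression algorithm are specified, is an assumption; consult the crate docs):

```rust
use include_assets::{include_dir, NamedArchive};

fn main() {
    // Compress and embed everything under assets/ at compile time,
    // then decompress the whole archive once at startup.
    let archive = NamedArchive::load(include_dir!("assets"));

    // Lookup works much like HashMap<&str, &[u8]>.
    let logo: &[u8] = archive.get("logo.png").expect("asset missing");
    println!("logo.png: {} bytes", logo.len());
}
```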
## Enum assets

This approach might be a little unusual. Declare an enum with one unit variant per asset and derive the trait `EnumAsset` using the derive macro that comes with this crate. Load the uncompressed assets using `EnumArchive::<MyEnum>::load()` (replacing `MyEnum` with whatever name you chose for your enum). Then look up the asset data via indexing (`&archive[MyAsset::SomeVariant]`); this lookup is infallible!
This approach has its distinct advantages, the infallible lookup chief among them. A disadvantage is that you cannot iterate over the assets; additionally, their names are erased at runtime.
Despite the lack of iteration, asset data can be mapped using `AssetEnum::map`; you can then look up the mapped values by enum variant in the resulting `EnumMap`. This may be useful for homogeneous assets: for example, you could parse templates, decode sound/image files, &c.

Note that you can have multiple `EnumAsset`s in the same program; mapping will be more useful if you have different enums for different types of asset.

For examples, see the docs and `examples/enums/src/main.rs`.
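A sketch of the enum approach (how variants are associated with files is an assumption here; the `#[asset(...)]` attribute below is hypothetical, so check the crate docs and the example for the real syntax):

```rust
use include_assets::{EnumArchive, EnumAsset};

// One unit variant per asset. The #[asset] attribute is hypothetical;
// see the crate docs for how variants are actually tied to files.
#[derive(EnumAsset)]
enum MyAsset {
    #[asset("assets/logo.png")]
    Logo,
    #[asset("assets/style.css")]
    Stylesheet,
}

fn main() {
    let archive = EnumArchive::<MyAsset>::load();
    // Indexing by variant cannot fail.
    let logo: &[u8] = &archive[MyAsset::Logo];
    println!("logo: {} bytes", logo.len());
}
```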
## Rebuilding when assets change

If you want to rebuild the executable whenever one of the assets changes, use a `build.rs` like this:

```rust
fn main() {
    println!("cargo:rerun-if-changed=path/to/assets");
    println!("cargo:rerun-if-changed=more/assets");
}
```
## License

This crate is licensed under the LGPL v3.
## Compression algorithms

Currently supported: zstd, lz4, deflate, and no compression.
## Checksums

At compile time, a checksum is computed for each asset and included in the binary. When the assets are loaded (decompressed), the checksum of each decompressed asset is compared against its compile-time checksum as a measure against data corruption and (more importantly) bugs. Currently blake2b is used for this, but that may change in the future.
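To illustrate the idea (this sketches the concept, not the crate's actual internals), verification amounts to something like the following, using the `blake2` crate:

```rust
use blake2::{Blake2b512, Digest};

// Compare the digest of the decompressed bytes against the digest
// that was computed (and embedded) at compile time.
fn verify(decompressed: &[u8], expected_digest: &[u8]) -> bool {
    Blake2b512::digest(decompressed).as_slice() == expected_digest
}
```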
## Limits

At runtime, main memory needs to be big enough to hold all assets at the same time in both compressed and uncompressed form. At compile time, main memory needs to be big enough to hold all assets at the same time once in compressed form and twice in uncompressed form.

The total size of each asset archive cannot exceed `u32::MAX` bytes (4 GiB), and each archive can contain at most `u32::MAX` (roughly 4e9) distinct assets. If your use case exceeds these limits, reconsider whether this is really the right approach.

`usize` is required to be at least 32 bits wide.
## Alternatives

Rust's core library includes the `include_bytes!` macro, which allows including a single file (uncompressed).
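For comparison, that built-in approach looks like this (the path is resolved relative to the source file; the file name here is made up):

```rust
// Embed a single file, uncompressed, at compile time.
static FONT: &[u8] = include_bytes!("../assets/font.ttf");
```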
There are several crates which allow including compressed files, and even directories. As far as I know, this crate is the only one which compresses the included files as a whole rather than separately. This approach has a significant disadvantage: to decompress a single file, all files have to be decompressed. However, it leads to better compression, because the compression algorithm can take advantage of similarities between files in addition to similarities within each file.
## Compile times

Compile times suck. I'm not sure how the `include_bytes!` macro works internally, but it probably doesn't just dump a huge byte string into the AST. I'd love to write the data to `OUT_DIR` and then include that blob with `include_bytes!`, but `OUT_DIR` is not available in proc macros. As a workaround, it may be (more?) useful to generate code and a compressed blob from `build.rs`, then `include!` it from the main code.
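A rough sketch of that workaround, assuming a single hypothetical `assets/data.bin` file and skipping the actual compression step:

```rust
// build.rs: write a blob plus generated code into OUT_DIR.
use std::{env, fs, path::Path};

fn main() {
    let out_dir = env::var("OUT_DIR").unwrap();

    // A real build script would walk the asset directory and compress
    // the result; here we just copy one hypothetical file verbatim.
    let blob = fs::read("assets/data.bin").unwrap();
    fs::write(Path::new(&out_dir).join("assets.blob"), &blob).unwrap();

    // The generated code pulls the blob back in via include_bytes!.
    fs::write(
        Path::new(&out_dir).join("assets_gen.rs"),
        "pub static ASSET_BLOB: &[u8] = \
         include_bytes!(concat!(env!(\"OUT_DIR\"), \"/assets.blob\"));\n",
    )
    .unwrap();

    println!("cargo:rerun-if-changed=assets");
}
```

The main crate would then pull in the generated code with `include!(concat!(env!("OUT_DIR"), "/assets_gen.rs"));`.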
## Ideas

If the assets are big, decompression can be rather slow. It may be worth investigating zstandard (and lz4) dictionary compression: at compile time, a dictionary is created by analyzing each asset, and each asset is then compressed independently using this dictionary. Hopefully, this wouldn't result in significantly bigger files than compressing them together. The advantage of this approach is that runtime decompression could be done in parallel using multiple threads. Alternatively, each file could be decompressed as needed, but this is not currently a goal of this crate.
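A sketch of the dictionary idea using the `zstd` crate (an assumed dependency; the dictionary size and compression level are made up):

```rust
use std::io;

// Train a shared dictionary on all assets, then compress each asset
// independently against it, so any single asset can later be
// decompressed on its own (and in parallel with the others).
fn compress_with_dict(assets: &[Vec<u8>]) -> io::Result<Vec<Vec<u8>>> {
    let dict = zstd::dict::from_samples(assets, 16 * 1024)?;
    let mut compressor = zstd::bulk::Compressor::with_dictionary(3, &dict)?;
    assets.iter().map(|a| compressor.compress(a)).collect()
}
```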
It may be useful to deduplicate assets based on contents before compression. Compression obviously reduces size, but it can only exploit redundant files if they are not too far apart. Deduplication can be implemented with a layer of indirection: an asset name maps to a blob id, and the blob id maps to a data range. Currently, deduplication is best achieved in application code: `archive.get(override_asset).unwrap_or_else(|| &archive[fallback_asset])`.
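The indirection described above might look roughly like this (all names hypothetical):

```rust
use std::collections::HashMap;
use std::ops::Range;

// Asset name -> blob id -> byte range; duplicate assets share a blob.
struct DedupArchive {
    data: Vec<u8>,               // all unique blobs, concatenated
    blobs: Vec<Range<usize>>,    // blob id -> range into `data`
    names: HashMap<String, u32>, // asset name -> blob id
}

impl DedupArchive {
    fn get(&self, name: &str) -> Option<&[u8]> {
        let id = *self.names.get(name)? as usize;
        Some(&self.data[self.blobs[id].clone()])
    }
}
```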
It may be useful to provide options to use other checksum algorithms. Possible options: CRC for smaller hashes, SHA-256 for (possibly) faster hashing (special CPU instructions!), or `[u8; 0]` to effectively disable checking. The choice is probably best handled through feature flags.
Error handling of the macros could use some work, but this is blocked on stable Rust allowing proc macro diagnostics.
## Contributing

Bug reports are very welcome. Feature requests are also welcome, but no promises. I do not plan to accept patches at this time.