hf-mem

Crates.io: hf-mem
lib.rs: hf-mem
version: 0.0.5
created_at: 2025-03-10 11:00:18.277163+00
updated_at: 2025-07-14 17:13:42.877915+00
description: CLI to estimate inference memory requirements from the Hugging Face Hub
repository: https://github.com/alvarobartt/hf-mem
id: 1586440
size: 76,057
Alvaro Bartolome (alvarobartt)

CLI to estimate inference memory requirements from the Hugging Face Hub

Install

$ cargo install hf-mem

Usage

$ hf-mem --help
CLI to estimate inference memory requirements from the Hugging Face Hub

Usage: hf-mem [OPTIONS] --model-id <MODEL_ID>

Options:
  -m, --model-id <MODEL_ID>  ID of the model on the Hugging Face Hub
  -r, --revision <REVISION>  Revision of the model on the Hugging Face Hub [default: main]
  -t, --token <TOKEN>        Hugging Face Hub token with read access over the provided model ID, optional
  -d, --dtype <DTYPE>        Target dtype for conversion (float32, float16, bfloat16, float8, float4)
  -h, --help                 Print help
  -V, --version              Print version
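
As a rough illustration of what the `--dtype` option implies for the estimate, the sketch below maps each dtype named in the help text to a bit width and multiplies it by a parameter count. The function names, the bit widths assumed for `float8` and `float4` (8 and 4 bits), and the 7B parameter count are assumptions made for this example; it covers weights only and is not necessarily how hf-mem computes its numbers.

```rust
/// Bits per parameter for each value listed under `--dtype`.
/// Returns `None` for any name not in the help text.
/// (Assumed widths: float8 = 8 bits, float4 = 4 bits.)
fn bits_per_param(dtype: &str) -> Option<u64> {
    match dtype {
        "float32" => Some(32),
        "float16" | "bfloat16" => Some(16),
        "float8" => Some(8),
        "float4" => Some(4),
        _ => None,
    }
}

/// Weight-only estimate in bytes: parameters * bits / 8, rounded up.
/// Ignores KV cache, activations, and any runtime overhead.
fn estimate_bytes(param_count: u64, dtype: &str) -> Option<u64> {
    let bits = bits_per_param(dtype)?;
    let total_bits = param_count.checked_mul(bits)?;
    Some(total_bits / 8 + u64::from(total_bits % 8 != 0))
}

fn main() {
    // Hypothetical 7B-parameter model, used only for illustration.
    let params: u64 = 7_000_000_000;
    for dtype in ["float32", "float16", "bfloat16", "float8", "float4"] {
        let bytes = estimate_bytes(params, dtype).unwrap();
        let gib = bytes as f64 / (1u64 << 30) as f64;
        println!("{dtype:>8}: {gib:.2} GiB");
    }
}
```

Halving the bit width halves the weight footprint, which is why the target dtype is the main lever in a weight-only estimate.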

Features

  • Fast and light CLI with a single installable binary
  • Fetches only the bytes that contain the metadata from the safetensors files on the Hugging Face Hub
  • Provides an estimate based on the parameter count for each dtype
  • Supports both sharded (e.g. model-00001-of-00002.safetensors) and non-sharded (e.g. model.safetensors) files
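
Fetching only the metadata is possible because a safetensors file starts with an 8-byte little-endian unsigned integer giving the length of a JSON header, followed by that header, which lists each tensor's dtype and shape. The sketch below shows the byte ranges a client could request and how a shape turns into a parameter count. The helper names, the two-request pattern, and the sample values are assumptions for illustration, not hf-mem's actual code.

```rust
/// HTTP Range header value covering the 8-byte length prefix.
const LENGTH_PREFIX_RANGE: &str = "bytes=0-7";

/// Decode the little-endian u64 header length from the first 8 bytes.
fn header_len(prefix: [u8; 8]) -> u64 {
    u64::from_le_bytes(prefix)
}

/// Range header value covering only the JSON header (end is inclusive).
/// Returns `None` for an empty header, which has no bytes to request.
fn header_range(len: u64) -> Option<String> {
    if len == 0 {
        return None;
    }
    Some(format!("bytes=8-{}", 8 + len - 1))
}

/// Parameters in one tensor: the product of its shape dimensions.
fn param_count(shape: &[u64]) -> u64 {
    shape.iter().product()
}

fn main() {
    // Hypothetical prefix: 0x0200 little-endian, i.e. a 512-byte header.
    let prefix = [0x00, 0x02, 0, 0, 0, 0, 0, 0];
    let len = header_len(prefix);
    println!("first request:  Range: {LENGTH_PREFIX_RANGE}");
    println!("second request: Range: {}", header_range(len).unwrap());

    // Hypothetical tensor shape as it would appear in the JSON header.
    println!("params: {}", param_count(&[4096, 4096]));
}
```

Summing `param_count` over every tensor in the header, grouped by dtype, gives the per-dtype totals that a weight-only estimate needs, without downloading any tensor data.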

What's next?

  • Add tracing and progress bars when fetching from the Hub
  • Support other file types, e.g. GGUF
  • Read metadata from local files when they exist, instead of fetching from the Hub every time
  • Add more flags to support estimates that account for quantization, extended context lengths, any added memory overhead, etc.

License

This project is licensed under either of the following licenses, at your option:

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in this project by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.
