| Crates.io | howfast |
| lib.rs | howfast |
| version | 0.1.1 |
| created_at | 2025-09-23 18:55:55.994375+00 |
| updated_at | 2025-09-23 19:03:19.581738+00 |
| description | Small CLI tool that measures token metrics and completion tokens per second for an Ollama model response |
| homepage | https://github.com/spinualexandru/howfast |
| repository | https://github.com/spinualexandru/howfast |
| max_upload_size | |
| id | 1851971 |
| size | 93,087 |
A small CLI tool that measures token metrics (prompt tokens, completion tokens, total tokens, and tokens-per-second) for an Ollama model response. It queries an Ollama server, then prints a nicely formatted, colored summary. The actual model text response is hidden by default; pass --with-response to show it.
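The metrics listed above are related by simple arithmetic, sketched below in Rust. This is an illustration only, not howfast's actual source: the `TokenMetrics` struct and its field names are hypothetical. It assumes the generation time is available in nanoseconds, which is how Ollama's API reports its `eval_duration` field.

```rust
/// Hypothetical container for the counters reported with a model response.
/// The names are illustrative and not taken from howfast itself.
#[derive(Debug, Clone, Copy)]
struct TokenMetrics {
    /// Tokens in the prompt.
    prompt_tokens: u64,
    /// Tokens generated by the model.
    completion_tokens: u64,
    /// Time spent generating the completion, in nanoseconds (assumed unit).
    eval_duration_ns: u64,
}

impl TokenMetrics {
    /// Total tokens = prompt tokens + completion tokens.
    fn total_tokens(&self) -> u64 {
        self.prompt_tokens + self.completion_tokens
    }

    /// Completion tokens per second, or `None` when no duration was
    /// reported (avoids dividing by zero).
    fn tokens_per_second(&self) -> Option<f64> {
        if self.eval_duration_ns == 0 {
            return None;
        }
        let seconds = self.eval_duration_ns as f64 / 1_000_000_000.0;
        Some(self.completion_tokens as f64 / seconds)
    }
}

fn main() {
    // Made-up sample values, chosen only to exercise the arithmetic.
    let m = TokenMetrics {
        prompt_tokens: 12,
        completion_tokens: 150,
        eval_duration_ns: 3_000_000_000,
    };
    println!("total tokens: {}", m.total_tokens());
    match m.tokens_per_second() {
        Some(tps) => println!("tokens/s: {:.2}", tps),
        None => println!("tokens/s: n/a"),
    }
}
```

Returning an `Option` keeps the zero-duration case explicit, so the caller decides how to display it rather than receiving a division-by-zero result.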

Install from crates.io with Cargo:

cargo install howfast
... --with-response.
... ollama-rs).
... localhost:11434)

OLLAMA_HOST (optional): host address of the Ollama server (defaults to localhost). Do not include the port; the program always uses port 11434.

Example:
# set before running, if your Ollama server isn't on localhost
export OLLAMA_HOST=192.168.1.100
or set it for a single invocation:
OLLAMA_HOST=192.168.1.100 howfast gemma3:4b "Tell me a joke"
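
The host rule described above (optional variable, default of localhost, fixed port 11434) could be resolved along these lines. This is a minimal sketch, not howfast's actual code: the `base_url` helper, the `OLLAMA_PORT` constant, and the `http://` scheme are assumptions made for illustration.

```rust
use std::env;

/// Fixed port, per the README: the program always uses 11434.
const OLLAMA_PORT: u16 = 11434;

/// Hypothetical helper: builds the server address from an optional host.
/// Falls back to "localhost" when the host is missing or blank.
/// The "http://" scheme is an assumption for this sketch.
fn base_url(host: Option<&str>) -> String {
    let host = host
        .map(str::trim)
        .filter(|h| !h.is_empty())
        .unwrap_or("localhost");
    format!("http://{}:{}", host, OLLAMA_PORT)
}

fn main() {
    // Read OLLAMA_HOST if set; an unset variable yields None.
    let from_env = env::var("OLLAMA_HOST").ok();
    println!("{}", base_url(from_env.as_deref()));
}
```

Because the port is always appended, a value such as `192.168.1.100:11434` in `OLLAMA_HOST` would produce a malformed address in this sketch, which matches the README's instruction to leave the port out.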