howfast

Crates.io	howfast
lib.rs	howfast
version	0.1.1
created_at	2025-09-23 18:55:55.994375+00
updated_at	2025-09-23 19:03:19.581738+00
description	Small CLI tool that measures token metrics and completion tokens per second for an Ollama model response
homepage	https://github.com/spinualexandru/howfast
repository	https://github.com/spinualexandru/howfast
id	1851971
size	93,087

Alex Spinu (spinualexandru)

howfast

A small CLI tool that measures token metrics (prompt tokens, completion tokens, total tokens, and tokens-per-second) for an Ollama model response. It queries an Ollama server, then prints a nicely formatted, colored summary. The actual model text response is hidden by default; pass --with-response to show it.

Installation

cargo install howfast

Features

Call an Ollama model and collect token metrics.
Report completion tokens per second.
Nicely formatted, colored terminal output (no box/borders).
Response hidden by default; opt-in with --with-response.
Uses Tokio runtime (required by ollama-rs).
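
As a rough illustration of how these metrics relate to each other, here is a minimal sketch. It assumes the numbers come from the prompt_eval_count, eval_count, and eval_duration (nanoseconds) fields that the Ollama API reports with a finished response. The TokenMetrics struct and the token_metrics function are hypothetical names used only for this example; they are not taken from the howfast source, and the tool's actual calculation may differ.

```rust
/// Token metrics derived from a single Ollama response.
/// Hypothetical type for illustration; not part of howfast's API.
#[derive(Debug, PartialEq)]
pub struct TokenMetrics {
    pub prompt_tokens: u64,
    pub completion_tokens: u64,
    pub total_tokens: u64,
    pub tokens_per_second: f64,
}

/// Builds the metrics from the counters Ollama returns.
///
/// Assumption: `eval_duration_ns` is the generation time in nanoseconds,
/// as in Ollama's `eval_duration` field.
pub fn token_metrics(
    prompt_eval_count: u64,
    eval_count: u64,
    eval_duration_ns: u64,
) -> TokenMetrics {
    // Guard against division by zero when no generation time was reported.
    let tokens_per_second = if eval_duration_ns == 0 {
        0.0
    } else {
        // Convert nanoseconds to seconds, then divide tokens by seconds.
        eval_count as f64 / (eval_duration_ns as f64 / 1e9)
    };

    TokenMetrics {
        prompt_tokens: prompt_eval_count,
        completion_tokens: eval_count,
        total_tokens: prompt_eval_count + eval_count,
        tokens_per_second,
    }
}
```

For example, 100 completion tokens generated in 2 seconds (2_000_000_000 ns) would give 50 tokens per second under this definition.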

Requirements

Rust (1.70+ recommended)
Cargo
An Ollama server reachable from your machine (default localhost:11434)

Environment

OLLAMA_HOST (optional) — host address of the Ollama server (defaults to localhost). Do not include the port; the program always uses port 11434.

Example:

# Set once for the current shell session if your Ollama server isn't on localhost
export OLLAMA_HOST=192.168.1.100
howfast gemma3:4b "Tell me a joke"

# Or set it for a single invocation
OLLAMA_HOST=192.168.1.100 howfast gemma3:4b "Tell me a joke"
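
The host resolution described above could be sketched roughly as follows. This is an assumption-laden illustration, not the crate's actual code: the function names ollama_base_url and ollama_base_url_from_env are hypothetical, and the http:// scheme is assumed. Only the OLLAMA_HOST variable, the localhost default, and the fixed port 11434 come from this README.

```rust
use std::env;

/// Port the tool always uses, per the README.
const OLLAMA_PORT: u16 = 11434;

/// Builds the server URL from an optional host value.
/// Falls back to "localhost" when the host is missing or blank.
/// Assumption: the server is reached over plain HTTP.
pub fn ollama_base_url(host: Option<&str>) -> String {
    let host = host
        .map(str::trim)
        .filter(|h| !h.is_empty())
        .unwrap_or("localhost");
    format!("http://{}:{}", host, OLLAMA_PORT)
}

/// Reads OLLAMA_HOST from the environment and builds the URL.
pub fn ollama_base_url_from_env() -> String {
    // `ok()` turns a missing or non-UTF-8 variable into `None`.
    let host = env::var("OLLAMA_HOST").ok();
    ollama_base_url(host.as_deref())
}
```

Keeping the pure function separate from the environment lookup makes the default and the fixed port easy to check without touching process state.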

Commit count: 6
