| Crates.io | scribble |
| lib.rs | scribble |
| version | 0.5.3 |
| created_at | 2020-06-01 21:21:58.026817+00 |
| updated_at | 2026-01-20 01:27:54.6236+00 |
| description | High-level Rust API for audio transcription using Whisper |
| homepage | |
| repository | https://github.com/itsmontoya/scribble |
| max_upload_size | |
| id | 248833 |
| size | 1,963,642 |
Scribble is a fast, lightweight transcription engine written in Rust, with a built-in Whisper backend and a backend trait for custom implementations.

Scribble will demux/decode audio or video containers (MP4, MP3, WAV, FLAC, OGG, WebM, MKV, etc.), downmix to mono, and resample to 16 kHz — no preprocessing required.
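The normalization step can be pictured with a tiny sketch: average interleaved channels down to mono, then resample to 16 kHz. The free functions below are illustrative only (they are not Scribble's API or its actual implementation):

```rust
/// Average interleaved multi-channel samples down to mono
/// (illustrative only; not Scribble's actual implementation).
fn downmix_to_mono(interleaved: &[f32], channels: usize) -> Vec<f32> {
    interleaved
        .chunks_exact(channels)
        .map(|frame| frame.iter().sum::<f32>() / channels as f32)
        .collect()
}

/// Naive linear-interpolation resampler from `src_rate` to `dst_rate`.
fn resample(samples: &[f32], src_rate: u32, dst_rate: u32) -> Vec<f32> {
    let out_len = (samples.len() as u64 * dst_rate as u64 / src_rate as u64) as usize;
    (0..out_len)
        .map(|i| {
            // Fractional position of this output sample in the source buffer.
            let pos = i as f64 * src_rate as f64 / dst_rate as f64;
            let idx = pos as usize;
            let frac = (pos - idx as f64) as f32;
            let a = samples[idx];
            let b = *samples.get(idx + 1).unwrap_or(&a);
            a + (b - a) * frac
        })
        .collect()
}

fn main() {
    // Two stereo frames: (0.0, 1.0) and (0.5, 0.5) -> mono [0.5, 0.5].
    let mono = downmix_to_mono(&[0.0, 1.0, 0.5, 0.5], 2);
    println!("{mono:?}");
    // Halving the rate (32 kHz -> 16 kHz) keeps every other sample here.
    let down = resample(&[0.0, 1.0, 2.0, 3.0], 32_000, 16_000);
    println!("{down:?}");
}
```

In practice Scribble handles this internally via its decoder, so callers never need to pre-convert their input.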
Scribble is built with streaming and real-time transcription in mind, even though today it operates on static files.
Scribble targets Rust stable (tracked via rust-toolchain.toml).
Clone the repository and build the binaries:
./scripts/build-all.sh
Or build a single binary to a target directory:
./scripts/build.sh scribble-cli ./dist
This will produce the following binaries:
- scribble-cli — transcribe audio/video (decodes + normalizes to mono 16 kHz)
- scribble-server — HTTP server for transcription
- model-downloader — download Whisper and VAD models

Scribble exposes whisper-rs GPU backend features as Cargo features. Enable the
backend you want in Cargo.toml or via --features:
[dependencies]
scribble = { version = "0.5", features = ["cuda"] }
cargo run --features "bin-scribble-cli,cuda" --bin scribble-cli -- \
--model ./models/ggml-large-v3-turbo.bin \
--input ./input.wav
Available GPU feature flags:
- cuda (NVIDIA CUDA)
- metal (Apple Metal)
- hipblas (AMD ROCm)
- vulkan (Vulkan)
- coreml (Apple CoreML)

These are passthrough features; you still need the corresponding system dependencies installed for your platform. See the whisper-rs documentation for backend setup details.
model-downloader is a small helper CLI for downloading known-good Whisper and Whisper-VAD models into a local directory.
cargo run --features bin-model-downloader --bin model-downloader -- --list
Example output:
Whisper models:
- tiny
- base.en
- large-v3-turbo
- large-v3-turbo-q8_0
...
VAD models:
- silero-v5.1.2
- silero-v6.2.0
cargo run --features bin-model-downloader --bin model-downloader -- --name large-v3-turbo
By default, models are downloaded into ./models.
cargo run --features bin-model-downloader --bin model-downloader -- \
--name silero-v6.2.0 \
--dir /opt/scribble/models
Downloads are performed safely: each file is first written to a temporary *.part file before being moved into place.

scribble-cli is the main transcription CLI.
It accepts audio or video containers and normalizes them to Whisper’s required mono 16 kHz internally. Provide:
- --input with a file path (or - to stream from stdin)
- --vad-model (required when --enable-vad is set)

cargo run --features bin-scribble-cli --bin scribble-cli -- \
--model ./models/ggml-large-v3-turbo.bin \
--vad-model ./models/ggml-silero-v6.2.0.bin \
--input ./input.mp4
Output is written to stdout in WebVTT format by default.
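For reference, WebVTT output has the standard shape below (the timings and text are illustrative, not actual Scribble output):

```
WEBVTT

00:00:00.000 --> 00:00:02.500
Hello and welcome.

00:00:02.500 --> 00:00:05.000
This is a transcribed cue.
```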
scribble-cli (via ffmpeg)

If you have a live audio stream URL (MP3/AAC/etc.), you can decode it to Whisper-friendly WAV and pipe it into scribble-cli via stdin:
ffmpeg -re -loglevel error -nostats \
-i "https://stream.example.com/live.mp3?session-id=REDACTED" \
-f wav -ac 1 -ar 16000 - \
| scribble-cli \
--model ./models/ggml-tiny.bin \
--vad-model ./models/ggml-silero-v6.2.0.bin \
--enable-vad \
--input -
scribble-cli (via streamlink + ffmpeg)

If you have streamlink installed, you can pull a Twitch stream to stdout and feed it through ffmpeg:
streamlink --stdout https://www.twitch.tv/dougdoug best \
| ffmpeg -hide_banner -loglevel error -i pipe:0 -vn -ac 1 -ar 16000 -f wav pipe:1 \
| scribble-cli \
--model ./models/ggml-tiny.bin \
--vad-model ./models/ggml-silero-v6.2.0.bin \
--enable-vad \
--input -
scribble-server is a long-running HTTP server that loads models once and accepts transcription requests over HTTP.
cargo run --features bin-scribble-server --bin scribble-server -- \
--model ./models/ggml-large-v3-turbo.bin \
--vad-model ./models/ggml-silero-v6.2.0.bin \
--host 127.0.0.1 \
--port 8080
curl -sS --data-binary @./input.mp4 \
"http://127.0.0.1:8080/transcribe?output=vtt" \
> transcript.vtt
For JSON output:
curl -sS --data-binary @./input.wav \
"http://127.0.0.1:8080/transcribe?output=json" \
> transcript.json
Example using all query params:
curl -sS --data-binary @./input.mp4 \
"http://127.0.0.1:8080/transcribe?output=json&output_type=json&model_key=ggml-large-v3-turbo.bin&enable_vad=true&translate_to_english=true&language=en" \
> transcript.json
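The query string above can also be assembled programmatically. A minimal standard-library sketch (the base URL and parameter names mirror the curl example; `transcribe_url` itself is hypothetical, not part of Scribble):

```rust
/// Build a /transcribe URL from (key, value) query parameters.
/// Sketch only: percent-encoding is omitted, so values must be URL-safe.
fn transcribe_url(base: &str, params: &[(&str, &str)]) -> String {
    let query: Vec<String> = params
        .iter()
        .map(|(k, v)| format!("{k}={v}"))
        .collect();
    format!("{base}/transcribe?{}", query.join("&"))
}

fn main() {
    let url = transcribe_url(
        "http://127.0.0.1:8080",
        &[
            ("output", "json"),
            ("model_key", "ggml-large-v3-turbo.bin"),
            ("enable_vad", "true"),
            ("language", "en"),
        ],
    );
    println!("{url}");
}
```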
scribble-server exposes Prometheus metrics at GET /metrics.
curl -sS "http://127.0.0.1:8080/metrics"
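Individual values can be pulled out of the Prometheus text exposition with a small parser. A sketch (the sample payload is illustrative, not actual server output, and `metric_value` is a hypothetical helper):

```rust
/// Return the value of the first sample line whose metric name (plus any
/// labels) starts with `name`, skipping `#` comment lines.
fn metric_value(exposition: &str, name: &str) -> Option<f64> {
    exposition
        .lines()
        .filter(|line| !line.starts_with('#'))
        .find(|line| line.starts_with(name))
        .and_then(|line| line.rsplit(' ').next())
        .and_then(|v| v.parse().ok())
}

fn main() {
    // Illustrative payload in Prometheus text format.
    let body = "\
# HELP scribble_http_requests_total Total HTTP requests.
# TYPE scribble_http_requests_total counter
scribble_http_requests_total{status=\"200\"} 42
scribble_http_in_flight_requests 1
";
    println!("{:?}", metric_value(body, "scribble_http_requests_total"));
    println!("{:?}", metric_value(body, "scribble_http_in_flight_requests"));
}
```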
Key metrics:
- scribble_http_requests_total (labels: status)
- scribble_http_request_duration_seconds (labels: status)
- scribble_http_in_flight_requests

All binaries emit structured JSON logs to stderr.
- The default log level is error.
- Override it with SCRIBBLE_LOG (e.g. SCRIBBLE_LOG=info).

cargo run --features bin-scribble-cli --bin scribble-cli -- \
--model ./models/ggml-large-v3-turbo.bin \
--vad-model ./models/ggml-silero-v6.2.0.bin \
--input ./input.wav \
--output-type json
cargo run --features bin-scribble-cli --bin scribble-cli -- \
--model ./models/ggml-large-v3-turbo.bin \
--vad-model ./models/ggml-silero-v6.2.0.bin \
--enable-vad \
--input ./input.wav
When VAD is enabled, the VAD model is used to detect speech activity so non-speech audio can be skipped.
cargo run --features bin-scribble-cli --bin scribble-cli -- \
--model ./models/ggml-large-v3-turbo.bin \
--vad-model ./models/ggml-silero-v6.2.0.bin \
--input ./input.wav \
--language en
If --language is omitted, Whisper will auto-detect the language.
cargo run --features bin-scribble-cli --bin scribble-cli -- \
--model ./models/ggml-large-v3-turbo.bin \
--vad-model ./models/ggml-silero-v6.2.0.bin \
--input ./input.wav \
--output-type vtt \
> transcript.vtt
Scribble is also designed to be embedded as a library.
High-level usage looks like:
use scribble::{Opts, OutputType, Scribble};
use std::fs::File;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Load the Whisper model(s) and the VAD model.
    let mut scribble = Scribble::new(
        ["./models/ggml-large-v3-turbo.bin"],
        "./models/ggml-silero-v6.2.0.bin",
    )?;

    let mut input = File::open("audio.wav")?;
    let mut output = Vec::new();

    let opts = Opts {
        model_key: None,
        enable_translate_to_english: false,
        enable_voice_activity_detection: true,
        language: None,
        output_type: OutputType::Json,
        incremental_min_window_seconds: 1,
    };

    // Transcribe from any reader into any writer.
    scribble.transcribe(&mut input, &mut output, &opts)?;

    let json = String::from_utf8(output)?;
    println!("{json}");
    Ok(())
}
This project uses cargo-llvm-cov for coverage locally and in CI.
One-time setup:
rustup component add llvm-tools-preview
cargo install cargo-llvm-cov
Run coverage locally:
# Print a summary to stdout
cargo llvm-cov --features bin-scribble-cli,bin-model-downloader,bin-scribble-server --all-targets
# Generate an HTML report (writes to ./target/llvm-cov/html)
cargo llvm-cov --features bin-scribble-cli,bin-model-downloader,bin-scribble-server --all-targets --html
Scribble is under active development. The API is not yet stable, but the foundations are in place and evolving quickly.
Release notes live in CHANGELOG.md (and GitHub Releases).
See STYLEGUIDE.md for code style, verification conventions, and repo-level checklists.
MIT
