| Crates.io | hyprwhspr-rs |
| lib.rs | hyprwhspr-rs |
| version | 0.3.15 |
| created_at | 2025-10-21 17:36:13.82949+00 |
| updated_at | 2026-01-22 07:42:11.957105+00 |
| description | Native speech-to-text voice dictation for Hyprland (Rust implementation) |
| homepage | https://github.com/better-slop/hyprwhspr-rs |
| repository | https://github.com/better-slop/hyprwhspr-rs |
| max_upload_size | |
| id | 1894189 |
| size | 561,095 |
Rust implementation of hyprwhspr | Blazing fast, native speech-to-text voice dictation for Hyprland and Omarchy | Waybar, Walker, and Elephant integrations (optional)
cargo install hyprwhspr-rs
https://github.com/user-attachments/assets/bbbaa1c3-1a7e-4165-ad3d-27b7465e201a
libudev-dev on Debian/Ubuntu)
./scripts/download-parakeet-tdt.sh to download model files (~1.2GB)
fast_vad.enabled) audio files, reducing inference costs while increasing output speed
HYPRLAND_INSTANCE_SIGNATURE and opens the IPC socket at $XDG_RUNTIME_DIR/hypr/<signature>/.socket.sock.
dispatch sendshortcut commands against the active window to paste dictated text, inspecting activewindow to decide when Shift is required for a hardcoded list of programs.
cargo install hyprwhspr-rs
--with-elephant flag)
# Interactive install
hyprwhspr-rs install
# Optionally, install specific components (systemd, waybar, elephant)
hyprwhspr-rs install {--all | --service | --waybar | --elephant} {--force | -f}
git clone https://github.com/better-slop/hyprwhspr-rs.git
cd hyprwhspr-rs
cargo build --release
sudo cp target/release/hyprwhspr-rs /usr/local/bin/
./scripts/install-waybar.sh
Installs systemd service, Waybar module, and CSS styles. Shows mic status in your bar.
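The Hyprland integration described earlier locates its IPC socket from `HYPRLAND_INSTANCE_SIGNATURE` and `$XDG_RUNTIME_DIR`, and decides per window whether Shift is needed when pasting. The sketch below illustrates both steps using only the standard library. The function names `hyprland_socket_path` and `needs_shift`, the case-insensitive class comparison, and the example class `kitty` are assumptions made for illustration, not the crate's actual code.

```rust
use std::env;
use std::path::PathBuf;

/// Builds `$XDG_RUNTIME_DIR/hypr/<signature>/.socket.sock` from its two parts.
fn hyprland_socket_path(runtime_dir: &str, signature: &str) -> PathBuf {
    PathBuf::from(runtime_dir)
        .join("hypr")
        .join(signature)
        .join(".socket.sock")
}

/// Returns true when the paste shortcut should include Shift.
/// Assumption: window classes are compared case-insensitively.
fn needs_shift(window_class: &str, shift_classes: &[&str], force_shift: bool) -> bool {
    force_shift
        || shift_classes
            .iter()
            .any(|class| class.eq_ignore_ascii_case(window_class))
}

fn main() {
    // Both variables are set by the session when Hyprland is running.
    let runtime_dir = env::var("XDG_RUNTIME_DIR");
    let signature = env::var("HYPRLAND_INSTANCE_SIGNATURE");

    match (runtime_dir, signature) {
        (Ok(dir), Ok(sig)) => {
            println!("socket: {}", hyprland_socket_path(&dir, &sig).display());
        }
        _ => eprintln!("Hyprland environment variables are not set"),
    }

    // "kitty" is only an example value, not the crate's hardcoded list.
    println!("shift needed: {}", needs_shift("kitty", &["kitty"], false));
}
```

Keeping the path construction and the Shift decision as pure functions makes them easy to test without a running compositor.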
git clone https://github.com/better-slop/hyprwhspr-rs.git
cd hyprwhspr-rs
cargo build --release
RUST_LOG=debug ./target/release/hyprwhspr-rs
./target/release/hyprwhspr-rs
Configure in ~/.config/hyprwhspr-rs/config.jsonc
{
  "shortcuts": {
    "press": "SUPER+ALT+D",
    "hold": "SUPER+ALT+CTRL",
  },
  "word_overrides": {
    "under score": "_",
    "em dash": "—",
    "equal": "=",
    "at sign": "@",
    "pound": "#",
    "hashtag": "#",
    "hash tag": "#",
    "newline": "\n",
    "Omarkey": "Omarchy",
    "dot": ".",
    "Hyperland": "hyprland",
    "hyperland": "hyprland",
  },
  "audio_feedback": true, // Play start/stop sounds while recording
  "start_sound_volume": 0.1, // 0.1 - 1.0
  "stop_sound_volume": 0.1, // 0.1 - 1.0
  "start_sound_path": null, // Optional custom audio asset overrides
  "stop_sound_path": null, // Optional custom audio asset overrides
  "auto_copy_clipboard": true, // Automatically copy the final transcription to the clipboard
  "shift_paste": false, // Whether to force shift paste
  "global_paste_shortcut": false, // Enable the compositor-level paste shortcut (Omarchy's addition)
  "paste_hints": {
    "shift": [
      // Optional list of Hyprland window classes that should always paste with Ctrl+Shift+V
    ],
  },
  "audio_device": null, // Force a specific input device index (null uses system default)
  "fast_vad": {
    "enabled": false, // Enable Earshot fast VAD trimming
    "profile": "aggressive", // quality | low_bitrate | aggressive | very_aggressive (lowercase only, serde-enforced; default aggressive)
    "min_speech_ms": 120, // Minimum detected speech before keeping a segment
    "silence_timeout_ms": 500, // Drop silence longer than this (ms)
    "pre_roll_ms": 120, // Audio to keep before speech to avoid clipping words
    "post_roll_ms": 150, // Audio to keep after speech before trimming
    "volatility_window": 24, // Frames observed for adaptive aggressiveness (30 ms per frame, matches FRAME_MS in src/audio/vad.rs)
    "volatility_increase_threshold": 0.35, // Bump profile when toggles exceed this ratio
    "volatility_decrease_threshold": 0.12, // Relax profile when toggles stay below this ratio
  },
  "transcription": {
    "provider": "whisper_cpp", // whisper_cpp | groq | gemini | parakeet
    "request_timeout_secs": 45,
    "max_retries": 2,
    "whisper_cpp": {
      "prompt": "Transcribe as technical documentation with proper capitalization, acronyms, and technical terminology. Do not add punctuation.",
      "model": "large-v3-turbo-q8_0", // Whisper model to use (must exist in specified directories)
      "threads": 4, // CPU threads dedicated to whisper.cpp
      "gpu_layers": 999, // Number of layers to keep on GPU (999 = auto/GPU preferred)
      "fallback_cli": false, // Fallback to whisper-cli (uses CPU)
      "no_speech_threshold": 0.6, // Whisper's "no speech" confidence gate
      "models_dirs": ["~/.config/hyprwhspr-rs/models"], // Directories to search for models
      "vad": {
        "enabled": false, // Toggle whisper-cli's native Silero VAD
        "model": "ggml-silero-v5.1.2.bin", // Path or filename for the ggml Silero VAD model
        // Probability threshold for deciding a frame is speech. Higher = fewer false positives, but may miss quiet speech.
        "threshold": 0.5,
        // Minimum contiguous speech duration (ms) to accept. Increase to ignore quick clicks/taps.
        "min_speech_ms": 250,
        // Minimum silence gap (ms) required to end a speech segment. Raise if mid-sentence pauses are being split.
        "min_silence_ms": 120,
        // Maximum speech duration (seconds) before forcing a cut. Use null (or omit) to leave unlimited.
        "max_speech_s": 15.0,
        // Extra padding (ms) added before/after detected speech so words aren't clipped.
        "speech_pad_ms": 80,
        // Overlap ratio between segments. Higher overlap helps smooth transitions at the cost of a little extra decode time.
        "samples_overlap": 0.1,
      },
    },
    "groq": {
      "model": "whisper-large-v3-turbo",
      "endpoint": "https://api.groq.com/openai/v1/audio/transcriptions",
      "prompt": "Transcribe as technical documentation with proper capitalization, acronyms, and technical terminology. Do not add punctuation.",
    },
    "gemini": {
      "model": "gemini-2.5-flash-preview-09-2025",
      "endpoint": "https://generativelanguage.googleapis.com/v1beta/models",
      "temperature": 0.0,
      "max_output_tokens": 1024,
      "prompt": "Transcribe as technical documentation with proper capitalization, acronyms, and technical terminology. Do not add punctuation.",
    },
    "parakeet": {
      "model_dir": "models/parakeet/parakeet-tdt-0.6b-v3-onnx", // Relative to $XDG_DATA_HOME/hyprwhspr-rs (or ~/.local/share/hyprwhspr-rs)
      "prompt": "Transcribe as technical documentation with proper capitalization, acronyms, and technical terminology. Do not add punctuation.",
    },
  },
}
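To make the `word_overrides` map concrete, here is a minimal sketch of how spoken phrases could be rewritten after transcription. It is not the crate's implementation: the function name `apply_overrides`, the whitespace tokenization, the case-sensitive matching, and the longest-phrase-first rule are all assumptions chosen to keep the example small.

```rust
/// Replaces whole-word phrases in `text` using `(spoken, written)` pairs.
/// Assumptions: words are separated by whitespace, matching is
/// case-sensitive, and longer phrases win over shorter ones.
fn apply_overrides(text: &str, overrides: &[(&str, &str)]) -> String {
    let words: Vec<&str> = text.split_whitespace().collect();

    // Pre-split each spoken phrase into words and try longer phrases first.
    let mut rules: Vec<(Vec<&str>, &str)> = overrides
        .iter()
        .map(|(spoken, written)| (spoken.split_whitespace().collect(), *written))
        .collect();
    rules.sort_by(|a, b| b.0.len().cmp(&a.0.len()));

    let mut out: Vec<String> = Vec::new();
    let mut i = 0;
    while i < words.len() {
        // Find the first rule whose phrase matches the words starting at `i`.
        let hit = rules.iter().find(|(phrase, _)| {
            !phrase.is_empty()
                && words.len() - i >= phrase.len()
                && words[i..i + phrase.len()] == phrase[..]
        });
        match hit {
            Some((phrase, written)) => {
                out.push(written.to_string());
                i += phrase.len();
            }
            None => {
                out.push(words[i].to_string());
                i += 1;
            }
        }
    }
    out.join(" ")
}

fn main() {
    let overrides = [
        ("hash tag", "#"),
        ("Hyperland", "hyprland"),
        ("Omarkey", "Omarchy"),
    ];
    println!("{}", apply_overrides("Hyperland on Omarkey hash tag linux", &overrides));
}
```

Matching on whole words avoids rewriting the `dot` inside a longer word, at the cost of missing phrases that have punctuation attached. A real implementation would likely need to handle that case.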
The default build ships with the earshot VoiceActivityDetector baked in. Toggle fast_vad.enabled in your config to trim silence before any provider (whisper.cpp, Groq, Gemini) sees the audio. This is especially useful for lowering costs and increasing speed.
All other fields in the fast_vad block map directly to the trimmer’s behaviour, so you can tune aggressiveness without
recompiling.
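As an illustration of the volatility settings, the sketch below counts how often consecutive frame decisions flip between speech and silence inside one window (24 frames at 30 ms is 720 ms) and steps the profile up or down against the two thresholds. The names `Profile`, `toggle_ratio`, and `adapt`, the way the ratio is normalised, and the one-step-at-a-time adjustment are assumptions for illustration. The trimmer in `src/audio/vad.rs` may work differently.

```rust
/// VAD profiles, ordered from least to most aggressive (assumed ordering).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Profile {
    Quality,
    LowBitrate,
    Aggressive,
    VeryAggressive,
}

impl Profile {
    /// One step more aggressive, saturating at the top.
    fn stricter(self) -> Self {
        match self {
            Profile::Quality => Profile::LowBitrate,
            Profile::LowBitrate => Profile::Aggressive,
            _ => Profile::VeryAggressive,
        }
    }

    /// One step less aggressive, saturating at the bottom.
    fn looser(self) -> Self {
        match self {
            Profile::VeryAggressive => Profile::Aggressive,
            Profile::Aggressive => Profile::LowBitrate,
            _ => Profile::Quality,
        }
    }
}

/// Fraction of adjacent frame pairs whose speech/silence decision differs.
fn toggle_ratio(decisions: &[bool]) -> f32 {
    if decisions.len() < 2 {
        return 0.0;
    }
    let toggles = decisions.windows(2).filter(|pair| pair[0] != pair[1]).count();
    toggles as f32 / (decisions.len() - 1) as f32
}

/// Bumps the profile when the window is volatile, relaxes it when it is calm.
fn adapt(current: Profile, decisions: &[bool], increase: f32, decrease: f32) -> Profile {
    let ratio = toggle_ratio(decisions);
    if ratio > increase {
        current.stricter()
    } else if ratio < decrease {
        current.looser()
    } else {
        current
    }
}

fn main() {
    // A window that flips on every frame versus one that is steady speech.
    let noisy: Vec<bool> = (0..24).map(|i| i % 2 == 0).collect();
    let steady = vec![true; 24];
    println!("{:?}", adapt(Profile::Aggressive, &noisy, 0.35, 0.12));
    println!("{:?}", adapt(Profile::Aggressive, &steady, 0.35, 0.12));
}
```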
Runtime builds rely on a local whisper.cpp installation, so validate that dependency before shipping a tagged version.
fix: bumps patch, feat: bumps minor, and type!: indicates a breaking change (major bump).
main, the release-plz workflow runs release-pr to open or refresh a release-plz-* pull request. Review the proposed version and changelog there.
release-plz release, tagging (vX.Y.Z) and publishing the crate to crates.io if it’s a stable tag.
release workflow, which builds the Linux GNU binary, uploads the tarball + checksum, and publishes the GitHub release with the full commit list (plus PR links when available).
Define CARGO_REGISTRY_TOKEN in the repository secrets with publish-only permissions so the workflow can push stable releases to crates.io.
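The commit-prefix rules above can be sketched as a small classifier. This is only an illustration of the convention, not how release-plz computes versions. The names `Bump`, `bump_for`, and `release_bump` are hypothetical, and optional scopes such as `feat(audio):` are handled on the assumption that the repository uses them.

```rust
/// Version bump implied by a commit, ordered so that `max` picks the largest.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Bump {
    NoRelease,
    Patch,
    Minor,
    Major,
}

/// Classifies one commit subject line by its conventional-commit prefix.
fn bump_for(subject: &str) -> Bump {
    // Everything before the first ':' is the type, with optional scope and '!'.
    let Some((prefix, _description)) = subject.split_once(':') else {
        return Bump::NoRelease;
    };
    let prefix = prefix.trim();

    // `type!:` marks a breaking change regardless of the type.
    if prefix.ends_with('!') {
        return Bump::Major;
    }

    // Drop an optional scope, e.g. "feat(audio)" becomes "feat".
    let kind = prefix.split('(').next().unwrap_or(prefix);
    match kind {
        "feat" => Bump::Minor,
        "fix" => Bump::Patch,
        _ => Bump::NoRelease,
    }
}

/// The bump for a release is the largest bump among its commits.
fn release_bump(subjects: &[&str]) -> Bump {
    subjects
        .iter()
        .map(|subject| bump_for(subject))
        .max()
        .unwrap_or(Bump::NoRelease)
}

fn main() {
    let commits = ["fix: handle missing socket", "feat: add parakeet provider", "docs: update readme"];
    println!("{:?}", release_bump(&commits));
}
```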