| Crates.io | whispr |
| lib.rs | whispr |
| version | 0.2.0 |
| created_at | 2025-12-14 19:34:42.294848+00 |
| updated_at | 2025-12-15 00:41:41.811702+00 |
| description | A general-purpose voice <-> text crate — text-to-speech, speech-to-text, and audio-to-audio transformations. Also supports realtime conversations. |
| homepage | |
| repository | https://github.com/iamnbutler/whispr |
| max_upload_size | |
| id | 1984991 |
| size | 217,463 |
A general-purpose voice <-> text crate — text-to-speech, speech-to-text, and audio-to-audio transformations. Also supports realtime conversations.
Whispr provides a clean, ergonomic API for working with audio AI services. It's designed to be provider-agnostic, though only OpenAI is currently implemented.
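"Provider-agnostic" in Rust is commonly expressed as a trait over backends. The sketch below is purely an illustration of that design idea — none of these names are whispr's actual internals, and the OpenAI backend is stubbed so the example runs anywhere:

```rust
// Illustration only: one common shape for a provider-agnostic client.
// These are NOT whispr's real types; OpenAI is simply the first backend.
trait SpeechProvider {
    fn synthesize(&self, text: &str) -> Vec<u8>;
}

struct OpenAiProvider;

impl SpeechProvider for OpenAiProvider {
    fn synthesize(&self, text: &str) -> Vec<u8> {
        // A real backend would call an HTTP API; this stub just echoes
        // the text bytes so the sketch stays runnable.
        text.as_bytes().to_vec()
    }
}

fn speak(provider: &dyn SpeechProvider, text: &str) -> Vec<u8> {
    provider.synthesize(text) // callers never name a concrete backend
}

fn main() {
    let audio = speak(&OpenAiProvider, "Hello, world!");
    assert_eq!(audio, b"Hello, world!");
}
```

With this shape, adding a second provider later means one new impl, with no changes to calling code.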
[dependencies]
whispr = "0.2"
tokio = { version = "1", features = ["full"] }
use whispr::{Client, TtsModel, Voice};
#[tokio::main]
async fn main() -> Result<(), whispr::Error> {
let client = Client::from_env()?; // reads OPENAI_API_KEY
// Text to Speech
let audio = client
.speech()
.text("Hello, world!")
.voice(Voice::Nova)
.generate()
.await?;
std::fs::write("hello.mp3", &audio)?;
Ok(())
}
Convert text to natural-sounding audio with multiple voices and customization options.
use whispr::{Client, TtsModel, Voice, AudioFormat, prompts};
let client = Client::from_env()?;
let audio = client
.speech()
.text("Welcome to whispr!")
.voice(Voice::Nova)
.model(TtsModel::Gpt4oMiniTts)
.format(AudioFormat::Mp3)
.speed(1.0)
.instructions(prompts::FITNESS_COACH) // Voice personality (gpt-4o-mini-tts only)
.generate()
.await?;
std::fs::write("output.mp3", &audio)?;
Available Voices: Alloy, Ash, Ballad, Coral, Echo, Fable, Nova, Onyx, Sage, Shimmer, Verse
Available Models:
Gpt4oMiniTts — Latest model with instruction support
Tts1 — Optimized for speed
Tts1Hd — Optimized for quality
Transcribe audio files to text with optional language hints.
let result = client
.transcription()
.file("recording.mp3").await?
.language("en")
.transcribe()
.await?;
println!("Transcription: {}", result.text);
From bytes (useful for recorded audio):
let wav_data: Vec<u8> = record_audio(); // your own capture function
let result = client
.transcription()
.bytes(wav_data, "recording.wav")
.transcribe()
.await?;
Transcribe audio and generate new speech in one call — useful for voice transformation, translation, or processing pipelines.
let (transcription, audio) = client.audio_to_audio("input.mp3").await?;
println!("Said: {}", transcription.text);
std::fs::write("output.mp3", &audio)?;
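The convenience call above is equivalent to chaining the transcription and speech builders yourself. The control flow can be sketched with the client stubbed out — the placeholder functions below are not whispr's API; the real calls are the `client.transcription()...transcribe()` and `client.speech()...generate()` chains shown earlier:

```rust
// Stand-ins for the two builder chains; stubs keep the sketch runnable.
fn transcribe_stub(_audio: &[u8]) -> String {
    "hello from input.mp3".to_string()
}

fn synthesize_stub(text: &str) -> Vec<u8> {
    text.as_bytes().to_vec() // fake "audio" bytes
}

fn audio_to_audio_stub(input: &[u8]) -> (String, Vec<u8>) {
    let text = transcribe_stub(input);  // step 1: speech -> text
    let audio = synthesize_stub(&text); // step 2: text -> new speech
    (text, audio)
}

fn main() {
    let (text, audio) = audio_to_audio_stub(b"mp3 bytes");
    println!("Said: {}", text);
    assert!(!audio.is_empty());
}
```

Because the intermediate text is returned, you can edit or translate it between the two steps before synthesizing.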
For real-time applications, stream audio as it's generated:
use futures::StreamExt;
let mut stream = client
.speech()
.text("This is a longer text that will be streamed...")
.generate_stream()
.await?;
while let Some(chunk) = stream.next().await {
let bytes = chunk?;
// Process audio chunk in real-time
}
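The point of the loop above is that each chunk can be written out as soon as it arrives instead of buffering the whole clip. That pattern, with a plain iterator of `Result` chunks standing in for the async stream (`stream.next().await` becomes a `for` loop), looks like:

```rust
use std::io::Write;

// Append each chunk to `out` as it arrives; any chunk error aborts early,
// mirroring the `chunk?` in the async loop above.
fn drain(chunks: Vec<std::io::Result<Vec<u8>>>) -> std::io::Result<Vec<u8>> {
    let mut out: Vec<u8> = Vec::new(); // stands in for a File handle
    for chunk in chunks {
        out.write_all(&chunk?)?;
    }
    Ok(out)
}

fn main() -> std::io::Result<()> {
    let simulated = vec![Ok(vec![0x49, 0x44]), Ok(vec![0x33])];
    let bytes = drain(simulated)?;
    assert_eq!(bytes, vec![0x49, 0x44, 0x33]);
    Ok(())
}
```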
The prompts module includes pre-built voice personalities for common use cases:
use whispr::prompts;
client.speech()
.text("Let's get moving!")
.model(TtsModel::Gpt4oMiniTts)
.instructions(prompts::FITNESS_COACH)
.generate()
.await?;
MIT License — see LICENSE for details.