elevenlabs_stt

version: 0.0.5
created_at: 2025-08-29 11:11:53.14604+00
updated_at: 2025-09-01 16:54:51.411011+00
description: Type-safe Rust client for ElevenLabs Speech-To-Text API
repository: https://github.com/hamzaelmarjani/elevenlabs_stt
id: 1815705
size: 218,523
owner: Hamza El Marjani (hamzaelmarjani)

README

elevenlabs_stt

A type-safe, async Rust client for the ElevenLabs Speech-to-Text API. Transcribe audio and video files to text with a simple, ergonomic API.

Features

  • Type-safe & Async: Built with Rust's type system and async/await support
  • Builder Pattern: Intuitive, chainable API for configuring STT requests
  • Model Support: Full support for ElevenLabs models (models::elevenlabs_models::*)
  • Customizable: ElevenLabs STT API options, custom base URLs, and enterprise support
  • Tokio Ready: Works seamlessly with the Tokio runtime
  • Audio & Video: Works with audio and video files up to 3.0 GB

Also Check Out:

This project is part of a milestone to implement all ElevenLabs APIs in Rust.

Installation

Add this to your Cargo.toml:

[dependencies]
elevenlabs_stt = "0.0.5"

Quick Start

use elevenlabs_stt::{ElevenLabsSTTClient, STTResponse};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = ElevenLabsSTTClient::new("your-api-key");

    let file_path = "inputs/speech.mp3";
    let file_content = std::fs::read(file_path)?;

    let stt_response: STTResponse = client.speech_to_text(file_content).execute().await?;

    println!("Results: {:?}", stt_response);
    Ok(())
}

Examples

Basic Usage

use elevenlabs_stt::{ElevenLabsSTTClient, STTResponse};
use std::env;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let api_key =
        env::var("ELEVENLABS_API_KEY").expect("Please set ELEVENLABS_API_KEY environment variable");

    let client = ElevenLabsSTTClient::new(api_key);

    let file_path = "inputs/speech.mp3";
    let file_content = std::fs::read(file_path)?;

    let stt_response: STTResponse = client.speech_to_text(file_content).execute().await?;
    println!("Results: {:?}", stt_response);
    Ok(())
}

Advanced Configuration

use elevenlabs_stt::{ElevenLabsSTTClient, STTResponse, models};
use std::env;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let api_key =
        env::var("ELEVENLABS_API_KEY").expect("Please set ELEVENLABS_API_KEY environment variable");

    let client = ElevenLabsSTTClient::new(api_key);

    let file_path = "inputs/speech.mp3";
    let file_content = std::fs::read(file_path)?;

    let stt_response: STTResponse = client
        .speech_to_text(file_content)
        .model(models::elevenlabs_models::SCRIBE_V1)
        .language_code("en")
        .tag_audio_events(true)
        .timestamps_granularity("word")
        .diarize(true)
        .diarization_threshold(0.22)
        .webhook(false)
        .temperature(0.2)
        .seed(4000)
        .use_multi_channel(false)
        .execute()
        .await?;

    println!("Results: {:?}", stt_response);

    Ok(())
}

Running Examples

# Set your API key
export ELEVENLABS_API_KEY=your_api_key_here

# Run the basic example
cargo run --example basic_stt

# Run the advanced example
cargo run --example advanced_stt

API Overview

Method Description
ElevenLabsSTTClient::new(String) Create a client instance (required)*
.speech_to_text(Option<Vec<u8>>) Build an STT request from file bytes, or pass None and use cloud_storage_url (required)*
.model(String) Select the model (optional)
.language_code(String) Force the language for pronunciation/accent only; no translation (optional)
.tag_audio_events(bool) Tag audio events such as (laughter), (footsteps), etc. (optional)
.num_speakers(u32) Maximum number of speakers in the uploaded file (optional)
.timestamps_granularity(String) Allowed values: none, word, character; defaults to word (optional)
.diarize(bool) Annotate which speaker is talking at each point in the uploaded file (optional)
.diarization_threshold(f32) Can only be set when diarize is true and num_speakers is unset (optional)
.cloud_storage_url(String) URL of the file to transcribe; if unset, you must provide the file bytes (optional)
.webhook(bool) Send the transcription result to configured speech-to-text webhooks (optional)
.webhook_id(String) Specific webhook ID to send the transcription result to (optional)
.temperature(f32) Controls the randomness of the transcription output, between 0.0 and 2.0 (optional)
.seed(u32) Best-effort deterministic sampling with the given seed (optional)
.use_multi_channel(bool) Whether the audio file contains multiple channels (optional)
.webhook_metadata(String) Metadata to include in the webhook response (optional)
.execute() Run the request and transcribe the file (required)*
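
For example, here is a minimal sketch of transcribing a file hosted in cloud storage and forwarding the result to your configured webhooks, using only the builder methods listed above. It assumes .speech_to_text accepts None when .cloud_storage_url is provided, and the URL below is a placeholder:

use elevenlabs_stt::{ElevenLabsSTTClient, STTResponse};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = ElevenLabsSTTClient::new("your-api-key");

    // No file bytes are uploaded here; the API fetches the file from the URL.
    // The URL is a placeholder, not a real sample file.
    let stt_response: STTResponse = client
        .speech_to_text(None)
        .cloud_storage_url("https://example.com/recordings/meeting.mp3")
        .webhook(true)
        .execute()
        .await?;

    println!("Results: {:?}", stt_response);
    Ok(())
}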

Error Handling

The crate uses standard Rust error handling patterns. All async methods return Result types:

match client.speech_to_text(file).execute().await {
    Ok(result) => println!("Transcribed text from file: {}", result.text),
    Err(e) => eprintln!("STT transcription failed: {}", e),
}
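
When the call sits inside a helper function rather than main, the same errors can simply be propagated with ?. A small sketch, assuming the builder can be called through a shared reference to the client; the transcribe_file helper and its signature are illustrative, not part of the crate:

use elevenlabs_stt::{ElevenLabsSTTClient, STTResponse};

// Hypothetical helper: read a local file and return its transcription,
// letting `?` forward both I/O and API errors to the caller.
async fn transcribe_file(
    client: &ElevenLabsSTTClient,
    path: &str,
) -> Result<STTResponse, Box<dyn std::error::Error>> {
    let file_content = std::fs::read(path)?;
    let response = client.speech_to_text(file_content).execute().await?;
    Ok(response)
}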

Requirements

  • Rust 1.70+ (for async/await support)
  • Tokio runtime (see the Cargo.toml sketch below)
  • Valid ElevenLabs API key
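
Putting these together, a Cargo.toml for a small binary might look like the following; the exact tokio feature flags are an assumption, and any feature set that provides #[tokio::main] and an async runtime will work:

[dependencies]
elevenlabs_stt = "0.0.5"
tokio = { version = "1", features = ["macros", "rt-multi-thread"] }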

License

Licensed under either of:

at your option.

Contributing

Contributions are welcome! Please feel free to:

  • Open issues for bugs or feature requests
  • Submit pull requests with improvements
  • Improve documentation or examples
  • Add tests or benchmarks

Before contributing, please ensure your code follows Rust conventions and includes appropriate tests.

Support

If you like this project, consider supporting me on Patreon 💖

Changelog

See CHANGELOG.md for a detailed history of changes.


Note: This crate is not officially affiliated with ElevenLabs. Please refer to the ElevenLabs API documentation for the most up-to-date API information.
