| Crates.io | livespeech-sdk |
| lib.rs | livespeech-sdk |
| version | 0.1.13 |
| created_at | 2025-12-19 01:39:49.105191+00 |
| updated_at | 2026-01-20 13:03:40.56336+00 |
| description | Real-time speech-to-speech AI conversation SDK |
| homepage | |
| repository | https://github.com/DrawDream-incorporated/LiveSpeechSDK |
| max_upload_size | |
| id | 1993985 |
| size | 129,525 |
A Rust SDK for real-time speech-to-speech AI conversations.
Add to your `Cargo.toml`:

```toml
[dependencies]
livespeech-sdk = "0.1"
tokio = { version = "1.35", features = ["full"] }
```
```rust
use livespeech_sdk::{Config, LiveSpeechClient, LiveSpeechEvent, Region, SessionConfig};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // 1. Create client
    let config = Config::builder()
        .region(Region::ApNortheast2)
        .api_key("your-api-key")
        .build()?;
    let client = LiveSpeechClient::new(config);

    // 2. Handle events (only 4 essential events!)
    // `audio_player` stands in for your own playback layer.
    let mut events = client.subscribe();
    tokio::spawn(async move {
        while let Ok(event) = events.recv().await {
            match event {
                // Play AI audio
                LiveSpeechEvent::Audio(e) => {
                    audio_player.queue(&e.data); // PCM16 @ 24kHz
                }
                // User interrupted - CLEAR BUFFER!
                LiveSpeechEvent::Interrupted(_) => {
                    audio_player.clear();
                }
                // AI finished speaking
                LiveSpeechEvent::TurnComplete(_) => {
                    println!("AI finished");
                }
                // Handle errors
                LiveSpeechEvent::Error(e) => {
                    eprintln!("Error: {}", e.message);
                }
                _ => {}
            }
        }
    });

    // 3. Connect and start
    client.connect().await?;
    client.start_session(Some(SessionConfig::new("You are a helpful assistant."))).await?;

    // 4. Send audio
    client.audio_start().await?;
    for chunk in audio_chunks {
        client.send_audio_chunk(&chunk).await?; // PCM16 @ 16kHz
    }
    client.audio_end().await?;

    // 5. Cleanup
    client.end_session().await?;
    client.disconnect().await;
    Ok(())
}
```
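The send loop above iterates over `audio_chunks` without saying how the chunks were produced; the SDK docs here don't pin down a required chunk size, so the 20 ms frame below is an assumption. A minimal standalone sketch of splitting a captured PCM16 byte buffer into fixed-duration frames for streaming:

```rust
/// Split a mono PCM16 byte buffer into fixed-duration frames for streaming.
/// 16 kHz mono PCM16 is 32,000 bytes/second, so a 20 ms frame is 640 bytes.
fn frame_pcm16(pcm: &[u8], sample_rate: u32, frame_ms: u32) -> Vec<Vec<u8>> {
    let bytes_per_frame = (sample_rate * 2 * frame_ms / 1000) as usize;
    pcm.chunks(bytes_per_frame).map(|c| c.to_vec()).collect()
}

fn main() {
    // One second of silence at 16 kHz mono PCM16.
    let pcm = vec![0u8; 32_000];
    let frames = frame_pcm16(&pcm, 16_000, 20);
    println!("{} frames of {} bytes", frames.len(), frames[0].len());
}
```

Each frame can then be passed to `send_audio_chunk`. The last frame may be shorter than the rest, which is usually fine for a final flush before `audio_end()`.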
Everything you need for basic voice conversations.
| Method | Description |
|---|---|
| `connect()` | Establish connection |
| `disconnect()` | Close connection |
| `start_session(config)` | Start conversation with system prompt |
| `end_session()` | End conversation |
| `send_audio_chunk(data)` | Send PCM16 audio (16 kHz) |
| Event | Description | Action Required |
|---|---|---|
| `Audio` | AI's audio output | Play audio (PCM16 @ 24 kHz) |
| `TurnComplete` | AI finished speaking | Ready for next input |
| `Interrupted` | User barged in | Clear audio buffer! |
| `Error` | Error occurred | Handle/log error |
`Interrupted`: when the user speaks while the AI is responding, you must clear your audio buffer:
```rust
LiveSpeechEvent::Interrupted(_) => {
    audio_player.clear(); // Stop buffered audio immediately
    audio_player.stop();
}
```
Without this, 2-3 seconds of buffered audio continues playing after the user interrupts.
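The `audio_player` in these snippets is your own playback layer, not part of the SDK. A minimal sketch, assuming a simple FIFO of PCM chunks, of the `queue`/`clear` behavior the `Interrupted` handler relies on:

```rust
use std::collections::VecDeque;

/// Minimal sketch of a clearable playback buffer. The `audio_player` in the
/// examples above is your own code; this shows one way to structure it.
struct AudioPlayer {
    buffer: VecDeque<Vec<u8>>, // queued PCM16 chunks awaiting playback
}

impl AudioPlayer {
    fn new() -> Self {
        Self { buffer: VecDeque::new() }
    }

    /// Queue a chunk of AI audio (PCM16 @ 24 kHz) for playback.
    fn queue(&mut self, data: &[u8]) {
        self.buffer.push_back(data.to_vec());
    }

    /// Drop everything not yet played. Called on `Interrupted`; otherwise
    /// buffered audio keeps playing for seconds after the user barges in.
    fn clear(&mut self) {
        self.buffer.clear();
    }

    fn queued_bytes(&self) -> usize {
        self.buffer.iter().map(|c| c.len()).sum()
    }
}

fn main() {
    let mut player = AudioPlayer::new();
    player.queue(&[0u8; 4800]);
    player.queue(&[0u8; 4800]);
    player.clear(); // user interrupted: nothing left to play
    println!("{} bytes queued", player.queued_bytes());
}
```

A real implementation would also feed the front of the queue to an audio device (e.g. via `cpal`), but the interruption contract is the same: `clear()` must empty everything not yet played.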
| Direction | Format | Sample Rate |
|---|---|---|
| Input (mic) | PCM16 | 16,000 Hz |
| Output (AI) | PCM16 | 24,000 Hz |
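Because input and output rates differ, the same byte count represents different durations on each side. A small helper (illustrative, not part of the SDK) for converting mono PCM16 byte counts to playback time:

```rust
/// Duration in milliseconds of a mono PCM16 buffer at the given sample rate.
fn pcm16_duration_ms(bytes: usize, sample_rate: u32) -> u64 {
    // 2 bytes per sample, mono.
    (bytes as u64 * 1000) / (2 * sample_rate as u64)
}

fn main() {
    // 32,000 bytes of mic input (16 kHz) is one full second...
    println!("input:  {} ms", pcm16_duration_ms(32_000, 16_000));
    // ...but the same byte count of AI output (24 kHz) plays back shorter.
    println!("output: {} ms", pcm16_duration_ms(32_000, 24_000));
}
```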
```rust
let config = Config::builder()
    .region(Region::ApNortheast2) // Required
    .api_key("your-api-key")      // Required
    .build()?;

let session = SessionConfig::new("You are a helpful assistant.")
    .with_language("ko-KR"); // Optional: ko-KR, en-US, ja-JP, etc.
```
Optional features for power users.
| Method | Description |
|---|---|
| `audio_start()` / `audio_end()` | Manual audio stream control |
| `interrupt()` | Explicitly stop AI response (for a Stop button) |
| `send_system_message(text)` | Inject context during conversation |
| `send_tool_response(id, result)` | Reply to function calls |
| `update_user_id(user_id)` | Migrate guest to authenticated user |
| Event | Description |
|---|---|
| `Connected` / `Disconnected` | Connection lifecycle |
| `SessionStarted` / `SessionEnded` | Session lifecycle |
| `Ready` | Session ready for audio |
| `UserTranscript` | User's speech transcribed |
| `Response` | AI's response text |
| `ToolCall` | AI wants to call a function |
| `UserIdUpdated` | Guest-to-user migration complete |
For UI "Stop" buttons or programmatic control:
```rust
// User clicks Stop button
client.interrupt().await?;
```
Note: Voice barge-in works automatically via Gemini's VAD. This method is for explicit control.
Inject text context during live sessions (game events, app state, etc.):
```rust
// AI responds immediately
client.send_system_message("User completed level 5. Congratulate them!").await?;

// Context only, no response
client.send_system_message_with_options("User is browsing", false).await?;
```
Requires an active live session (`audio_start()` called). Max 500 characters.
Let AI call functions in your app:
```rust
use livespeech_sdk::{FunctionParameters, SessionConfig, Tool};

let tools = vec![Tool {
    name: "get_price".to_string(),
    description: "Gets product price by ID".to_string(),
    parameters: Some(FunctionParameters {
        r#type: "OBJECT".to_string(),
        properties: serde_json::json!({
            "productId": { "type": "string" }
        }),
        required: vec!["productId".to_string()],
    }),
}];

let session = SessionConfig::new("You are helpful.")
    .with_tools(tools);
```
```rust
// In your event loop:
LiveSpeechEvent::ToolCall(e) => {
    let result = match e.name.as_str() {
        "get_price" => {
            let price = lookup_price(&e.args["productId"]);
            serde_json::json!({ "price": price })
        }
        _ => serde_json::json!({ "error": "Unknown" })
    };
    client.send_tool_response(&e.id, result).await.ok();
}
```
Enable persistent memory across sessions:
```rust
let config = Config::builder()
    .region(Region::ApNortheast2)
    .api_key("your-api-key")
    .user_id("user-123") // Enables memory
    .build()?;
```
| Mode | Memory |
|---|---|
| With `user_id` | Permanent (entities, summaries) |
| Without `user_id` | Session only (guest) |
```rust
// User logs in during session
client.update_user_id("authenticated-user-123").await?;
```

Listen for the confirmation event:

```rust
LiveSpeechEvent::UserIdUpdated(e) => {
    println!("Migrated {} messages", e.migrated_messages);
}
```
AI initiates the conversation:
```rust
let session = SessionConfig::new("Greet the customer warmly.")
    .with_ai_speaks_first(true);

client.start_session(Some(session)).await?;
client.audio_start().await?; // AI speaks immediately
```
| Option | Default | Description |
|---|---|---|
| `prePrompt` | - | System prompt |
| `language` | `"en-US"` | Language code |
| `pipeline_mode` | `Live` | `Live` (~300 ms) or `Composed` (~1-2 s) |
| `ai_speaks_first` | `false` | AI initiates (Live mode only) |
| `allow_harm_category` | `false` | Disable safety filters |
| `tools` | `[]` | Function definitions |
```rust
use livespeech_sdk::{float32_to_int16, int16_to_bytes, wrap_pcm_in_wav};

let pcm = float32_to_int16(&float_samples);
let bytes = int16_to_bytes(&pcm);
let wav = wrap_pcm_in_wav(&bytes, 16000, 1, 16);
```
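For intuition about what these helpers do, here is a rough standalone sketch of the conventional conversions; the function names `f32_to_i16` and `i16_to_le_bytes` are hypothetical stand-ins, and the crate's exact scaling and clamping may differ:

```rust
/// Sketch of a float -> PCM16 conversion: clamp to [-1.0, 1.0] and scale
/// to the i16 range. (Hypothetical stand-in for `float32_to_int16`.)
fn f32_to_i16(samples: &[f32]) -> Vec<i16> {
    samples
        .iter()
        .map(|s| (s.clamp(-1.0, 1.0) * 32767.0) as i16)
        .collect()
}

/// Serialize samples as little-endian bytes, WAV's native byte order.
/// (Hypothetical stand-in for `int16_to_bytes`.)
fn i16_to_le_bytes(samples: &[i16]) -> Vec<u8> {
    samples.iter().flat_map(|s| s.to_le_bytes()).collect()
}

fn main() {
    let pcm = f32_to_i16(&[1.0, -1.0, 0.0]);
    let bytes = i16_to_le_bytes(&pcm);
    println!("{:?} -> {} bytes", pcm, bytes.len());
}
```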
```rust
match client.connect().await {
    Ok(()) => println!("Connected"),
    Err(LiveSpeechError::ConnectionTimeout) => eprintln!("Timed out"),
    Err(LiveSpeechError::NotConnected) => eprintln!("Not connected"),
    Err(e) => eprintln!("Error: {}", e),
}
```
| Region | Code |
|---|---|
| Seoul (Korea) | `Region::ApNortheast2` |
MIT