| Crates.io | silero-vad-rs |
| lib.rs | silero-vad-rs |
| version | 0.1.2 |
| created_at | 2025-04-03 23:58:24.631305+00 |
| updated_at | 2025-04-04 01:06:06.810362+00 |
| description | Rust implementation of Silero Voice Activity Detection |
| homepage | |
| repository | https://github.com/binarycrayon/silero-vad-rs |
| max_upload_size | |
| id | 1619188 |
| size | 286,672 |
This is a Rust implementation of the Silero Voice Activity Detection (VAD) model. The original model is written in Python and uses PyTorch, while this implementation uses Rust with the ort crate for efficient ONNX model inference.
Add this to your Cargo.toml:
[dependencies]
silero-vad-rs = "0.1.0"
use silero_vad::{SileroVAD, VADIterator};
use silero_vad::utils::{read_audio, save_audio};
fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Load the model
    let model = SileroVAD::new("path/to/silero_vad.onnx")?;

    // Create a VAD iterator
    let mut vad = VADIterator::new(
        model,
        0.5,   // threshold
        16000, // sampling rate
        100,   // min silence duration (ms)
        30,    // speech pad (ms)
    );

    // Read audio file
    let audio = read_audio("input.wav", 16000)?;

    // Get speech timestamps
    let timestamps = vad.get_speech_timestamps(
        &audio.view(),
        250,           // min speech duration (ms)
        f32::INFINITY, // max speech duration (s)
        100,           // min silence duration (ms)
        30,            // speech pad (ms)
    )?;

    // Process timestamps
    for ts in timestamps {
        println!("Speech detected from {:.2}s to {:.2}s", ts.start, ts.end);
    }

    Ok(())
}
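The duration arguments above are given in milliseconds, while the audio itself is a buffer of samples. As a rough sketch of the arithmetic that relates the two, the helpers below convert between durations and sample counts. The names `ms_to_samples` and `samples_to_seconds` are hypothetical and are not part of silero-vad-rs:

```rust
/// Hypothetical helper (not part of silero-vad-rs): convert a duration in
/// milliseconds to a number of samples at the given sampling rate.
fn ms_to_samples(duration_ms: u32, sample_rate: u32) -> usize {
    // Widen to u64 so the multiplication cannot overflow for long durations.
    (duration_ms as u64 * sample_rate as u64 / 1000) as usize
}

/// Hypothetical helper: convert a sample index back to seconds.
fn samples_to_seconds(sample_index: usize, sample_rate: u32) -> f64 {
    sample_index as f64 / sample_rate as f64
}
```

With the values used above, a 250 ms minimum speech duration at 16000 Hz corresponds to 4000 samples, and a 30 ms pad corresponds to 480 samples.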
use silero_vad::{SileroVAD, VADIterator};
use ndarray::Array1;
fn process_stream() -> Result<(), Box<dyn std::error::Error>> {
    let model = SileroVAD::new("path/to/silero_vad.onnx")?;
    let mut vad = VADIterator::new(model, 0.5, 16000, 100, 30);

    // Process audio chunks
    let chunk_size = 512; // for 16kHz
    let audio_chunk = Array1::zeros(chunk_size);

    if let Some(ts) = vad.process_chunk(&audio_chunk.view())? {
        println!("Speech detected from {:.2}s to {:.2}s", ts.start, ts.end);
    }

    Ok(())
}
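The streaming example above feeds a single 512-sample chunk. For a longer buffer, one way to produce fixed-size chunks is sketched below in plain Rust. The function name `split_into_chunks` and the choice to zero-pad the final chunk are assumptions made for illustration, not behavior of silero-vad-rs:

```rust
/// Hypothetical helper (not part of silero-vad-rs): split `samples` into
/// fixed-size chunks, zero-padding the last chunk if it is short.
fn split_into_chunks(samples: &[f32], chunk_size: usize) -> Vec<Vec<f32>> {
    assert!(chunk_size > 0, "chunk_size must be non-zero");
    samples
        .chunks(chunk_size)
        .map(|chunk| {
            let mut padded = chunk.to_vec();
            // Pad a short final chunk with silence so every chunk has equal length.
            padded.resize(chunk_size, 0.0);
            padded
        })
        .collect()
}
```

Each chunk can then be viewed as an `ArrayView1<f32>` with `ndarray::aview1(&chunk)` and passed to `process_chunk` as in the example above.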
use silero_vad::SileroVAD;
use ndarray::{Array1, ArrayView1};
fn process_batch() -> Result<(), Box<dyn std::error::Error>> {
    let model = SileroVAD::new("path/to/silero_vad.onnx")?;

    // Create a batch of audio chunks
    let chunk_size = 512;
    let batch_size = 10;
    let mut chunks = Vec::with_capacity(batch_size);
    for _ in 0..batch_size {
        chunks.push(Array1::zeros(chunk_size));
    }

    // Process the batch
    let results = model.process_batch(
        &chunks.iter().map(|c| c.view()).collect::<Vec<_>>(),
        16000
    )?;

    // Process results
    for (i, prob) in results.iter().enumerate() {
        println!("Chunk {}: speech probability = {:.2}", i, prob[0]);
    }

    Ok(())
}
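The batch call yields one speech probability per chunk. A minimal sketch of turning such probabilities into speech/non-speech flags, using the same 0.5 threshold that appears elsewhere in this README, might look like the following. `flag_speech` and `speech_ratio` are hypothetical helpers operating on plain `f32` values, and the inclusive comparison is an assumption rather than the crate's documented rule:

```rust
/// Hypothetical helper (not part of silero-vad-rs): mark each probability
/// as speech when it is at or above `threshold`.
fn flag_speech(probabilities: &[f32], threshold: f32) -> Vec<bool> {
    probabilities.iter().map(|&p| p >= threshold).collect()
}

/// Hypothetical helper: fraction of chunks flagged as speech.
/// Returns 0.0 for an empty input.
fn speech_ratio(flags: &[bool]) -> f32 {
    if flags.is_empty() {
        return 0.0;
    }
    let speech = flags.iter().filter(|&&is_speech| is_speech).count();
    speech as f32 / flags.len() as f32
}
```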
use silero_vad::utils::{read_audio, save_audio, collect_chunks, drop_chunks};
use silero_vad::{SileroVAD, VADIterator};
fn process_audio() -> Result<(), Box<dyn std::error::Error>> {
    // Read audio file
    let audio = read_audio("input.wav", 16000)?;

    // Detect speech segments
    let model = SileroVAD::new("path/to/silero_vad.onnx")?;
    let mut vad = VADIterator::new(model, 0.5, 16000, 100, 30);
    let timestamps = vad.get_speech_timestamps(
        &audio.view(),
        250,
        f32::INFINITY,
        100,
        30,
    )?;

    // Extract speech segments
    let speech_only = collect_chunks(&timestamps, &audio, 16000)?;
    save_audio("speech_only.wav", &speech_only, 16000)?;

    // Remove speech segments
    let non_speech = drop_chunks(&timestamps, &audio, 16000)?;
    save_audio("non_speech.wav", &non_speech, 16000)?;

    Ok(())
}
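Conceptually, collecting keeps the samples inside the detected segments, and dropping keeps everything else. The sketch below shows that idea on plain slices, assuming segments are given as sorted, half-open sample ranges `(start, end)`. The function names are hypothetical, and the crate's own `collect_chunks` and `drop_chunks` take its timestamp type and may differ in units and edge-case handling:

```rust
/// Hypothetical sketch (not the crate's implementation): concatenate the
/// samples that fall inside the given half-open `(start, end)` ranges.
fn collect_segments(audio: &[f32], segments: &[(usize, usize)]) -> Vec<f32> {
    let mut out = Vec::new();
    for &(start, end) in segments {
        // Clamp the range to the buffer so out-of-bounds segments cannot panic.
        let end = end.min(audio.len());
        let start = start.min(end);
        out.extend_from_slice(&audio[start..end]);
    }
    out
}

/// Hypothetical sketch: concatenate the samples that fall outside the
/// given ranges. Assumes `segments` is sorted by start position.
fn drop_segments(audio: &[f32], segments: &[(usize, usize)]) -> Vec<f32> {
    let mut out = Vec::new();
    let mut cursor = 0;
    for &(start, end) in segments {
        let end = end.min(audio.len());
        let start = start.min(end);
        if start > cursor {
            // Keep the gap between the previous segment and this one.
            out.extend_from_slice(&audio[cursor..start]);
        }
        cursor = cursor.max(end);
    }
    // Keep whatever follows the last segment.
    out.extend_from_slice(&audio[cursor..]);
    out
}
```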
You need to download the ONNX model file from the original repository. The model supports both 8kHz and 16kHz audio sampling rates.
- en_v6_xlarge.onnx - English model (recommended)
- ru_v6_xlarge.onnx - Russian model
- de_v6_xlarge.onnx - German model
- es_v6_xlarge.onnx - Spanish model

The Rust implementation is designed to be efficient and thread-safe. It uses:

- ort for optimized ONNX model inference with GPU support
- ndarray for efficient array operations

The library uses ONNX Runtime for GPU acceleration:

- cuda feature of the ort crate

The library provides comprehensive error handling through the Error enum:

- ModelLoad - Errors during model loading
- InvalidInput - Invalid input parameters or data
- AudioProcessing - Errors during audio processing
- Io - File I/O errors
- Ort - ONNX Runtime errors

This project is licensed under the MIT License - see the LICENSE file for details.
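As an illustration of how the error categories listed earlier could be matched by calling code, here is a stand-alone sketch. The variant names come from this README, but the type name `VadError`, the `String` payloads, and the helper `is_input_problem` are assumptions; the crate's real `Error` enum may carry different data:

```rust
use std::fmt;

/// Hypothetical mirror of the README's error categories. The payload
/// types are assumed for illustration and are not the crate's definition.
#[derive(Debug)]
enum VadError {
    ModelLoad(String),
    InvalidInput(String),
    AudioProcessing(String),
    Io(std::io::Error),
    Ort(String),
}

impl fmt::Display for VadError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            VadError::ModelLoad(msg) => write!(f, "model load failed: {}", msg),
            VadError::InvalidInput(msg) => write!(f, "invalid input: {}", msg),
            VadError::AudioProcessing(msg) => write!(f, "audio processing failed: {}", msg),
            VadError::Io(err) => write!(f, "I/O error: {}", err),
            VadError::Ort(msg) => write!(f, "ONNX Runtime error: {}", msg),
        }
    }
}

impl std::error::Error for VadError {}

/// Lets `?` convert file I/O failures into the sketch's error type.
impl From<std::io::Error> for VadError {
    fn from(err: std::io::Error) -> Self {
        VadError::Io(err)
    }
}

/// Hypothetical helper: true when the error points at the caller's data
/// rather than at the model, the filesystem, or the runtime.
fn is_input_problem(err: &VadError) -> bool {
    matches!(err, VadError::InvalidInput(_) | VadError::AudioProcessing(_))
}
```

Grouping variants this way lets a caller decide, for example, to skip a bad audio file but abort when the model itself fails to load.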