kitsune-stt

Crates.io	kitsune-stt
lib.rs	kitsune-stt
version	0.1.0
created_at	2025-10-31 20:35:35.408346+00
updated_at	2025-10-31 20:35:35.408346+00
description	Speech-to-Text tool using Candle and Voxtral
homepage	https://github.com/paazmaya/kitsune-stt
id	1910729
size	313,271

Juga Paazmaya (paazmaya)


kitsune-stt

A Rust speech-to-text implementation of Voxtral speech recognition, built on Candle.

The model used in the conversion is https://huggingface.co/mistralai/Voxtral-Mini-3B-2507

Features

🎤 Speech-to-Text: Convert audio to text using Voxtral-Mini-3B model
🚀 GPU Acceleration: CUDA and CUDNN support for faster inference
📦 Audio Format and Codec Support: WAV, MP3, FLAC, OGG, M4A, and more; see https://docs.rs/symphonia/latest/symphonia/index.html for the full list
⚡ Performance: half-precision (F16) to reduce memory use, chunked audio processing
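
As a rough illustration of the chunked processing mentioned in the feature list, the sketch below splits decoded audio samples into fixed-length windows. It is not taken from the kitsune-stt source: the 16 kHz sample rate, the 30-second window, and the name chunk_samples are assumptions chosen for the example.

```rust
/// Assumed values for illustration only; not read from kitsune-stt.
const SAMPLE_RATE: usize = 16_000; // samples per second, mono
const CHUNK_SECONDS: usize = 30; // window length in seconds

/// Split PCM samples into consecutive windows of `chunk_len` samples.
/// The final window is shorter when the input length is not a multiple
/// of `chunk_len`.
fn chunk_samples(samples: &[f32], chunk_len: usize) -> Vec<&[f32]> {
    assert!(chunk_len > 0, "chunk_len must be non-zero");
    samples.chunks(chunk_len).collect()
}
```

Under these assumptions, calling chunk_samples(&samples, SAMPLE_RATE * CHUNK_SECONDS) on 70 seconds of audio gives two full windows and one shorter tail, so each window can be passed to the model separately instead of holding the whole recording's features in memory at once.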

Quick Start

Installation

# Install Rust (if not already installed)
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

# Clone the repository
git clone https://github.com/paazmaya/kitsune-stt.git
cd kitsune-stt

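There are two optional Cargo features, cuda and cudnn. Each requires additional libraries: cuda needs the CUDA toolkit (see Requirements), and cudnn additionally needs NVIDIA's cuDNN library.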

Running

GPU (Recommended):

cargo run --all-features --release -- audio.wav

CPU Only:

cargo run --release -- --cpu audio.wav
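
One way to picture how the --cpu flag could interact with a CUDA-enabled build is sketched below. This is a hypothetical decision function, not the project's actual code: the Backend enum, select_backend, and its three parameters are invented for illustration.

```rust
/// Hypothetical backend choice; the real tool may decide differently.
#[derive(Debug, PartialEq)]
enum Backend {
    Cpu,
    Cuda,
}

/// Pick a backend from three inputs:
/// - `force_cpu`: the user passed `--cpu`
/// - `cuda_compiled`: the binary was built with the `cuda` feature
/// - `cuda_device_found`: a usable GPU was detected at runtime
fn select_backend(force_cpu: bool, cuda_compiled: bool, cuda_device_found: bool) -> Backend {
    if !force_cpu && cuda_compiled && cuda_device_found {
        Backend::Cuda
    } else {
        // Fall back to the CPU whenever the GPU path is unavailable or disabled.
        Backend::Cpu
    }
}
```

In this sketch the --cpu flag always wins, which would match the "Force CPU mode" usage shown in this README for a build that has the cuda feature enabled.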

Transcribe Audio

The following example commands create an audio.txt file in the same folder as the source audio file.

# Transcribe an audio file
cargo run --release -- --input audio.wav

# Force CPU mode, even in a build with the cuda feature enabled
cargo run --release --features cuda -- --cpu --input audio.wav
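
The output naming described above (audio.wav producing audio.txt next to it) can be sketched with the standard library alone. The function name transcript_path is hypothetical, and the tool's own implementation may differ.

```rust
use std::path::{Path, PathBuf};

/// Derive the transcript path by swapping the input file's extension
/// for `txt`, keeping the same directory and file stem.
fn transcript_path(input: &Path) -> PathBuf {
    input.with_extension("txt")
}
```

For example, transcript_path(Path::new("recordings/audio.wav")) yields recordings/audio.txt. Note that only the final extension is replaced, so a name like talk.part1.mp3 would map to talk.part1.txt.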

Testing

Run the complete test suite:

# All tests
cargo test

# With all features
cargo test --all-features

Code Quality

# Format code
cargo fmt --all

# Run clippy lints
cargo clippy --all-targets --all-features -- -D warnings

# Security audit
cargo install cargo-audit
cargo audit

Development

See CONTRIBUTING.md for detailed development guidelines.

Requirements

Rust 1.70+
pkg-config (Linux)
CUDA toolkit (optional, for GPU)
~7GB disk space (for model files)

License

MIT License - see LICENSE file for details.
