kitsune-stt

version: 0.1.0
created_at: 2025-10-31 20:35:35.408346+00
updated_at: 2025-10-31 20:35:35.408346+00
description: Speech-to-Text tool using Candle and Voxtral
homepage: https://github.com/paazmaya/kitsune-stt
id: 1910729
size: 313,271
Juga Paazmaya (paazmaya)

kitsune-stt

A Speech-to-Text implementation of Voxtral speech recognition in Rust, using candle.

[Image: Fox speaking to microphone and writing papers]

The model used in the conversion is https://huggingface.co/mistralai/Voxtral-Mini-3B-2507

Features

  • 🎤 Speech-to-Text: Convert audio to text using the Voxtral-Mini-3B model
  • 🚀 GPU Acceleration: CUDA and cuDNN support for faster inference
  • 📦 Audio Format and Codec Support: WAV, MP3, FLAC, OGG, M4A, and more; see https://docs.rs/symphonia/latest/symphonia/index.html
  • ⚡ Performance: F16 memory optimization, chunked processing
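
The "chunked processing" item above can be pictured with a small sketch. This is not the crate's actual code: the function name `chunk_samples`, the 16 kHz sample rate, and the 30-second window are assumptions chosen for illustration. It only shows the general idea of splitting decoded mono samples into fixed-length windows so that each window can be processed separately.

```rust
/// Split mono PCM samples into fixed-length chunks.
/// The final chunk may be shorter than the others.
/// `chunk_samples` is a hypothetical helper, not part of kitsune-stt.
fn chunk_samples(samples: &[f32], sample_rate: usize, chunk_secs: usize) -> Vec<&[f32]> {
    let chunk_len = sample_rate * chunk_secs;
    // `slice::chunks` panics on a zero chunk size, so guard against it.
    assert!(chunk_len > 0, "chunk length must be non-zero");
    samples.chunks(chunk_len).collect()
}

fn main() {
    let sample_rate = 16_000; // assumed sample rate in Hz
    let chunk_secs = 30; // assumed window length in seconds

    // 70 seconds of silence stands in for decoded audio.
    let samples = vec![0.0f32; sample_rate * 70];

    let chunks = chunk_samples(&samples, sample_rate, chunk_secs);
    for (i, chunk) in chunks.iter().enumerate() {
        println!("chunk {i}: {} samples", chunk.len());
    }
}
```

With these assumed values, 70 seconds of audio yields two full 30-second chunks and one 10-second remainder. Borrowing slices instead of copying keeps the memory overhead of chunking itself negligible.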

Quick Start

Installation

# Install Rust (if not already installed)
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

# Clone the repository
git clone https://github.com/paazmaya/kitsune-stt.git
cd kitsune-stt

There are two optional Cargo features, cuda and cudnn, and each requires additional libraries.

Running

GPU (Recommended):

cargo run --all-features --release -- audio.wav

CPU Only:

cargo run --release -- --cpu audio.wav
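
The two commands above suggest that the backend depends on both a compile-time feature and a runtime flag. The following is a minimal sketch of that decision, assuming the GPU is used only when the `cuda` feature is compiled in and `--cpu` is not passed. The `Backend` enum and `pick_backend` function are hypothetical names, and the real crate may behave differently (for example, by falling back to the CPU when no CUDA device is present).

```rust
/// Hypothetical backend choice; not the crate's actual type.
#[derive(Debug, PartialEq)]
enum Backend {
    Cpu,
    Cuda,
}

/// Use CUDA only when it was compiled in and the user did not force CPU mode.
fn pick_backend(cuda_compiled: bool, force_cpu: bool) -> Backend {
    if cuda_compiled && !force_cpu {
        Backend::Cuda
    } else {
        Backend::Cpu
    }
}

fn main() {
    // True only when the binary is built with `--features cuda`.
    let cuda_compiled = cfg!(feature = "cuda");
    // A bare scan of the arguments; a real CLI would use an argument parser.
    let force_cpu = std::env::args().any(|arg| arg == "--cpu");

    println!("selected backend: {:?}", pick_backend(cuda_compiled, force_cpu));
}
```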

Transcribe Audio

The following example commands create an audio.txt file in the same folder as the source audio file.

# Transcribe an audio file
cargo run --release -- --input audio.wav

# Force CPU mode
cargo run --release --features cuda -- --cpu --input audio.wav
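
As an illustration of how the transcript file name relates to the input file, here is a minimal sketch using only the standard library. The helper name `transcript_path` is hypothetical, and the crate may derive the output path in another way; the sketch only shows that swapping the extension keeps the file in the same folder.

```rust
use std::path::{Path, PathBuf};

/// Derive the transcript path by replacing the input's extension with `txt`.
/// `transcript_path` is a hypothetical helper, not part of kitsune-stt.
fn transcript_path(input: &Path) -> PathBuf {
    input.with_extension("txt")
}

fn main() {
    let input = Path::new("recordings/audio.wav");
    let output = transcript_path(input);
    println!("{} -> {}", input.display(), output.display());
}
```

Because `Path::with_extension` only changes the part after the final dot, the parent directory is left untouched, which matches the "same folder" behavior described above.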

Testing

Run the complete test suite:

# All tests
cargo test

# With all features
cargo test --all-features

Code Quality

# Format code
cargo fmt --all

# Run clippy lints
cargo clippy --all-targets --all-features -- -D warnings

# Security audit
cargo install cargo-audit
cargo audit

Development

See CONTRIBUTING.md for detailed development guidelines.

Requirements

  • Rust 1.70+
  • pkg-config (Linux)
  • CUDA toolkit (optional, for GPU)
  • ~7GB disk space (for model files)

License

MIT License - see LICENSE file for details.