| Crates.io | stft-rs |
| lib.rs | stft-rs |
| version | 0.5.1 |
| created_at | 2025-10-26 12:37:55.895346+00 |
| updated_at | 2025-12-21 12:36:12.912584+00 |
| description | Simple, streaming-friendly, no_std compliant STFT implementation with mel spectrogram support |
| homepage | |
| repository | https://github.com/wizenink/stft-rs |
| max_upload_size | |
| id | 1901358 |
| size | 362,334 |
High-quality, streaming-friendly STFT/iSTFT implementation in Rust working with raw slices (&[f32]).
[!CAUTION] This crate is a work in progress. Expect API changes and breakage until the first stable version.
- rustfft (default): Full-featured for std environments, supports f32/f64 and any FFT size
- microfft: Lightweight for no_std/embedded, f32 only, power-of-2 sizes up to 4096
- Type aliases such as StftConfigF32 and BatchStftF32 for cleaner code
use stft_rs::prelude::*;
// Quick start with defaults
let config = StftConfig::<f32>::default_4096();
// Or use the builder for custom configuration
let config = StftConfig::<f32>::builder()
.fft_size(4096)
.hop_size(1024)
.build()
.expect("Valid config");
let stft = BatchStft::new(config.clone());
let istft = BatchIstft::new(config);
let signal: Vec<f32> = vec![0.0; 44100];
let spectrum = stft.process(&signal);
// Manipulate spectrum here...
let reconstructed = istft.process(&spectrum);
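For orientation, the sizes involved in the quick start can be worked out by hand. The sketch below is not part of the stft-rs API: `freq_bins` and `frame_count` are hypothetical helpers, and the frame count assumes no padding at the signal edges, so the crate's own count may differ if it pads.

```rust
/// Number of non-redundant frequency bins for a real-input FFT of
/// `fft_size` points (DC up to and including Nyquist).
/// Hypothetical helper, not the crate's API.
fn freq_bins(fft_size: usize) -> usize {
    fft_size / 2 + 1
}

/// Number of complete frames that fit in `signal_len` samples when no
/// padding is applied. Hypothetical helper; a padded STFT yields more frames.
fn frame_count(signal_len: usize, fft_size: usize, hop_size: usize) -> usize {
    if signal_len < fft_size {
        0
    } else {
        1 + (signal_len - fft_size) / hop_size
    }
}
```

Under these assumptions, one second of 44.1 kHz audio with `fft_size = 4096` and `hop_size = 1024` gives `frame_count(44100, 4096, 1024)` frames of `freq_bins(4096)` bins each.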
use stft_rs::prelude::*;
// Create STFT and mel processors
let stft_config = StftConfigF32::default_4096();
let stft = BatchStftF32::new(stft_config);
let mel_config = MelConfigF32::default(); // 80 mel bins, Slaney scale
let mel_proc = BatchMelSpectrogramF32::new(44100.0, 4096, &mel_config);
// Process audio to mel spectrogram
let signal: Vec<f32> = vec![0.0; 44100];
let spectrum = stft.process(&signal);
let mel_spec = mel_proc.process(&spectrum);
// Convert to log-mel (dB scale) and add delta features
let log_mel = mel_spec.to_db(None, None);
let with_deltas = log_mel.with_deltas(Some(2)); // 240 features (80*3)
// See MEL.md for complete documentation
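The comment above mentions the Slaney scale. As a rough illustration of what that mapping looks like, here is a std-only sketch of the Slaney-style Hz-to-mel conversion as it is commonly implemented (linear below 1 kHz, logarithmic above). The function names are hypothetical and the constants are an assumption about the convention; the crate's own implementation is documented in MEL.md and may differ in detail.

```rust
// Assumed Slaney-style constants: 200/3 Hz per mel below 1 kHz,
// logarithmic spacing above.
const F_SP: f64 = 200.0 / 3.0;
const MIN_LOG_HZ: f64 = 1000.0;
const MIN_LOG_MEL: f64 = MIN_LOG_HZ / F_SP; // 15 mel at 1 kHz

/// Step size of the logarithmic region (27 steps span a factor of 6.4).
fn log_step() -> f64 {
    6.4_f64.ln() / 27.0
}

/// Hz -> mel (hypothetical helper, not the crate's API).
fn hz_to_mel_slaney(hz: f64) -> f64 {
    if hz < MIN_LOG_HZ {
        hz / F_SP
    } else {
        MIN_LOG_MEL + (hz / MIN_LOG_HZ).ln() / log_step()
    }
}

/// mel -> Hz, the inverse of `hz_to_mel_slaney`.
fn mel_to_hz_slaney(mel: f64) -> f64 {
    if mel < MIN_LOG_MEL {
        mel * F_SP
    } else {
        MIN_LOG_HZ * ((mel - MIN_LOG_MEL) * log_step()).exp()
    }
}
```

A mel filterbank is typically built by spacing band edges evenly in mel and mapping them back to Hz with the inverse function.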
For cleaner code, use type aliases instead of specifying generic types:
use stft_rs::prelude::*;
// Instead of StftConfig::<f32>, use:
let config = StftConfigF32::default_4096();
let stft = BatchStftF32::new(config.clone());
let istft = BatchIstftF32::new(config);
// With builder:
let config = StftConfigBuilderF32::new()
.fft_size(4096)
.hop_size(1024)
.build()
.expect("Valid config");
// Available aliases:
// - StftConfigF32, StftConfigF64, StftConfigBuilderF32, StftConfigBuilderF64
// - BatchStftF32, BatchIstftF32, BatchStftF64, BatchIstftF64
// - StreamingStftF32, StreamingIstftF32, StreamingStftF64, StreamingIstftF64
// - SpectrumF32, SpectrumF64
// - SpectrumFrameF32, SpectrumFrameF64
For convenience, import commonly used types with:
use stft_rs::prelude::*;
This exports:
- BatchStft, BatchIstft, StreamingStft, StreamingIstft, StftConfig, StftConfigBuilder, Spectrum, SpectrumFrame, Complex
- StftConfigF32/F64, StftConfigBuilderF32/F64, BatchStftF32/F64, BatchIstftF32/F64, StreamingStftF32/F64, StreamingIstftF32/F64, SpectrumF32/F64, SpectrumFrameF32/F64
- MelConfig, MelSpectrum, BatchMelSpectrogram, StreamingMelSpectrogram, MelScale, MelNorm (+ F32/F64 aliases)
- ReconstructionMode, WindowType, PadMode
- apply_padding, interleave, deinterleave, interleave_into, deinterleave_into
Best for: Processing entire files, offline processing, ML training
use stft_rs::prelude::*;
let config = StftConfig::default_4096();
let stft = BatchStft::new(config.clone());
let istft = BatchIstft::new(config);
let spectrum = stft.process(&signal);
let reconstructed = istft.process(&spectrum);
Best for: Real-time audio, low-latency processing, incremental processing
use stft_rs::prelude::*;
let config = StftConfig::default_4096();
let mut stft = StreamingStft::new(config.clone());
let mut istft = StreamingIstft::new(config.clone());
let pad_amount = config.fft_size / 2;
let padded = apply_padding(&signal, pad_amount, PadMode::Reflect);
let mut output = Vec::new();
for chunk in padded.chunks(512) {
let frames = stft.push_samples(chunk);
for frame in frames {
let samples = istft.push_frame(&frame);
output.extend(samples);
}
}
for frame in stft.flush() {
output.extend(istft.push_frame(&frame));
}
output.extend(istft.flush());
// Remove padding: output[pad_amount..pad_amount + signal.len()]
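The padding and trimming steps in the loop above can be sketched without the crate. This is a minimal std-only illustration of reflect padding under hypothetical names (`reflect_pad`, `remove_padding`); it assumes the edge sample is not repeated, which is one common convention, and the crate's `apply_padding` may handle edges differently.

```rust
/// Pad `signal` with `pad` mirrored samples on each side, without
/// repeating the edge sample. Hypothetical helper, not the crate's API.
fn reflect_pad(signal: &[f32], pad: usize) -> Vec<f32> {
    let n = signal.len();
    assert!(pad < n, "reflection needs pad < signal length");
    let mut out = Vec::with_capacity(n + 2 * pad);
    // Left side: signal[pad], ..., signal[1]
    out.extend((1..=pad).rev().map(|i| signal[i]));
    out.extend_from_slice(signal);
    // Right side: signal[n - 2], ..., signal[n - 1 - pad]
    out.extend((1..=pad).map(|i| signal[n - 1 - i]));
    out
}

/// Slice the original region back out of a padded (or reconstructed) buffer.
fn remove_padding(padded: &[f32], pad: usize, original_len: usize) -> &[f32] {
    &padded[pad..pad + original_len]
}
```

Padding by `fft_size / 2` on each side centers the first and last frames on the signal edges, which is why the same amount is trimmed from the reconstructed output.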
Note on Padding in Streaming Mode:
- Use the apply_padding() helper function or implement custom padding
The library provides three allocation strategies for different performance requirements:
Best for: Prototyping, one-off processing, simplicity
// Each call allocates new Vec for frames/samples
let frames = stft.push_samples(chunk);
let samples = istft.push_frame(&frame);
(_into methods)
Best for: Repeated processing, reduced allocator pressure
// Reuse outer Vec, but still allocates frame data
let mut frames = Vec::new();
let mut output = Vec::new();
loop {
frames.clear(); // Keeps capacity
stft.push_samples_into(chunk, &mut frames);
for frame in &frames {
istft.push_frame_into(frame, &mut output);
}
}
Batch mode:
let mut spectrum = Spectrum::new(num_frames, freq_bins);
let mut output = Vec::new();
stft.process_into(&signal, &mut spectrum);
istft.process_into(&spectrum, &mut output);
(_write methods)
Best for: Real-time audio, hard real-time constraints, minimum latency variance
// Pre-allocate frame pool once
let max_frames = (chunk_size + config.hop_size - 1) / config.hop_size + 1;
let mut frame_pool = vec![SpectrumFrame::new(config.freq_bins()); max_frames];
let mut output = Vec::new();
loop {
let mut pool_idx = 0;
stft.push_samples_write(chunk, &mut frame_pool, &mut pool_idx);
for i in 0..pool_idx {
istft.push_frame_into(&frame_pool[i], &mut output);
}
}
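The pool size used above is a ceiling division plus one frame of headroom. The sketch below isolates that arithmetic in a hypothetical helper (`max_frames_per_chunk` is not part of the crate) so it can be checked for a given chunk and hop size before allocating.

```rust
/// Upper bound on the number of frames to reserve for one chunk:
/// ceil(chunk_size / hop_size), plus one frame of headroom.
/// Mirrors the expression used in the example above; hypothetical helper.
fn max_frames_per_chunk(chunk_size: usize, hop_size: usize) -> usize {
    assert!(hop_size > 0, "hop_size must be non-zero");
    (chunk_size + hop_size - 1) / hop_size + 1
}
```

For example, 512-sample chunks with a hop of 1024 need a pool of 2 frames, while 4096-sample chunks with the same hop need 5.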
Use the builder pattern for a flexible, ergonomic API:
use stft_rs::prelude::*;
// OLA mode with builder
let config = StftConfig::<f32>::builder()
.fft_size(4096)
.hop_size(1024)
.window(WindowType::Hann)
.reconstruction_mode(ReconstructionMode::Ola)
.build()
.expect("Valid configuration");
// WOLA mode, keeping the default Hann window (the default reconstruction mode is OLA)
let config = StftConfig::<f32>::builder()
.fft_size(2048)
.hop_size(512)
.reconstruction_mode(ReconstructionMode::Wola)
.build()
.expect("Valid configuration");
// Type aliases for cleaner code
let config = StftConfigBuilderF32::new()
.fft_size(4096)
.hop_size(1024)
.window(WindowType::Blackman)
.build()
.expect("Valid configuration");
Legacy API (deprecated):
// Old constructor still works but is deprecated
let config = StftConfig::new(
4096,
1024,
WindowType::Hann,
ReconstructionMode::Ola
).expect("Valid configuration");
- OLA: normalizes by sum(w)
- WOLA: normalizes by sum(w²)
The library provides helpers for frequency-domain manipulation:
let mut spectrum = stft.process(&signal);
// Get magnitude and phase
let mag = spectrum.magnitude(frame, bin);
let phase = spectrum.phase(frame, bin);
// Set from magnitude and phase
spectrum.set_magnitude_phase(frame, bin, new_mag, new_phase);
// Get all magnitudes/phases for a frame
let magnitudes = spectrum.frame_magnitudes(frame);
let phases = spectrum.frame_phases(frame);
// Apply gain to frequency range
spectrum.apply_gain(100..200, 0.5); // Attenuate bins 100..200 (upper bound exclusive)
// Zero out frequency range
spectrum.zero_bins(0..50); // Remove DC and low frequencies
// Custom processing with closure
spectrum.apply(|frame, bin, complex| {
// Return modified complex value
complex * gain_factor
});
let mut spectrum = stft.process(&signal);
// Zero out low frequencies (simple and clean!)
spectrum.zero_bins(0..100);
let filtered = istft.process(&spectrum);
let mut spectrum = stft.process(&signal);
// Apply gain in magnitude/phase domain
for frame in 0..spectrum.num_frames {
for bin in 0..spectrum.freq_bins {
let mag = spectrum.magnitude(frame, bin);
let phase = spectrum.phase(frame, bin);
spectrum.set_magnitude_phase(frame, bin, mag * 0.5, phase);
}
}
let quieter = istft.process(&spectrum);
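The magnitude/phase round trip used above is ordinary polar conversion of a complex bin. The following std-only sketch shows the arithmetic on a bare `(re, im)` pair; the function names are hypothetical and it says nothing about how the crate stores or converts its bins internally.

```rust
/// Split a complex bin into (magnitude, phase in radians).
fn to_polar(re: f32, im: f32) -> (f32, f32) {
    (re.hypot(im), im.atan2(re))
}

/// Rebuild (re, im) from magnitude and phase.
fn from_polar(mag: f32, phase: f32) -> (f32, f32) {
    (mag * phase.cos(), mag * phase.sin())
}

/// Scale the magnitude of a bin while leaving its phase untouched.
fn scale_bin(re: f32, im: f32, gain: f32) -> (f32, f32) {
    let (mag, phase) = to_polar(re, im);
    from_polar(mag * gain, phase)
}
```

Scaling the magnitude with the phase held fixed is the same as multiplying the complex value by a real gain, so a plain gain change does not strictly need the polar detour; the polar form pays off when magnitude and phase are modified independently.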
let mut spectrum = stft.process(&signal);
// Keep only frequencies between 300 Hz and 3000 Hz
let sample_rate = 44100.0;
let freq_resolution = sample_rate / config.fft_size as f32;
let low_bin = (300.0 / freq_resolution) as usize;
let high_bin = (3000.0 / freq_resolution) as usize;
spectrum.zero_bins(0..low_bin);
spectrum.zero_bins(high_bin..spectrum.freq_bins);
let filtered = istft.process(&spectrum);
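The Hz-to-bin arithmetic in the band-pass example can be factored into small helpers. This is a sketch with hypothetical names (`hz_to_bin`, `bin_to_hz`), not crate functions; it truncates like the example above and additionally clamps to the Nyquist bin so an out-of-range frequency cannot index past the spectrum.

```rust
/// Map a frequency in Hz to an FFT bin index, truncating toward zero and
/// clamping to the Nyquist bin (fft_size / 2). Hypothetical helper.
fn hz_to_bin(freq_hz: f32, sample_rate: f32, fft_size: usize) -> usize {
    let resolution = sample_rate / fft_size as f32; // Hz per bin
    let bin = (freq_hz / resolution) as usize;
    bin.min(fft_size / 2)
}

/// Frequency in Hz at the start of a bin.
fn bin_to_hz(bin: usize, sample_rate: f32, fft_size: usize) -> f32 {
    bin as f32 * sample_rate / fft_size as f32
}
```

At 44.1 kHz with a 4096-point FFT each bin spans roughly 10.77 Hz, so the 300 Hz and 3000 Hz edges land on bins 27 and 278 and the actual cutoffs sit slightly below the requested frequencies.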
Process stereo, 5.1, or any channel count. Channels are processed in parallel with rayon (enabled by default):
// Planar: separate Vec per channel
let left = vec![0.0; 44100];
let right = vec![0.0; 44100];
let spectra = stft.process_multichannel(&[left, right]); // Parallel with rayon
// Interleaved: L,R,L,R...
let interleaved = vec![0.0; 88200];
let spectra = stft.process_interleaved(&interleaved, 2);
// Convert between formats
let channels = deinterleave(&interleaved, 2);
let interleaved = interleave(&channels);
See examples/multichannel_stereo.rs and examples/multichannel_midside.rs for more.
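To make the two layouts concrete, here is a std-only sketch of the planar/interleaved conversion. The `_sketch` functions are hypothetical re-implementations written for illustration; they are not the crate's `interleave` and `deinterleave`, whose exact signatures and error handling may differ.

```rust
/// Split interleaved samples (L,R,L,R,...) into one Vec per channel.
fn deinterleave_sketch(samples: &[f32], num_channels: usize) -> Vec<Vec<f32>> {
    assert!(num_channels > 0 && samples.len() % num_channels == 0);
    let frames = samples.len() / num_channels;
    let mut channels: Vec<Vec<f32>> = (0..num_channels)
        .map(|_| Vec::with_capacity(frames))
        .collect();
    // Each chunk holds one sample per channel for a single time step.
    for frame in samples.chunks_exact(num_channels) {
        for (channel, &sample) in channels.iter_mut().zip(frame) {
            channel.push(sample);
        }
    }
    channels
}

/// Merge equal-length planar channels back into interleaved order.
fn interleave_sketch(channels: &[Vec<f32>]) -> Vec<f32> {
    let frames = channels.first().map_or(0, |c| c.len());
    assert!(channels.iter().all(|c| c.len() == frames));
    let mut out = Vec::with_capacity(frames * channels.len());
    for i in 0..frames {
        for channel in channels {
            out.push(channel[i]);
        }
    }
    out
}
```

Planar data suits per-channel STFT processing because each channel is a contiguous slice, while interleaved data is what most audio I/O APIs deliver.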
stft-rs can run on embedded systems without the standard library! Perfect for audio processing on microcontrollers, DSPs, and bare-metal environments.
[dependencies]
stft-rs = { version = "0.5.0", default-features = false, features = ["microfft-backend"] }
Important notes:
- Requires an allocator (alloc crate)
#![no_std]
extern crate alloc;
use alloc::vec::Vec;
use stft_rs::prelude::*;
// Works great on embedded!
let config = StftConfigF32::builder()
.fft_size(2048) // Must be power-of-2
.hop_size(512)
.build()
.expect("Valid config");
let stft = BatchStftF32::new(config.clone());
let istft = BatchIstftF32::new(config);
let signal: Vec<f32> = audio_buffer.to_vec(); // Vec::from_slice does not exist in stable Rust
let spectrum = stft.process(&signal);
let reconstructed = istft.process(&spectrum);
- std (default): Standard library support with rustfft backend
- rustfft-backend: Use rustfft for FFT (supports f32/f64, any size)
- microfft-backend: Use microfft for no_std (f32 only, power-of-2 sizes)
- rayon: Enable parallel multi-channel processing (requires std)
Note: You cannot enable both rustfft-backend and microfft-backend at the same time.
- fft_size - hop_size samples of latency
Run the included examples:
# Basic batch processing
cargo run --example basic_usage
# Streaming processing with chunks
cargo run --example streaming_usage
# Spectral manipulation (filtering, time-varying processing)
cargo run --example spectral_processing
# Multi-channel stereo processing
cargo run --example multichannel_stereo
# Mid/side stereo width manipulation
cargo run --example multichannel_midside
# Advanced streaming with buffer reuse patterns
cargo run --example advanced_streaming
# Performance comparison of allocation strategies
cargo run --release --example buffer_reuse
# Type aliases usage demonstration
cargo run --example type_aliases
# Spectral operations (magnitude/phase, filtering)
cargo run --example spectral_operations
Spectrum stores data as [real_all, imag_all] for cache efficiency
X[k,n] = Σ x[n + m] * w[m] * e^(-j2πkm/N)
Where:
- x[n]: Input signal
- w[m]: Window function
- N: FFT size
- k: Frequency bin
- n: Frame index (hop positions)
OLA Mode:
x[n] = Σ IFFT(X[k,m]) / Σ w[n - m*hop]
WOLA Mode:
x[n] = Σ IFFT(X[k,m]) * w[n - m*hop] / Σ w²[n - m*hop]
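Both formulas divide by an overlap sum of the window, so reconstruction is well behaved when that sum stays away from zero. The sketch below evaluates the two denominators numerically; it assumes a periodic Hann window and uses hypothetical helper names, and the crate's own window definition may differ (for a symmetric Hann the sums are only approximately constant).

```rust
use std::f32::consts::PI;

/// Periodic Hann window of length `n` (assumed definition).
fn hann_periodic(n: usize) -> Vec<f32> {
    (0..n)
        .map(|i| 0.5 - 0.5 * (2.0 * PI * i as f32 / n as f32).cos())
        .collect()
}

/// Steady-state overlap sum of w^power at each position within one hop:
/// power = 1 gives the OLA denominator, power = 2 the WOLA denominator.
fn overlap_sum(window: &[f32], hop: usize, power: i32) -> Vec<f32> {
    (0..hop)
        .map(|n| {
            // Sample n is covered by window indices n, n + hop, n + 2*hop, ...
            (n..window.len())
                .step_by(hop)
                .map(|i| window[i].powi(power))
                .sum::<f32>()
        })
        .collect()
}
```

With a hop of one quarter of the window length, `overlap_sum(&hann_periodic(n), n / 4, 1)` is constant at 2 and the squared version is constant at 1.5, so in the steady state either mode reduces to dividing by a fixed number.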
Run the comprehensive test suite:
cargo test --lib
# With output
cargo test --lib -- --nocapture
Tests verify:
Core dependencies:
- num-traits: Generic numeric traits (no_std compatible with libm)
FFT backends (mutually exclusive):
- rustfft (default): High-performance FFT for std environments
- microfft (optional): Lightweight FFT for no_std/embedded
Optional dependencies:
- rayon: Parallel multi-channel processing (requires std)
License: MIT
Contributions welcome! Areas for improvement: