spectrograms

Crates.io: spectrograms
lib.rs: spectrograms
version: 0.1.0
created_at: 2026-01-23 14:18:31.714258+00
updated_at: 2026-01-23 14:18:31.714258+00
description: A focused Rust library for computing spectrograms with a simple, unified API
repository: https://github.com/jmg049/Spectrograms
id: 2064637
size: 8,531,785
Jack Geraghty (jmg049)

Spectrograms

High-performance spectrogram computation with Rust and Python bindings.

Features

  • Multiple Frequency Scales: Linear, Mel, ERB, and CQT
  • Multiple Amplitude Scales: Power, Magnitude, and Decibels
  • Advanced Audio Features: MFCC, Chromagram, and raw STFT
  • Plan-Based Computation: Reuse FFT plans for 2-10x speedup on batch processing
  • Two FFT Backends: FFTW (fastest) or pure-Rust RealFFT
  • Streaming Support: Frame-by-frame processing for real-time applications
  • Type-Safe Rust API: Compile-time guarantees for spectrogram types
  • Python Bindings: Fast computation with NumPy integration and GIL-free execution

Why Choose Spectrograms?

  • Cross-Language: Use from Rust or Python with consistent APIs
  • High Performance: Rust implementation, Python bindings with minimal overhead
  • Not Limited to One Type: Multiple frequency scales in a unified API
  • Production Ready: Efficient batch processing and streaming support
  • Well Documented: Comprehensive integration guide, examples, and API docs

Installation

Rust:

[dependencies]
spectrograms = "0.1"

For pure-Rust FFT (no system dependencies):

[dependencies]
spectrograms = { version = "0.1", default-features = false, features = ["realfft"] }

Python:

pip install spectrograms

For the FFTW-accelerated version (requires the system FFTW library):

pip install spectrograms-fftw

Quick Start

Generate a Test Signal

Rust:

use std::f64::consts::PI;

// 1 second of 440 Hz sine wave
let sample_rate = 16000.0;
let samples: Vec<f64> = (0..16000)
    .map(|i| {
        let t = i as f64 / sample_rate;
        (2.0 * PI * 440.0 * t).sin()
    })
    .collect();

Python:

import numpy as np

# 1 second of 440 Hz sine wave
sample_rate = 16000
# Sample times i / sample_rate, matching the Rust version
# (np.linspace(0, 1, n) would include the endpoint and stretch the spacing)
t = np.arange(sample_rate, dtype=np.float64) / sample_rate
samples = np.sin(2 * np.pi * 440 * t)

Compute a Basic Spectrogram

Rust:

use spectrograms::*;

// Configure parameters
let stft = StftParams::new(
    512,                 // FFT size
    256,                 // hop size
    WindowType::Hanning, // window
    true                 // centre frames
)?;

let params = SpectrogramParams::new(
    stft,
    sample_rate
)?;

// Compute power spectrogram
let spec = LinearPowerSpectrogram::compute(
    &samples,
    &params,
    None
)?;

println!("Shape: {} bins × {} frames",
    spec.n_bins(), spec.n_frames());

Python:

import spectrograms as sg

# Configure parameters
stft = sg.StftParams(
    n_fft=512,
    hop_size=256,
    window=sg.WindowType.hanning(),
    centre=True
)

params = sg.SpectrogramParams(
    stft,
    sample_rate=sample_rate
)

# Compute power spectrogram
spec = sg.compute_linear_power_spectrogram(
    samples,
    params
)

print(f"Shape: {spec.n_bins} bins × {spec.n_frames} frames")
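
The printed shape can be sanity-checked by hand. The sketch below assumes the common STFT conventions: a one-sided spectrum with n_fft/2 + 1 bins and, when frames are centred, n_fft/2 samples of padding on each side. The helper name `expected_shape` is hypothetical, and the library's exact framing rule may differ.

```python
def expected_shape(n_samples, n_fft, hop_size, centre):
    """Return (n_bins, n_frames) under common STFT conventions (an assumption)."""
    # A real-valued input yields a one-sided spectrum: DC up to Nyquist inclusive.
    n_bins = n_fft // 2 + 1
    # Centring pads n_fft // 2 samples on both sides of the signal.
    padded = n_samples + 2 * (n_fft // 2) if centre else n_samples
    if padded < n_fft:
        return n_bins, 0
    n_frames = 1 + (padded - n_fft) // hop_size
    return n_bins, n_frames


print(expected_shape(16000, 512, 256, centre=True))  # (257, 63)
```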

Mel Spectrogram Example

Rust:

use spectrograms::*;

let stft = StftParams::new(512, 256, WindowType::Hanning, true)?;
let params = SpectrogramParams::new(stft, 16000.0)?;

// Mel filterbank
let mel = MelParams::new(
    80,      // n_mels
    0.0,     // f_min
    8000.0   // f_max
)?;

// dB scaling
let db = LogParams::new(-80.0)?;

// Compute mel spectrogram in dB
let spec = MelDbSpectrogram::compute(
    &samples, &params, &mel, Some(&db)
)?;

// Access data
println!("Mel bands: {}", spec.n_bins());
println!("Frames: {}", spec.n_frames());
println!("Frequency range: {:?}",
    spec.axes().frequency_range());

Python:

import spectrograms as sg

stft = sg.StftParams(512, 256, sg.WindowType.hanning(), True)
params = sg.SpectrogramParams(stft, 16000)

# Mel filterbank
mel = sg.MelParams(
    n_mels=80,
    f_min=0.0,
    f_max=8000.0
)

# dB scaling
db = sg.LogParams(floor_db=-80.0)

# Compute mel spectrogram in dB
spec = sg.compute_mel_db_spectrogram(
    samples, params, mel, db
)

# Access data
print(f"Mel bands: {spec.n_bins}")
print(f"Frames: {spec.n_frames}")
print(f"Frequency range: {spec.frequency_range()}")
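
To give a feel for where 80 mel bands between 0 and 8000 Hz end up, here is a rough standalone sketch of the mel mapping. It assumes the HTK-style formula; other variants (for example Slaney's) exist, and which one this library uses is not stated here. The function names are hypothetical and not part of the spectrograms API.

```python
import math


def hz_to_mel(f_hz):
    """HTK-style mel conversion (assumed variant)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_centre_frequencies(n_mels, f_min, f_max):
    """Centre frequencies (Hz) of n_mels triangular filters evenly spaced in mel."""
    m_lo, m_hi = hz_to_mel(f_min), hz_to_mel(f_max)
    # n_mels filters need n_mels + 2 edge points; the centres are the interior ones.
    step = (m_hi - m_lo) / (n_mels + 1)
    return [mel_to_hz(m_lo + step * (i + 1)) for i in range(n_mels)]


centres = mel_centre_frequencies(80, 0.0, 8000.0)
print(f"{len(centres)} bands, first {centres[0]:.1f} Hz, last {centres[-1]:.1f} Hz")
```

The centres are packed densely at low frequencies and spread out towards f_max, which is the perceptual motivation behind the scale.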

Efficient Batch Processing

Reuse FFT plans for 2-10x speedup when processing multiple signals:

Rust:

use spectrograms::*;

let signals = vec![
    vec![0.0; 16000],
    vec![0.0; 16000],
    vec![0.0; 16000],
];

let stft = StftParams::new(512, 256, WindowType::Hanning, true)?;
let params = SpectrogramParams::new(stft, 16000.0)?;
let mel = MelParams::new(80, 0.0, 8000.0)?;
let db = LogParams::new(-80.0)?;

// Create plan once
let planner = SpectrogramPlanner::new();
let mut plan = planner.mel_db_plan(
    &params, &mel, Some(&db)
)?;

// Reuse for all signals (much faster!)
for signal in signals {
    let spec = plan.compute(&signal)?;
    // Process spec...
}

Python:

import spectrograms as sg
import numpy as np

signals = [
    np.random.randn(16000),
    np.random.randn(16000),
    np.random.randn(16000),
]

stft = sg.StftParams(512, 256, sg.WindowType.hanning(), True)
params = sg.SpectrogramParams(stft, 16000)
mel = sg.MelParams(80, 0.0, 8000.0)
db = sg.LogParams(-80.0)

# Create plan once
planner = sg.SpectrogramPlanner()
plan = planner.mel_db_plan(params, mel, db)

# Reuse for all signals (much faster!)
for signal in signals:
    spec = plan.compute(signal)
    # Process spec...
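
Why a plan helps can be illustrated without the library. The toy class below is a stand-in and not how spectrograms is implemented: it precomputes the window and the DFT twiddle factors once, then reuses them for every frame. `DftPlan` and its method are hypothetical names, and the naive O(n²) DFT is used only to keep the sketch dependency-free.

```python
import cmath
import math


class DftPlan:
    """Precomputes everything that depends only on n_fft, not on the signal."""

    def __init__(self, n_fft):
        self.n_fft = n_fft
        # Periodic Hann window coefficients, computed once.
        self.window = [0.5 - 0.5 * math.cos(2 * math.pi * n / n_fft) for n in range(n_fft)]
        # Twiddle factors exp(-2*pi*i*j/n_fft), computed once, indexed by (k*n) mod n_fft.
        self.twiddles = [cmath.exp(-2j * math.pi * j / n_fft) for j in range(n_fft)]

    def power_spectrum(self, frame):
        """One-sided power spectrum |X[k]|^2 of a single frame (naive DFT)."""
        n_fft = self.n_fft
        windowed = [x * w for x, w in zip(frame, self.window)]
        spectrum = []
        for k in range(n_fft // 2 + 1):
            acc = sum(windowed[n] * self.twiddles[(k * n) % n_fft] for n in range(n_fft))
            spectrum.append(abs(acc) ** 2)
        return spectrum


plan = DftPlan(64)  # setup cost paid once
# Three test frames: sines centred exactly on bins 4, 8 and 16.
frames = [[math.sin(2 * math.pi * b * n / 64) for n in range(64)] for b in (4, 8, 16)]
peaks = [max(range(33), key=plan.power_spectrum(f).__getitem__) for f in frames]
print(peaks)  # [4, 8, 16]
```

A real FFT plan caches more than this (factorisation strategy, scratch buffers), but the principle is the same: anything that depends only on the parameters is built once and shared across signals.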

Advanced Features

MFCCs (Mel-Frequency Cepstral Coefficients)

Rust:

use spectrograms::*;

let stft = StftParams::new(512, 160, WindowType::Hanning, true)?;
let mfcc_params = MfccParams::new(13)?;

let mfccs = compute_mfcc(
    &samples,
    &stft,
    16000.0,
    40,  // n_mels
    &mfcc_params
)?;

// Shape: (13, n_frames)
println!("MFCCs: {} × {}", mfccs.nrows(), mfccs.ncols());

Python:

import spectrograms as sg

stft = sg.StftParams(512, 160, sg.WindowType.hanning(), True)
mfcc_params = sg.MfccParams(n_mfcc=13)

mfccs = sg.compute_mfcc(
    samples,
    stft,
    sample_rate=16000,
    n_mels=40,
    mfcc_params=mfcc_params
)

# Shape: (13, n_frames)
print(f"MFCCs: {mfccs.shape}")
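
For readers unfamiliar with what happens between the 40 mel bands and the 13 coefficients, the textbook recipe is a log compression followed by a DCT-II. The sketch below follows that recipe for a single frame; the orthonormal scaling, the natural logarithm, and the `eps` floor are assumptions, and the function names are hypothetical rather than part of this library.

```python
import math


def dct_ii_ortho(values, n_out):
    """First n_out coefficients of the orthonormal DCT-II of values."""
    n = len(values)
    coeffs = []
    for k in range(n_out):
        acc = sum(v * math.cos(math.pi * k * (2 * i + 1) / (2 * n)) for i, v in enumerate(values))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        coeffs.append(scale * acc)
    return coeffs


def mfcc_from_mel_power(mel_power, n_mfcc=13, eps=1e-10):
    """MFCCs for one frame: log-compress the mel energies, then apply a DCT."""
    log_mel = [math.log(max(p, eps)) for p in mel_power]
    return dct_ii_ortho(log_mel, n_mfcc)


# A flat 40-band mel spectrum puts all of its energy into coefficient 0.
flat = mfcc_from_mel_power([1e-3] * 40)
print(len(flat), all(abs(c) < 1e-9 for c in flat[1:]))  # 13 True
```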

Chromagram (Pitch Class Profiles)

Rust:

use spectrograms::*;

let stft = StftParams::new(4096, 512, WindowType::Hanning, true)?;
let chroma_params = ChromaParams::music_standard();

let chroma = compute_chromagram(
    &samples,
    &stft,
    22050.0,
    &chroma_params
)?;

// Shape: (12, n_frames) - one row per pitch class
println!("Chroma: {} × {}", chroma.nrows(), chroma.ncols());

Python:

import spectrograms as sg

stft = sg.StftParams(4096, 512, sg.WindowType.hanning(), True)
chroma_params = sg.ChromaParams.music_standard()

chroma = sg.compute_chromagram(
    samples,
    stft,
    sample_rate=22050,
    chroma_params=chroma_params
)

# Shape: (12, n_frames)
print(f"Chroma: {chroma.shape}")
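
The 12 rows correspond to the 12 pitch classes of the equal-tempered scale. A minimal sketch of the folding idea, assuming A4 = 440 Hz tuning and hard assignment of each frequency bin to its nearest pitch class (real chroma filterbanks, possibly including this library's, typically use smoother weighting). All names below are hypothetical.

```python
import math

PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def pitch_class(freq_hz, a4_hz=440.0):
    """Index 0..11 (C = 0) of the equal-tempered pitch class nearest to freq_hz."""
    # MIDI note number: A4 is 69, and each semitone is a factor of 2^(1/12).
    midi = 69 + 12 * math.log2(freq_hz / a4_hz)
    return int(round(midi)) % 12


def fold_into_chroma(freqs_hz, powers):
    """Sum the power of each frequency bin into its pitch class (one frame)."""
    chroma = [0.0] * 12
    for f, p in zip(freqs_hz, powers):
        if f > 0.0:  # the DC bin has no pitch
            chroma[pitch_class(f)] += p
    return chroma


# Three octaves of A plus a weaker middle C.
chroma = fold_into_chroma([220.0, 440.0, 880.0, 261.63], [1.0, 2.0, 1.0, 0.5])
print(PITCH_CLASSES[chroma.index(max(chroma))])  # A
```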

Supported Spectrogram Types

Frequency Scales

  • Linear (LinearHz): Standard FFT bins, evenly spaced in Hz
  • Mel (Mel): Mel-frequency scale, perceptually motivated for speech/audio
  • ERB (Erb): Equivalent Rectangular Bandwidth, models auditory perception
  • CQT: Constant-Q Transform for music analysis
  • Log (LogHz): Logarithmic frequency spacing

Amplitude Scales

Scale       Formula            Use Case
Power       |X|²               Energy analysis, ML features
Magnitude   |X|                Spectral analysis, phase vocoder
Decibels    10·log₁₀(power)    Visualization, perceptual analysis
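
The three scales are related by simple element-wise formulas. The sketch below applies them to single values; the way the floor is applied (an absolute clamp at `floor_db`) is an assumption about what a parameter like LogParams(-80.0) means, and the function names are hypothetical.

```python
import math


def magnitude_to_power(magnitude):
    """Power is the squared magnitude: |X|^2."""
    return magnitude * magnitude


def power_to_db(power, floor_db=-80.0):
    """10*log10(power), clamped from below at floor_db (assumed clamping rule)."""
    if power <= 0.0:
        return floor_db
    return max(10.0 * math.log10(power), floor_db)


for mag in (1.0, 0.1, 0.0):
    p = magnitude_to_power(mag)
    print(f"|X| = {mag}: power = {p:.2g}, dB = {power_to_db(p):.1f}")
```

The floor keeps silent bins from producing negative infinity, which matters for both plotting and ML features.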

Type Aliases (Rust)

// Linear frequency
type LinearPowerSpectrogram = Spectrogram<LinearHz, Power>;
type LinearMagnitudeSpectrogram = Spectrogram<LinearHz, Magnitude>;
type LinearDbSpectrogram = Spectrogram<LinearHz, Decibels>;

// Mel frequency
type MelPowerSpectrogram = Spectrogram<Mel, Power>;
type MelMagnitudeSpectrogram = Spectrogram<Mel, Magnitude>;
type MelDbSpectrogram = Spectrogram<Mel, Decibels>;

// ERB frequency
type ErbPowerSpectrogram = Spectrogram<Erb, Power>;
type ErbMagnitudeSpectrogram = Spectrogram<Erb, Magnitude>;
type ErbDbSpectrogram = Spectrogram<Erb, Decibels>;

Window Functions

Supported window functions with different frequency/time resolution trade-offs:

  • rectangular: No windowing (best frequency resolution, high leakage)
  • hanning: Good general-purpose window (default)
  • hamming: Similar to Hanning with different coefficients
  • blackman: Low sidelobes, wider main lobe
  • bartlett: Triangular window
  • kaiser=<beta>: Tunable trade-off (β controls shape, e.g., kaiser=5.0)
  • gaussian=<std>: Smooth roll-off (e.g., gaussian=0.4)

Rust:

// Parse from string
let window: WindowType = "hanning".parse()?;
let kaiser: WindowType = "kaiser=8.0".parse()?;

// Or use constructors
let hann = WindowType::Hanning;
let gauss = WindowType::Gaussian { std: 0.4 };

Python:

# Use class methods
window = sg.WindowType.hanning()
kaiser = sg.WindowType.kaiser(beta=8.0)
gauss = sg.WindowType.gaussian(std=0.4)

# Or from string
stft = sg.StftParams(512, 256, "kaiser=8.0", True)
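
For reference, the classic cosine-sum windows in the list above can be written out in a few lines. This is a standalone sketch using the textbook symmetric definitions and coefficients; whether the library uses the symmetric or the periodic form is not stated here, so treat the exact values as illustrative.

```python
import math


def hann(n):
    """Symmetric Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def hamming(n):
    """Symmetric Hamming window (classic 0.54 / 0.46 coefficients)."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def blackman(n):
    """Symmetric Blackman window (classic 0.42 / 0.5 / 0.08 coefficients)."""
    return [
        0.42
        - 0.5 * math.cos(2 * math.pi * i / (n - 1))
        + 0.08 * math.cos(4 * math.pi * i / (n - 1))
        for i in range(n)
    ]


def coherent_gain(window):
    """Mean of the window: how much a steady sinusoid's amplitude is scaled."""
    return sum(window) / len(window)


for name, fn in (("hanning", hann), ("hamming", hamming), ("blackman", blackman)):
    w = fn(513)
    print(f"{name}: endpoint = {w[0]:.2f}, coherent gain = {coherent_gain(w):.3f}")
```

Hamming does not taper fully to zero at the edges, and Blackman has the lowest coherent gain of the three, consistent with its wider main lobe.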

Default Presets

Rust:

// Speech processing preset
// n_fft=512, hop_size=160
let params = SpectrogramParams::speech_default(16000.0)?;

// Music processing preset
// n_fft=2048, hop_size=512
let params = SpectrogramParams::music_default(44100.0)?;

Python:

# Speech processing preset
params = sg.SpectrogramParams.speech_default(sample_rate=16000)

# Music processing preset
params = sg.SpectrogramParams.music_default(sample_rate=44100)
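
What the preset values mean in time and frequency terms follows from simple arithmetic on n_fft, hop_size, and the sample rate. A small sketch (the helper name `stft_resolution` is hypothetical):

```python
def stft_resolution(n_fft, hop_size, sample_rate):
    """Frame length (ms), hop (ms), and FFT bin spacing (Hz) for an STFT configuration."""
    frame_ms = 1000.0 * n_fft / sample_rate
    hop_ms = 1000.0 * hop_size / sample_rate
    bin_hz = sample_rate / n_fft
    return frame_ms, hop_ms, bin_hz


for label, n_fft, hop, sr in (("speech", 512, 160, 16000.0), ("music", 2048, 512, 44100.0)):
    frame_ms, hop_ms, bin_hz = stft_resolution(n_fft, hop, sr)
    print(f"{label}: frame {frame_ms:.1f} ms, hop {hop_ms:.1f} ms, bin spacing {bin_hz:.2f} Hz")
```

The speech preset works out to 32 ms frames with a 10 ms hop, while the music preset trades time resolution for finer frequency spacing.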

Accessing Results

Rust:

let spec = LinearPowerSpectrogram::compute(&samples, &params, None)?;

// Dimensions
let n_bins = spec.n_bins();
let n_frames = spec.n_frames();

// Data (ndarray::Array2<f64>)
let data = spec.data();

// Axes
let freqs = spec.axes().frequencies();
let times = spec.axes().times();
let (f_min, f_max) = spec.axes().frequency_range();
let duration = spec.axes().duration();

// Original parameters
let params = spec.params();

Python:

spec = sg.compute_linear_power_spectrogram(samples, params)

# Dimensions
n_bins = spec.n_bins
n_frames = spec.n_frames

# Data (numpy array)
data = spec.data  # shape: (n_bins, n_frames)

# Axes
freqs = spec.frequencies
times = spec.times
f_min, f_max = spec.frequency_range()
duration = spec.duration()

# Original parameters
params = spec.params
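
For a linear spectrogram, the frequency and time axes can be reconstructed from the parameters alone, which is handy for locating the bin nearest a frequency of interest. The sketch below assumes bin k sits at k · sample_rate / n_fft and that frame t is stamped at t · hop_size / sample_rate; the library's time convention (frame start versus frame centre) may differ, and the helper names are hypothetical.

```python
def frequency_axis(n_fft, sample_rate):
    """Centre frequency (Hz) of each one-sided FFT bin."""
    return [k * sample_rate / n_fft for k in range(n_fft // 2 + 1)]


def time_axis(n_frames, hop_size, sample_rate):
    """Timestamp (s) of each frame, assuming frame t sits at sample t * hop_size."""
    return [t * hop_size / sample_rate for t in range(n_frames)]


def nearest_bin(freqs, target_hz):
    """Index of the bin whose centre frequency is closest to target_hz."""
    return min(range(len(freqs)), key=lambda k: abs(freqs[k] - target_hz))


freqs = frequency_axis(512, 16000.0)
times = time_axis(63, 256, 16000.0)
k = nearest_bin(freqs, 440.0)
print(k, freqs[k], times[1])  # 14 437.5 0.016
```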

Examples

Comprehensive examples are available in both languages: Rust examples in examples/ and Python examples in python/examples/.

Rust:

cargo run --example basic_linear
cargo run --example mel_spectrogram

Python:

python python/examples/basic_linear.py
python python/examples/mel_spectrogram.py

Documentation


Feature Flags (Rust)

The Rust library requires exactly one FFT backend:

  • fftw: Uses FFTW for FFT computation

    • Fastest performance
    • Requires system FFTW library (libfftw3-dev on Ubuntu/Debian)
    • Not pure Rust
  • realfft (default): Pure-Rust FFT implementation

    • No system dependencies
    • Slightly slower than FFTW
    • Works everywhere

Additional flags:

  • python (default): Enables Python bindings
  • serde: Enables serialization support

Example Cargo.toml configurations (pick one):

# Pure Rust, no Python
[dependencies]
spectrograms = { version = "0.1", default-features = false, features = ["realfft"] }

# FFTW backend with Python
[dependencies]
spectrograms = { version = "0.1", default-features = false, features = ["fftw", "python"] }

Performance Tips

  1. Reuse plans: Use SpectrogramPlanner for 2-10x speedup on batch processing
  2. Choose power-of-2 FFT sizes: Best performance (512, 1024, 2048, 4096)
  3. Use FFTW backend: Maximum speed when system dependencies are acceptable
  4. Python GIL: All compute functions release the GIL for parallelism
  5. Streaming: Use frame-by-frame processing for real-time applications
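
To make the streaming tip concrete, here is a minimal sketch of the buffering that frame-by-frame processing needs when audio arrives in arbitrary chunk sizes. `FrameBuffer` is a hypothetical helper, not the library's streaming API; each emitted frame would then be handed to whatever per-frame computation is in use.

```python
class FrameBuffer:
    """Accumulates incoming sample chunks and emits overlapping frames."""

    def __init__(self, n_fft, hop_size):
        self.n_fft = n_fft
        self.hop_size = hop_size
        self._pending = []

    def push(self, chunk):
        """Add samples; return every complete frame that is now available."""
        self._pending.extend(chunk)
        frames = []
        while len(self._pending) >= self.n_fft:
            frames.append(self._pending[: self.n_fft])
            # Drop only hop_size samples so consecutive frames overlap.
            del self._pending[: self.hop_size]
        return frames


frame_buffer = FrameBuffer(n_fft=512, hop_size=256)
emitted = 0
for start in range(0, 16000, 100):  # audio arriving 100 samples at a time
    emitted += len(frame_buffer.push([0.0] * 100))
print(emitted)  # 61
```

The frame count matches what an offline, non-centred STFT would produce for the same 16000 samples, so streaming and batch results can be compared frame for frame.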

License

This project is licensed under the MIT License - see the LICENSE file for details.


Contributing

Contributions are welcome! Please see CONTRIBUTING.md for guidelines.


Citation

If you use this library in academic work, please cite:

@software{spectrograms2025,
  author = {Geraghty, Jack},
  title = {Spectrograms: High-Performance Spectrogram Computation},
  year = {2025},
  url = {https://github.com/jmg049/Spectrograms}
}

Note: This library focuses on spectrogram computation. For complete audio analysis pipelines, combine it with audio I/O libraries like audio_samples and your preferred plotting tools.
