birdnet-onnx


A Rust library for running inference on BirdNET and Perch ONNX models with CUDA GPU support.

Features

  • Support for BirdNET v2.4, v3.0, and Perch v2 models
  • Automatic model type detection from ONNX tensor shapes
  • Thread-safe classifier with builder pattern
  • Top-K predictions with configurable confidence threshold
  • Batch inference for GPU efficiency
  • CLI tool for WAV file analysis

Supported Models

Model          Sample Rate   Segment   Embeddings
BirdNET v2.4   48 kHz        3.0 s     No
BirdNET v3.0   32 kHz        5.0 s     1024-dim
Perch v2       32 kHz        5.0 s     Variable
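
Sample rate times segment length gives the number of samples each model expects per inference window (the constant names below are illustrative, not part of the API):

// Samples per inference window, derived from the table above:
const BIRDNET_V24_SAMPLES: usize = 48_000 * 3; // 144,000
const BIRDNET_V30_SAMPLES: usize = 32_000 * 5; // 160,000
const PERCH_V2_SAMPLES: usize = 32_000 * 5;    // 160,000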

Installation

Add to your Cargo.toml:

[dependencies]
birdnet-onnx = "2.0"

ONNX Runtime Linking

By default, ONNX Runtime is statically linked into your binary using the download-binaries feature. This means:

  • No external onnxruntime.dll/libonnxruntime.so needed at runtime
  • No DLL search order issues on Windows
  • Larger binary size (~50MB additional)

For dynamic linking (loads ONNX Runtime at runtime):

[dependencies]
birdnet-onnx = { version = "2.0", features = ["load-dynamic"] }

With dynamic linking, you must ensure the correct ONNX Runtime library is available; see the sketch after this list:

  • Set ORT_DYLIB_PATH environment variable to the library path, or
  • Place the library in the executable's directory, or
  • Install ONNX Runtime system-wide
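
A minimal sketch of the first option, assuming the load-dynamic feature and an illustrative library path (the variable must be set before ONNX Runtime is first initialized):

use birdnet_onnx::{Classifier, Result};

fn main() -> Result<()> {
    // Must run before the first session is created and before other
    // threads start; std::env::set_var mutates process-global state.
    std::env::set_var("ORT_DYLIB_PATH", "/opt/onnxruntime/lib/libonnxruntime.so");

    let _classifier = Classifier::builder()
        .model_path("model.onnx")
        .labels_path("labels.txt")
        .build()?;
    Ok(())
}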

CUDA and cuDNN libraries always load dynamically from system paths regardless of linking mode.

Features can be combined: --features cuda,load-dynamic enables CUDA with dynamic ONNX Runtime loading.
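
In Cargo.toml, the combined form looks like this:

[dependencies]
birdnet-onnx = { version = "2.0", features = ["cuda", "load-dynamic"] }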

Library Usage

use birdnet_onnx::{Classifier, InferenceOptions, Result};

fn main() -> Result<()> {
    // Build classifier
    let classifier = Classifier::builder()
        .model_path("birdnet_v24.onnx")
        .labels_path("labels.txt")
        .top_k(5)
        .min_confidence(0.1)
        .build()?;

    // Prepare one audio segment (48 kHz x 3.0 s = 144,000 samples for v2.4).
    // load_audio_segment() stands in for your own audio-loading code.
    let audio: Vec<f32> = load_audio_segment();

    // Run inference
    let result = classifier.predict(&audio, &InferenceOptions::default())?;

    for pred in &result.predictions {
        println!("{}: {:.1}%", pred.species, pred.confidence * 100.0);
    }

    Ok(())
}
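
A minimal sketch of what load_audio_segment could look like, using the hound crate for WAV decoding (hound is an illustrative choice, not a dependency of this library; the input is assumed to already be 48 kHz mono 16-bit PCM, so resampling is not shown):

use hound::WavReader;

fn load_audio_segment() -> Vec<f32> {
    let mut reader = WavReader::open("recording.wav").expect("failed to open WAV");
    // Convert i16 PCM to f32 in [-1.0, 1.0] and take one 3.0 s window at 48 kHz.
    reader
        .samples::<i16>()
        .map(|s| s.expect("bad sample") as f32 / i16::MAX as f32)
        .take(144_000)
        .collect()
}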

With CUDA GPU

use birdnet_onnx::Classifier;

let classifier = Classifier::builder()
    .model_path("model.onnx")
    .labels_path("labels.txt")
    .with_cuda()  // Uses safe defaults for memory allocation
    .build()?;

For fine-grained control over CUDA memory allocation:

use birdnet_onnx::{Classifier, CUDAConfig, ArenaExtendStrategy};

let classifier = Classifier::builder()
    .model_path("model.onnx")
    .labels_path("labels.txt")
    .with_cuda_config(
        CUDAConfig::new()
            .with_memory_limit(4 * 1024 * 1024 * 1024)  // 4GB limit
            .with_arena_extend_strategy(ArenaExtendStrategy::SameAsRequested)
    )
    .build()?;

Batch Inference

use birdnet_onnx::InferenceOptions;

// chunk_audio_file() stands in for your own audio-chunking code.
let segments: Vec<Vec<f32>> = chunk_audio_file();
let refs: Vec<&[f32]> = segments.iter().map(|s| s.as_slice()).collect();

let results = classifier.predict_batch(&refs, &InferenceOptions::default())?;

GPU Memory-Efficient Batch Processing

For processing many batches on GPU, use BatchInferenceContext to prevent memory growth:

use birdnet_onnx::{Classifier, InferenceOptions};

let classifier = Classifier::builder()
    .model_path("model.onnx")
    .labels_path("labels.txt")
    .with_cuda()
    .build()?;

// Create context with pre-allocated buffers (max 32 segments per batch)
let mut ctx = classifier.create_batch_context(32)?;

// Process multiple batches - memory is reused across calls
for chunk in audio_segments.chunks(32) {
    let refs: Vec<&[f32]> = chunk.iter().map(|s| s.as_slice()).collect();
    let results = classifier.predict_batch_with_context(
        &mut ctx,
        &refs,
        &InferenceOptions::default(),
    )?;
}

Timeout and Cancellation

use birdnet_onnx::{InferenceOptions, CancellationToken};
use std::time::Duration;

// With timeout
let options = InferenceOptions::timeout(Duration::from_secs(30));
let result = classifier.predict(&audio, &options)?;

// With cancellation token (for graceful shutdown)
let token = CancellationToken::new();
let options = InferenceOptions::new().with_cancellation_token(token.clone());

// Cancel from another thread (e.g. on a shutdown signal)
std::thread::spawn(move || token.cancel());

Execution Provider Query

Query which execution provider was requested:

let classifier = Classifier::builder()
    .model_path("model.onnx")
    .labels_path("labels.txt")
    .with_cuda()
    .build()?;

// Returns the requested provider (not necessarily the active one)
println!("Requested: {}", classifier.requested_provider().as_str());

Note: This returns the requested execution provider. If the requested provider is unavailable, ONNX Runtime silently falls back to CPU. To verify which provider is actually running:

  1. Enable verbose logging: export ORT_LOG_LEVEL=Verbose
  2. Check log output for "Using [provider]" messages

CLI Usage

A basic CLI tool is included for quick testing of the library. It is not intended for production analysis tasks.

Build:

cargo build --release --bin birdnet-analyze

Analyze a WAV file:

birdnet-analyze recording.wav -m birdnet_v24.onnx -l labels.txt

With options:

birdnet-analyze recording.wav \
    -m birdnet_v24.onnx \
    -l labels.txt \
    -o 1.5 \
    -k 5 \
    --min-confidence 0.2 \
    --batch-size 32 \
    --timeout 30 \
    -v

  • -o 1.5: 1.5 s overlap between segments
  • -k 5: top 5 predictions
  • --batch-size 32: segments per batch
  • --timeout 30: per-batch timeout in seconds
  • -v: verbose output (shows timing, memory usage)

With GPU acceleration:

birdnet-analyze recording.wav -m model.onnx -l labels.txt --cuda
birdnet-analyze recording.wav -m model.onnx -l labels.txt --tensorrt

Example output:

Analyzing: recording.wav (3m 21s, 48000 Hz)
Model: BirdNET v2.4 (3.0s segments, 1.5s overlap)
Provider: CUDA

00:00.0  Eurasian Pygmy-Owl (92.4%)
00:01.5  Eurasian Pygmy-Owl (97.8%)
00:03.0  Eurasian Pygmy-Owl (98.5%)
...

134 segments analyzed in 1.2s
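
The segment count follows from the hop size (segment length minus overlap), as in this sketch; the displayed duration is rounded to whole seconds, so the exact count can differ by one (here 133 vs. the 134 reported):

fn segment_count(duration_s: f64, segment_s: f64, overlap_s: f64) -> usize {
    // Windows start every hop seconds while a full segment still fits.
    let hop = segment_s - overlap_s; // 3.0 - 1.5 = 1.5 s
    ((duration_s - segment_s) / hop).floor() as usize + 1
}
// segment_count(201.0, 3.0, 1.5) == 133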

Range Filter (Meta Model)

Filter species predictions by location and date using BirdNET's meta model:

use birdnet_onnx::RangeFilter;

// Load the meta model; labels holds the species list (e.g. loaded from labels.txt)
let range_filter = RangeFilter::builder()
    .model_path("birdnet_data_model.onnx")
    .labels(labels)
    .threshold(0.01)
    .build()?;

// Get species likely at location/date
// Helsinki, Finland on June 15th
let scores = range_filter.predict(60.1695, 24.9354, 6, 15)?;

println!("Expected {} species", scores.len());
for score in scores.iter().take(10) {
    println!("{}: {:.1}%", score.species, score.score * 100.0);
}

The meta model uses BirdNET's 48-week calendar (4 weeks per month).
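
predict takes the month (1-12) and day (1-31) directly; internally they map onto weeks 1-48. A plausible sketch of that mapping, assuming days fall into 7-day bins capped at 4 weeks per month (the exact binning BirdNET uses may differ):

fn week_48(month: u32, day: u32) -> u32 {
    // 4 weeks per month: days 1-7 -> week 1, 8-14 -> week 2, ...,
    // with anything past day 28 staying in week 4.
    let week_in_month = ((day - 1) / 7 + 1).min(4);
    (month - 1) * 4 + week_in_month
}
// week_48(6, 15) == 23 for the Helsinki example above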

Using RangeFilter for Location-Based Filtering

use birdnet_onnx::{Classifier, InferenceOptions, RangeFilter};

// Build classifier
let classifier = Classifier::builder()
    .model_path("birdnet.onnx")
    .labels_path("labels.txt")
    .build()?;

// Build range filter using classifier labels
let range_filter = RangeFilter::builder()
    .model_path("birdnet_data_model.onnx")
    .from_classifier_labels(classifier.labels())
    .threshold(0.01)
    .build()?;

// Get predictions
let result = classifier.predict(&audio_segment, &InferenceOptions::default())?;

// Filter by location (Helsinki, June 15th)
let location_scores = range_filter.predict(60.1695, 24.9354, 6, 15)?;
let filtered = range_filter.filter_predictions(
    &result.predictions,
    &location_scores,
    false, // rerank off: keep original confidences
);

for pred in filtered {
    println!("{}: {:.1}%", pred.species, pred.confidence * 100.0);
}

Batch processing multiple files:

// Calculate location scores once
let location_scores = range_filter.predict(lat, lon, month, day)?;

// Process multiple audio segments from same location
let mut predictions_batch = Vec::new();
for segment in audio_segments {
    let result = classifier.predict(&segment, &InferenceOptions::default())?;
    predictions_batch.push(result.predictions);
}

// Filter all predictions at once
let filtered_batch = range_filter.filter_batch_predictions(
    predictions_batch,
    &location_scores,
    true, // rerank: multiply confidence by location score
);

Development

Requires the Task runner:

task --list        # Show available tasks
task build         # Build in debug mode
task build:release # Build in release mode
task test          # Run unit tests
task lint          # Run clippy
task ci            # Run all CI checks

Acknowledgments

This library provides Rust bindings for running inference on models from these projects:

  • BirdNET-Analyzer - Bird sound identification by the K. Lisa Yang Center for Conservation Bioacoustics at the Cornell Lab of Ornithology and Chemnitz University of Technology
  • Perch - Bioacoustics research by Google Research

Built with:

  • ONNX Runtime - Cross-platform inference engine by Microsoft
  • ort - Rust bindings for ONNX Runtime

License

MIT
