| Crates.io | sevensense-analysis |
| lib.rs | sevensense-analysis |
| version | 0.1.0 |
| created_at | 2026-01-16 13:41:29.166184+00 |
| updated_at | 2026-01-16 13:41:29.166184+00 |
| description | Analysis bounded context for 7sense bioacoustics platform - clustering, motif detection, sequence analysis |
| homepage | |
| repository | https://github.com/ruvnet/ruvector |
| max_upload_size | |
| id | 2048630 |
| size | 243,231 |
Advanced acoustic analysis algorithms for bioacoustic pattern discovery.
sevensense-analysis provides sophisticated analysis tools for understanding bird vocalizations at scale. From clustering calls into groups, detecting recurring motifs, to modeling temporal patterns with Markov chains, it transforms raw embeddings into actionable ecological insights.
| Use Case | Description | Key Functions |
|---|---|---|
| Call-Type Clustering | Group similar vocalizations | hdbscan_cluster() |
| Sequence Analysis | Model call sequences | MarkovChain::analyze() |
| Motif Discovery | Find repeated patterns | detect_motifs() |
| Diversity Metrics | Shannon/Simpson indices | diversity_index() |
| Periodicity | Detect rhythmic patterns | detect_periodicity() |
| Anomaly Detection | Find unusual calls | anomaly_score() |
Add to your Cargo.toml:
[dependencies]
sevensense-analysis = "0.1"
use sevensense_analysis::{HdbscanClusterer, HdbscanConfig};
fn main() -> Result<(), Box<dyn std::error::Error>> {
// Cluster embeddings by call type
let config = HdbscanConfig {
min_cluster_size: 5,
min_samples: 3,
..Default::default()
};
let clusterer = HdbscanClusterer::new(config);
let labels = clusterer.fit(&embeddings)?;
// Count clusters (excluding noise = -1)
let n_clusters = labels.iter().filter(|&&l| l >= 0).max().unwrap_or(&-1) + 1;
println!("Found {} call types", n_clusters);
Ok(())
}
use sevensense_analysis::{HdbscanClusterer, HdbscanConfig};
let config = HdbscanConfig {
min_cluster_size: 5, // Minimum points per cluster
min_samples: 3, // Core point threshold
epsilon: 0.0, // 0 = automatic selection
metric: DistanceMetric::Euclidean,
};
let clusterer = HdbscanClusterer::new(config);
let result = clusterer.fit(&embeddings)?;
println!("Labels: {:?}", result.labels);
println!("Probabilities: {:?}", result.probabilities);
println!("Outlier scores: {:?}", result.outlier_scores);
use sevensense_analysis::{cluster_statistics, ClusterStats};
let stats = cluster_statistics(&embeddings, &labels)?;
for (cluster_id, stat) in stats.iter() {
println!("Cluster {}:", cluster_id);
println!(" Size: {}", stat.size);
println!(" Centroid: {:?}", &stat.centroid[..5]); // First 5 dims
println!(" Intra-cluster distance: {:.3}", stat.intra_distance);
println!(" Silhouette score: {:.3}", stat.silhouette);
}
// Assign new embeddings to existing clusters
let new_embeddings = load_new_data()?;
let assignments = clusterer.predict(&new_embeddings)?;
for (embedding, cluster) in new_embeddings.iter().zip(assignments.iter()) {
if *cluster >= 0 {
println!("Assigned to cluster {}", cluster);
} else {
println!("Classified as noise/outlier");
}
}
use sevensense_analysis::{MarkovChain, MarkovConfig};
// Sequences of cluster labels (call types)
let sequences: Vec<Vec<i32>> = vec![
vec![0, 1, 2, 0, 1], // Sequence 1
vec![0, 1, 0, 2, 1], // Sequence 2
vec![1, 2, 0, 1, 2], // Sequence 3
];
let config = MarkovConfig {
order: 1, // First-order Markov chain
smoothing: 0.01, // Laplace smoothing
};
let chain = MarkovChain::fit(&sequences, config)?;
// Get transition probabilities
let probs = chain.transition_matrix();
println!("P(1|0) = {:.3}", probs[(0, 1)]); // Probability of 1 given 0
// Predict next state
let current_state = 0;
let next_probs = chain.predict_next(current_state)?;
println!("Next state probabilities from state {}:", current_state);
for (state, prob) in next_probs.iter().enumerate() {
println!(" State {}: {:.3}", state, prob);
}
// Generate synthetic sequence
let generated = chain.generate(10, Some(0))?; // 10 states, starting from 0
println!("Generated sequence: {:?}", generated);
use sevensense_analysis::MarkovAnalysis;
let analysis = MarkovAnalysis::new(&chain);
// Stationary distribution
let stationary = analysis.stationary_distribution()?;
println!("Stationary distribution: {:?}", stationary);
// Entropy rate
let entropy = analysis.entropy_rate()?;
println!("Entropy rate: {:.3} bits", entropy);
// Expected hitting times
let hitting_times = analysis.mean_hitting_times()?;
println!("Mean hitting time 0→2: {:.2} steps", hitting_times[(0, 2)]);
use sevensense_analysis::{MotifDetector, MotifConfig};
let config = MotifConfig {
min_length: 3, // Minimum motif length
max_length: 10, // Maximum motif length
similarity_threshold: 0.85,
min_occurrences: 2,
};
let detector = MotifDetector::new(config);
let motifs = detector.detect(&embeddings)?;
for motif in &motifs {
println!("Motif found:");
println!(" Length: {} segments", motif.length);
println!(" Occurrences: {}", motif.occurrences.len());
println!(" Positions: {:?}", motif.positions());
println!(" Average similarity: {:.3}", motif.avg_similarity);
}
use sevensense_analysis::motif_to_sequence;
for motif in motifs.iter().take(5) {
// Get the representative sequence
let sequence = motif_to_sequence(&embeddings, motif)?;
println!("Motif #{} (len={})", motif.id, motif.length);
println!(" Representative: {:?}", sequence);
// Show all occurrences
for (i, occ) in motif.occurrences.iter().enumerate() {
println!(" Occurrence {}: positions {}-{}",
i, occ.start, occ.end);
}
}
// Find motifs that appear across multiple recordings
let recordings: Vec<(RecordingId, Vec<Embedding>)> = load_recordings()?;
let cross_motifs = detector.detect_cross_recording(&recordings)?;
for motif in cross_motifs {
println!("Cross-recording motif:");
println!(" Appears in {} recordings", motif.recording_ids.len());
println!(" Total occurrences: {}", motif.total_occurrences);
}
use sevensense_analysis::{shannon_index, simpson_index, species_richness};
// Count species occurrences
let species_counts = count_species(&labels)?;
let shannon = shannon_index(&species_counts);
let simpson = simpson_index(&species_counts);
let richness = species_richness(&species_counts);
println!("Shannon Index (H'): {:.3}", shannon);
println!("Simpson Index (D): {:.3}", simpson);
println!("Species Richness: {}", richness);
use sevensense_analysis::{sequence_entropy, normalized_entropy};
// Entropy of call sequences
let sequence: Vec<i32> = vec![0, 1, 2, 0, 1, 0, 2, 1, 0];
let entropy = sequence_entropy(&sequence);
let norm_entropy = normalized_entropy(&sequence);
println!("Sequence entropy: {:.3} bits", entropy);
println!("Normalized entropy: {:.3}", norm_entropy); // 0-1 scale
use sevensense_analysis::{detect_periodicity, PeriodicityConfig};
let config = PeriodicityConfig {
min_period: 2,
max_period: 100,
confidence_threshold: 0.7,
};
let timestamps: Vec<f64> = get_call_timestamps()?;
let periods = detect_periodicity(×tamps, config)?;
for (period, confidence) in periods {
println!("Period: {:.1}s (confidence: {:.2})", period, confidence);
}
use sevensense_analysis::{DielAnalyzer, TimeOfDay};
let analyzer = DielAnalyzer::new();
// Analyze activity by time of day
let pattern = analyzer.analyze(×tamps)?;
println!("Dawn chorus: {} calls", pattern.count(TimeOfDay::Dawn));
println!("Morning: {} calls", pattern.count(TimeOfDay::Morning));
println!("Midday: {} calls", pattern.count(TimeOfDay::Midday));
println!("Evening: {} calls", pattern.count(TimeOfDay::Evening));
println!("Night: {} calls", pattern.count(TimeOfDay::Night));
// Peak activity time
let peak = pattern.peak_hour();
println!("Peak activity: {:02}:00", peak);
use sevensense_analysis::{SeasonalAnalyzer, Season};
let analyzer = SeasonalAnalyzer::new();
let trend = analyzer.analyze(&dated_records)?;
println!("Spring activity: {:.1}%", trend.percentage(Season::Spring));
println!("Breeding season peak: {:?}", trend.breeding_peak());
println!("Migration periods: {:?}", trend.migration_windows());
use sevensense_analysis::{TimeSeries, Aggregation};
let series = TimeSeries::from_events(&events)?;
// Aggregate by hour
let hourly = series.aggregate(Aggregation::Hourly)?;
// Detect anomalies
let anomalies = series.detect_anomalies(3.0)?; // 3-sigma threshold
for anomaly in anomalies {
println!("Anomaly at {}: {} calls (expected: {})",
anomaly.timestamp, anomaly.actual, anomaly.expected);
}
| Parameter | Default | Description |
|---|---|---|
min_cluster_size |
5 | Minimum cluster size |
min_samples |
3 | Core point threshold |
epsilon |
0.0 | Distance threshold (0=auto) |
metric |
Euclidean | Distance metric |
| Parameter | Default | Description |
|---|---|---|
order |
1 | Markov chain order |
smoothing |
0.01 | Laplace smoothing factor |
| Algorithm | Complexity | Use Case |
|---|---|---|
| HDBSCAN | O(n log n) | Clustering with noise |
| Markov Chain | O(n × s²) | Sequence modeling |
| Motif Discovery | O(n² × m) | Pattern finding |
| FFT Periodicity | O(n log n) | Rhythm detection |
MIT License - see LICENSE for details.
Part of the 7sense Bioacoustic Intelligence Platform by rUv