sklears-clustering

version: 0.1.0-beta.1
created_at: 2025-10-13 12:53:25.09659+00
updated_at: 2026-01-01 21:33:01.627802+00
description: Clustering algorithms for sklears: K-means, DBSCAN, hierarchical clustering
homepage: https://github.com/cool-japan/sklears
repository: https://github.com/cool-japan/sklears
id: 1880487
size: 1,585,637
KitaSan (cool-japan)



Clustering algorithms for the sklears machine learning library.

Latest release: 0.1.0-beta.1 (January 1, 2026). See the workspace release notes for highlights and upgrade guidance.

Overview

This crate provides implementations of clustering algorithms including:

  • K-Means: Classic centroid-based clustering with k-means++ initialization
  • Mini-Batch K-Means: Scalable variant for large datasets
  • DBSCAN: Density-based clustering for arbitrarily shaped clusters
  • Hierarchical Clustering: Agglomerative clustering with various linkage methods
  • Mean Shift: Mode-seeking clustering algorithm

Usage

[dependencies]
sklears = { version = "0.1.0-beta.1", features = ["clustering"] }

Examples

K-Means Clustering

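use sklears::cluster::{InitMethod, KMeans};

// `data` and `new_data` are feature matrices prepared by the caller.
// `?` requires an enclosing function that returns a `Result`.
let model = KMeans::new(3)                    // number of clusters
    .init_method(InitMethod::KMeansPlusPlus)  // k-means++ seeding
    .max_iter(300)                            // iteration cap
    .n_init(10)                               // number of initializations
    .random_state(42);                        // fixed seed for reproducibility

let fitted = model.fit(&data)?;
let labels = fitted.predict(&new_data)?;
let centers = fitted.cluster_centers();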
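
For readers who want to see what the fit step does conceptually, the following is a minimal sketch of Lloyd's algorithm (the classic K-Means iteration) in plain Rust with only the standard library. It is not the sklears implementation: the names `sq_dist`, `nearest`, and `lloyd` are hypothetical, the initial centers are passed in by the caller instead of being chosen by k-means++, and `centers` is assumed to be non-empty.

```rust
/// Squared Euclidean distance between two points of equal dimension.
fn sq_dist(a: &[f64], b: &[f64]) -> f64 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Index of the center closest to `point` (first one wins on ties).
fn nearest(point: &[f64], centers: &[Vec<f64>]) -> usize {
    let mut best = 0;
    let mut best_d = f64::INFINITY;
    for (i, c) in centers.iter().enumerate() {
        let d = sq_dist(point, c);
        if d < best_d {
            best_d = d;
            best = i;
        }
    }
    best
}

/// Hypothetical helper: runs Lloyd's algorithm from caller-supplied centers.
/// Returns the final centers and the label assigned to every point.
fn lloyd(
    data: &[Vec<f64>],
    mut centers: Vec<Vec<f64>>,
    max_iter: usize,
) -> (Vec<Vec<f64>>, Vec<usize>) {
    let dim = centers[0].len();
    let mut labels: Vec<usize> = Vec::new();
    for _ in 0..max_iter {
        // Assignment step: attach each point to its closest center.
        let new_labels: Vec<usize> = data.iter().map(|p| nearest(p, &centers)).collect();

        // Update step: move each center to the mean of its points.
        let mut sums = vec![vec![0.0; dim]; centers.len()];
        let mut counts = vec![0usize; centers.len()];
        for (p, &l) in data.iter().zip(&new_labels) {
            counts[l] += 1;
            for (s, x) in sums[l].iter_mut().zip(p) {
                *s += x;
            }
        }
        for (k, c) in centers.iter_mut().enumerate() {
            // An empty cluster keeps its previous center.
            if counts[k] > 0 {
                for (cv, s) in c.iter_mut().zip(&sums[k]) {
                    *cv = s / counts[k] as f64;
                }
            }
        }

        // Stop once the assignment no longer changes.
        let converged = new_labels == labels;
        labels = new_labels;
        if converged {
            break;
        }
    }
    (centers, labels)
}
```

Calling `lloyd(&data, initial_centers, 300)` on a `Vec<Vec<f64>>` returns the converged centers together with one label per input row.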

DBSCAN

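use sklears::cluster::DBSCAN;
// `Distance` must also be in scope; its import path is not shown here.

let model = DBSCAN::new()
    .eps(0.5)          // neighborhood radius
    .min_samples(5)    // neighbors required for a core point
    .metric(Distance::Euclidean);

let labels = model.fit_predict(&data)?;
// A label of -1 marks a noise point.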
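
To illustrate how `eps`, `min_samples`, and the -1 noise label interact, here is a minimal DBSCAN sketch in plain Rust. It is an assumption-laden illustration, not the crate's code: neighbors are found by brute force instead of a K-D tree, the distance is fixed to Euclidean, a point counts as its own neighbor, and the names `euclidean`, `region_query`, and `dbscan` are hypothetical.

```rust
const UNVISITED: i32 = -2;
const NOISE: i32 = -1;

/// Euclidean distance between two points of equal dimension.
fn euclidean(a: &[f64], b: &[f64]) -> f64 {
    a.iter()
        .zip(b)
        .map(|(x, y)| (x - y) * (x - y))
        .sum::<f64>()
        .sqrt()
}

/// Indices of all points within `eps` of point `i` (including `i` itself).
fn region_query(data: &[Vec<f64>], i: usize, eps: f64) -> Vec<usize> {
    (0..data.len())
        .filter(|&j| euclidean(&data[i], &data[j]) <= eps)
        .collect()
}

/// Hypothetical helper: returns one label per point, with -1 for noise.
fn dbscan(data: &[Vec<f64>], eps: f64, min_samples: usize) -> Vec<i32> {
    let mut labels = vec![UNVISITED; data.len()];
    let mut cluster = 0;
    for i in 0..data.len() {
        if labels[i] != UNVISITED {
            continue;
        }
        let neighbors = region_query(data, i, eps);
        if neighbors.len() < min_samples {
            // Not a core point; may later be claimed as a border point.
            labels[i] = NOISE;
            continue;
        }
        // `i` is a core point: start a new cluster and grow it.
        labels[i] = cluster;
        let mut queue = neighbors;
        while let Some(j) = queue.pop() {
            if labels[j] == NOISE {
                // Previously marked noise, now reachable: border point.
                labels[j] = cluster;
            }
            if labels[j] != UNVISITED {
                continue;
            }
            labels[j] = cluster;
            let reachable = region_query(data, j, eps);
            if reachable.len() >= min_samples {
                // `j` is also a core point, so its neighbors join the cluster.
                queue.extend(reachable);
            }
        }
        cluster += 1;
    }
    labels
}
```

The brute-force `region_query` makes this sketch quadratic in the number of points, which is the cost a spatial index such as a K-D tree is meant to reduce.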

Performance Features

  • SIMD Distance Calculations: Vectorized distance computations
  • Parallel Assignment: Multi-threaded cluster assignment
  • Efficient K-D Trees: For neighbor searches in DBSCAN
  • Memory-Efficient Updates: In-place operations where possible
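
As a rough illustration of the parallel assignment idea, the sketch below splits the data into chunks and labels each chunk on its own thread using `std::thread::scope`. This is only one possible approach under stated assumptions: how sklears parallelizes internally is not described here, and the names `squared_distance`, `closest_center`, and `assign_parallel` are hypothetical.

```rust
use std::thread;

/// Squared Euclidean distance between two points of equal dimension.
fn squared_distance(a: &[f64], b: &[f64]) -> f64 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Index of the center closest to `point` (0 if `centers` is empty).
fn closest_center(point: &[f64], centers: &[Vec<f64>]) -> usize {
    centers
        .iter()
        .enumerate()
        .map(|(i, c)| (i, squared_distance(point, c)))
        .min_by(|a, b| a.1.total_cmp(&b.1))
        .map(|(i, _)| i)
        .unwrap_or(0)
}

/// Hypothetical helper: assigns every point to its closest center,
/// processing disjoint chunks of the data on separate threads.
fn assign_parallel(data: &[Vec<f64>], centers: &[Vec<f64>], n_threads: usize) -> Vec<usize> {
    let mut labels = vec![0usize; data.len()];
    if data.is_empty() {
        return labels;
    }
    let n = n_threads.max(1);
    // Ceiling division so that at most `n` chunks are produced.
    let chunk = (data.len() + n - 1) / n;
    thread::scope(|s| {
        // Each thread gets a read-only slice of points and a disjoint
        // mutable slice of the output, so no locking is needed.
        for (points, out) in data.chunks(chunk).zip(labels.chunks_mut(chunk)) {
            s.spawn(move || {
                for (p, l) in points.iter().zip(out.iter_mut()) {
                    *l = closest_center(p, centers);
                }
            });
        }
    });
    labels
}
```

Because the output slices are disjoint, the result is identical to a sequential assignment regardless of how many threads are used.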

GPU-Accelerated Distances

Enable the optional gpu feature to experiment with WebGPU-powered distance kernels. GPU-backed tests are ignored by default because device discovery can be slow; run them explicitly when a compatible GPU is available:

cargo test -p sklears-clustering --features gpu -- --ignored gpu_distances::gpu::tests::test_gpu_distance_computation

Metrics

The crate includes clustering metrics:

  • Silhouette Score
  • Calinski-Harabasz Index
  • Davies-Bouldin Index
  • Inertia
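
To make two of these metrics concrete, the sketch below computes inertia (the sum of squared distances from each point to its assigned center) and the mean silhouette score in plain Rust. It uses the textbook definitions rather than the crate's API: the names `pair_distance`, `inertia`, and `silhouette_score` are hypothetical, the distance is assumed to be Euclidean, and the silhouette function assumes at least two non-empty clusters.

```rust
/// Euclidean distance between two points of equal dimension.
fn pair_distance(a: &[f64], b: &[f64]) -> f64 {
    a.iter()
        .zip(b)
        .map(|(x, y)| (x - y) * (x - y))
        .sum::<f64>()
        .sqrt()
}

/// Sum of squared distances from each point to its assigned center.
fn inertia(data: &[Vec<f64>], labels: &[usize], centers: &[Vec<f64>]) -> f64 {
    data.iter()
        .zip(labels)
        .map(|(p, &l)| {
            p.iter()
                .zip(&centers[l])
                .map(|(x, c)| (x - c) * (x - c))
                .sum::<f64>()
        })
        .sum()
}

/// Mean silhouette coefficient over all points.
/// Assumes labels lie in `0..n_clusters` and at least two clusters are non-empty.
fn silhouette_score(data: &[Vec<f64>], labels: &[usize], n_clusters: usize) -> f64 {
    let n = data.len();
    let mut counts = vec![0usize; n_clusters];
    for &l in labels {
        counts[l] += 1;
    }
    let mut total = 0.0;
    for i in 0..n {
        let own = labels[i];
        // By convention a point in a singleton cluster scores 0.
        if counts[own] <= 1 {
            continue;
        }
        // Sum of distances from point i to the members of each cluster.
        let mut sums = vec![0.0; n_clusters];
        for j in 0..n {
            if i != j {
                sums[labels[j]] += pair_distance(&data[i], &data[j]);
            }
        }
        // a: mean distance to the other points in the same cluster.
        let a = sums[own] / (counts[own] - 1) as f64;
        // b: smallest mean distance to the points of any other cluster.
        let b = (0..n_clusters)
            .filter(|&k| k != own && counts[k] > 0)
            .map(|k| sums[k] / counts[k] as f64)
            .fold(f64::INFINITY, f64::min);
        total += (b - a) / a.max(b);
    }
    total / n as f64
}
```

Lower inertia means tighter clusters, while a silhouette score close to 1 means points sit much nearer to their own cluster than to the next closest one.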

License

Licensed under either of Apache License, Version 2.0 or MIT license at your option.
