| Crates.io | kmeans_uni |
| lib.rs | kmeans_uni |
| version | 0.1.0 |
| created_at | 2025-12-08 08:28:06.010678+00 |
| updated_at | 2025-12-08 08:28:06.010678+00 |
| description | Fast, safe K-Means++ with SIMD acceleration, mini-batch training and WASM support. |
| homepage | https://github.com/Deniskore |
| repository | https://github.com/Deniskore/kmeans_uni |
| id | 1972953 |
| size | 195,154 |
Fast, safe K-Means++ for CPU-only workloads with optional SIMD acceleration. Supports Euclidean distance and dot-product scoring, provides both classic Lloyd iterations and a mini-batch variant, and includes parity tests against linfa-clustering to guard correctness. Benchmarks show significantly faster training and prediction than linfa on the same CPU. The crate builds on stable Rust.
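The two scoring modes mentioned above can pick different centroids for the same point. The following is a minimal std-only sketch of that difference, not the crate's implementation; the function names `squared_euclidean`, `dot`, `nearest_by_euclidean` and `best_by_dot` are hypothetical and exist only for illustration.

```rust
/// Squared Euclidean distance between two equal-length vectors (lower is closer).
fn squared_euclidean(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Dot product of two equal-length vectors (higher is more similar).
fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Index of the centroid with the smallest squared Euclidean distance to `point`.
fn nearest_by_euclidean(point: &[f32], centroids: &[Vec<f32>]) -> usize {
    let mut best = 0;
    let mut best_distance = f32::INFINITY;
    for (i, c) in centroids.iter().enumerate() {
        let d = squared_euclidean(point, c);
        if d < best_distance {
            best = i;
            best_distance = d;
        }
    }
    best
}

/// Index of the centroid with the largest dot product with `point`.
fn best_by_dot(point: &[f32], centroids: &[Vec<f32>]) -> usize {
    let mut best = 0;
    let mut best_score = f32::NEG_INFINITY;
    for (i, c) in centroids.iter().enumerate() {
        let s = dot(point, c);
        if s > best_score {
            best = i;
            best_score = s;
        }
    }
    best
}
```

For the point `[1.0, 0.0]` and the centroids `[0.9, 0.1]` and `[3.0, 0.0]`, the Euclidean rule selects the first centroid while the dot-product rule selects the second, because an unnormalized dot product rewards vector magnitude. Normalizing inputs first is a common way to make dot-product scoring behave like cosine similarity.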
- Safe Rust (`#![forbid(unsafe_code)]`) with a small dependency set.
- Faster than `linfa-clustering` in AArch64/x86_64 benches for training and prediction.
- Optional SIMD (`wide` feature) and WebAssembly support (see WASM.md).

Basic usage:

```rust
use kmeans_uni::KMeansBuilder;

const N_COLS: usize = 2;

// Flat, row-major buffer: four points with N_COLS values each.
let data: Vec<f32> = vec![
    1.0, 1.0,
    1.2, 0.9,
    -1.0, -1.1,
    -1.2, -0.8,
];

match KMeansBuilder::new(2)
    .iterations(100)
    .cpu_simd() // requires default "wide" feature
    .euclidean()
    .build()
    .fit(&data, N_COLS)
{
    Ok(model) => match model.predict(&data) {
        Ok(labels) => println!("labels: {labels:?}"),
        Err(err) => eprintln!("prediction failed: {err}"),
    },
    Err(err) => eprintln!("training failed: {err}"),
}
```
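To make the flat, row-major layout and the classic Lloyd iteration concrete, here is a std-only sketch of one assign-then-update pass with Euclidean distance. It illustrates the general algorithm and is not the crate's code; `assign` and `update` are hypothetical helper names, and leaving empty clusters unchanged is an assumption of this sketch.

```rust
/// Assign each row of `data` (flat, row-major, `n_cols` values per point)
/// to its nearest centroid by squared Euclidean distance.
fn assign(data: &[f32], centroids: &[f32], n_cols: usize) -> Vec<usize> {
    data.chunks_exact(n_cols)
        .map(|p| {
            let mut best = 0;
            let mut best_distance = f32::INFINITY;
            for (k, c) in centroids.chunks_exact(n_cols).enumerate() {
                let d: f32 = p.iter().zip(c).map(|(x, y)| (x - y) * (x - y)).sum();
                if d < best_distance {
                    best = k;
                    best_distance = d;
                }
            }
            best
        })
        .collect()
}

/// Recompute each centroid as the mean of the points assigned to it.
/// Centroids with no assigned points are left unchanged (sketch assumption).
fn update(data: &[f32], labels: &[usize], centroids: &mut [f32], n_cols: usize) {
    let k = centroids.len() / n_cols;
    let mut sums = vec![0.0f32; k * n_cols];
    let mut counts = vec![0usize; k];
    for (p, &label) in data.chunks_exact(n_cols).zip(labels) {
        counts[label] += 1;
        let row = &mut sums[label * n_cols..(label + 1) * n_cols];
        for (s, x) in row.iter_mut().zip(p) {
            *s += x;
        }
    }
    for c in 0..k {
        if counts[c] > 0 {
            for j in 0..n_cols {
                centroids[c * n_cols + j] = sums[c * n_cols + j] / counts[c] as f32;
            }
        }
    }
}
```

Starting from the first and third points of the example data as initial centroids, one pass assigns the first two points to cluster 0 and the last two to cluster 1, then moves the centroids to roughly `(1.1, 0.95)` and `(-1.1, -0.95)`. Repeating the two steps until the labels stop changing gives the full Lloyd loop.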
Mini-batch training for large datasets:
```rust
use kmeans_uni::KMeansBuilder;

// `data` and `N_COLS` are the buffer and column count from the previous example.
match KMeansBuilder::new(8)
    .iterations(50) // iterations = number of batches
    .cpu_scalar()
    .euclidean()
    .mini_batch_rel_tolerance(0.0)
    .mini_batch_patience(0)
    .build()
    .fit_mini_batch_from_source(
        &kmeans_uni::SlicePointSource::new(&data, N_COLS).unwrap(),
        256, // batch size
    )
{
    Ok(model) => println!("centroids: {:?}", model.centroids),
    Err(err) => eprintln!("mini-batch training failed: {err}"),
}
```
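For intuition about what a single mini-batch iteration does, the sketch below follows a common textbook formulation: each centroid keeps a running count and moves toward every assigned point with a learning rate of `1 / count`. This is an assumption about the general technique, not a description of the crate's update rule or of how its tolerance and patience settings work; `nearest` and `mini_batch_step` are hypothetical names.

```rust
/// Index of the nearest centroid (squared Euclidean distance) in a flat,
/// row-major centroid buffer with `n_cols` values per centroid.
fn nearest(point: &[f32], centroids: &[f32], n_cols: usize) -> usize {
    let mut best = 0;
    let mut best_distance = f32::INFINITY;
    for (k, c) in centroids.chunks_exact(n_cols).enumerate() {
        let d: f32 = point.iter().zip(c).map(|(x, y)| (x - y) * (x - y)).sum();
        if d < best_distance {
            best = k;
            best_distance = d;
        }
    }
    best
}

/// Apply one mini-batch update in place.
/// `counts[k]` is the number of points centroid `k` has absorbed so far.
fn mini_batch_step(batch: &[f32], centroids: &mut [f32], counts: &mut [u64], n_cols: usize) {
    // Assign the whole batch against the centroids as they are before the update.
    let frozen: &[f32] = &*centroids;
    let labels: Vec<usize> = batch
        .chunks_exact(n_cols)
        .map(|p| nearest(p, frozen, n_cols))
        .collect();

    // Move each assigned centroid toward its point with step size 1 / count.
    for (p, &k) in batch.chunks_exact(n_cols).zip(&labels) {
        counts[k] += 1;
        let eta = 1.0 / counts[k] as f32;
        let c = &mut centroids[k * n_cols..(k + 1) * n_cols];
        for (cj, xj) in c.iter_mut().zip(p) {
            *cj = (1.0 - eta) * *cj + eta * xj;
        }
    }
}
```

With centroids `(0, 0)` and `(10, 10)`, zero counts, and a batch of `(1, 1)`, `(3, 3)`, `(9, 9)`, one step moves the first centroid to `(2, 2)` and the second to `(9, 9)`. Because the step size shrinks as counts grow, later batches perturb well-established centroids less.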
- `wide` (default): enables the SIMD CPU backend for K-Means (f32 and f64) via the `wide` crate. Once `std::simd` stabilizes, you can expect to squeeze more performance from portable SIMD without depending on `wide`.
- `serde`: derives `Serialize`/`Deserialize` on public types.
- `wasm`: builds with a wasm-friendly configuration (sequential execution, no Rayon). See WASM.md for a browser demo and build steps.
- Disable default features (`--no-default-features`) to force scalar-only code paths.

Licensed under either of:
at your option.