Crates.io | kmeans_smid |
lib.rs | kmeans_smid |
version | 0.3.0 |
source | src |
created_at | 2024-06-13 03:20:34.640629 |
updated_at | 2024-06-13 08:44:34.749472 |
description | Small and fast library for k-means clustering calculations. Fixing smid from `kmeans-rs`. |
homepage | |
repository | https://github.com/wzh4464/kmean-smid-rs |
max_upload_size | |
id | 1270200 |
size | 106,731 |
kmeans_smid is a small and fast library for k-means clustering calculations. It fixes the SIMD problem in the kmeans crate. Here is a small example, using k-means++ as the initialization method and Lloyd's algorithm as the k-means variant:
use kmeans_smid::*;

fn main() {
    let (sample_cnt, sample_dims, k, max_iter) = (20000, 200, 4, 100);

    // Generate some random data (requires the rand crate as a dependency)
    let mut samples = vec![0.0f64; sample_cnt * sample_dims];
    samples.iter_mut().for_each(|v| *v = rand::random());

    // Calculate kmeans, using k-means++ as the initialization method.
    // Generic arguments in expression position need the turbofish (`::<...>`).
    let kmean = KMeans::<f64, 8>::new(samples, sample_cnt, sample_dims);
    let result = kmean.kmeans_lloyd(k, max_iter, KMeans::init_kmeanplusplus, &KMeansConfig::default());

    println!("Centroids: {:?}", result.centroids);
    println!("Cluster-Assignments: {:?}", result.assignments);
    println!("Error: {}", result.distsum);
}
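To make the "Lloyd" variant named above more concrete, here is a minimal scalar sketch of a single Lloyd iteration over a flat, row-major sample buffer. It is not this crate's implementation: the names `dist_sq` and `lloyd_step` are hypothetical, no SIMD is used, and treating the returned error as a sum of squared Euclidean distances is an assumption about what `distsum` means.

```rust
/// Squared Euclidean distance between two points of equal length.
fn dist_sq(a: &[f64], b: &[f64]) -> f64 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// One Lloyd iteration (hypothetical helper, not part of kmeans_smid).
/// Assigns every sample to its nearest centroid, then moves each centroid to
/// the mean of its assigned samples. Returns the assignments and the summed
/// squared distance measured during the assignment step.
fn lloyd_step(samples: &[f64], dims: usize, centroids: &mut [f64]) -> (Vec<usize>, f64) {
    let k = centroids.len() / dims;
    let mut assignments = Vec::with_capacity(samples.len() / dims);
    let mut distsum = 0.0f64;

    // Assignment step: find the nearest centroid for each sample.
    for s in samples.chunks_exact(dims) {
        let (best, best_dist) = centroids
            .chunks_exact(dims)
            .map(|c| dist_sq(s, c))
            .enumerate()
            .min_by(|a, b| a.1.total_cmp(&b.1))
            .expect("at least one centroid is required");
        assignments.push(best);
        distsum += best_dist;
    }

    // Update step: accumulate per-cluster sums and counts.
    let mut sums = vec![0.0f64; k * dims];
    let mut counts = vec![0usize; k];
    for (s, &c) in samples.chunks_exact(dims).zip(&assignments) {
        counts[c] += 1;
        for (acc, v) in sums[c * dims..(c + 1) * dims].iter_mut().zip(s) {
            *acc += *v;
        }
    }

    // Move each non-empty cluster's centroid to the mean of its samples.
    for c in 0..k {
        if counts[c] > 0 {
            for d in 0..dims {
                centroids[c * dims + d] = sums[c * dims + d] / counts[c] as f64;
            }
        }
    }
    (assignments, distsum)
}

fn main() {
    // Four 1-dimensional samples forming two obvious groups.
    let samples = vec![0.0, 1.0, 10.0, 11.0];
    let mut centroids = vec![0.0, 10.0];
    let (assignments, distsum) = lloyd_step(&samples, 1, &mut centroids);
    println!("assignments: {assignments:?}, distsum: {distsum}, centroids: {centroids:?}");
}
```

A full run would repeat this step until the assignments stop changing or `max_iter` is reached.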
For performance reasons, all calculations are done on bare vectors, using hand-written SIMD intrinsics from the packed_simd crate. All vectors are stored row-major, so each sample is stored in a consecutive block of memory.
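As an illustration of the row-major layout described above, the following sketch shows how a sample and a single coordinate can be located in a flat buffer. The helpers `sample` and `flat_index` are hypothetical names used only for this example; they are not part of the kmeans_smid API.

```rust
/// Returns the slice holding sample `i` in a row-major buffer
/// (hypothetical helper for illustration).
fn sample(data: &[f64], sample_dims: usize, i: usize) -> &[f64] {
    &data[i * sample_dims..(i + 1) * sample_dims]
}

/// Returns the flat index of dimension `d` of sample `i`.
fn flat_index(sample_dims: usize, i: usize, d: usize) -> usize {
    i * sample_dims + d
}

fn main() {
    let sample_dims = 3;
    // Two samples, (1, 2, 3) and (4, 5, 6), stored one after the other.
    let data = vec![1.0, 2.0, 3.0, 4.0, 5.0, 6.0];

    // The second sample occupies one contiguous block of the buffer.
    assert_eq!(sample(&data, sample_dims, 1), &[4.0f64, 5.0, 6.0][..]);
    // Dimension 2 of sample 1 lives at flat index 1 * 3 + 2 = 5.
    assert_eq!(data[flat_index(sample_dims, 1, 2)], 6.0);

    // chunks_exact walks the buffer one sample at a time.
    for (i, s) in data.chunks_exact(sample_dims).enumerate() {
        println!("sample {i}: {s:?}");
    }
}
```

Because each sample is contiguous, a distance computation can read it with sequential loads, which is the access pattern SIMD code benefits from.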