Crates.io | mlinrust |
lib.rs | mlinrust |
version | 0.1.0 |
source | src |
created_at | 2022-12-31 12:14:45.472126 |
updated_at | 2022-12-31 12:14:45.472126 |
description | A machine learning library in Rust from scratch. |
homepage | https://github.com/Raibows/MLinRust |
repository | https://github.com/Raibows/MLinRust |
max_upload_size | |
id | 748396 |
size | 163,774 |
Learn the Rust programming language through implementing classic machine learning algorithms. This project is self-completed without relying on any third-party libraries, serving as a bootstrap machine learning library.
❗❗❗:Actively seeking code reviews and welcome suggestions on fixing bugs or code refactoring. Please feel free to share your ideas. Happy to accept advice!
broadcast
, matrix operations
, permute
and etc. in arbitrary dimension. SIMD is used in matrix multiplication thanks to auto vectorizing by Rust.normalize
, shuffle
and Dataloader
. Several popular dataset pre-processing recipes are available.gini
or entropy
are provided.Lasso
, Ridge
and L-inf
)linear(MLP)
and some activation
functions which could be freely stacked and optimized by gradient back propagations.KdTree
and vanilla BruteForceSearch
.Let's use KNN algorithm to solve a classification task. More examples can be found in examples
directory.
create some synthetic data for tests
use std::collections::HashMap;
let features = vec![
vec![0.6, 0.7, 0.8],
vec![0.7, 0.8, 0.9],
vec![0.1, 0.2, 0.3],
];
let labels = vec![0, 0, 1];
// so it is a binary classifiction task, 0 is for the large label, 1 is for the small label
let mut label_map = HashMap::new();
label_map.insert(0, "large".to_string());
label_map.insert(1, "small".to_string());
convert the data to the dataset
use mlinrust::dataset::Dataset;
let dataset = Dataset::new(features, labels, Some(label_map));
split the dataset into train
and valid
sets and normalize them by Standard normalization
let mut temp = dataset.split_dataset(vec![2.0, 1.0], 0); // [2.0, 1.0] is the split fraction, 0 is the seed
let (mut train_dataset, mut valid_dataset) = (temp.remove(0), temp.remove(0));
use mlinrust::dataset::utils::{normalize_dataset, ScalerType};
normalize_dataset(&mut train_dataset, ScalerType::Standard);
normalize_dataset(&mut valid_dataset, ScalerType::Standard);
build and train our KNN model using KdTree
use mlinrust::model::knn::{KNNAlg, KNNModel, KNNWeighting};
// KdTree is one implementation of KNN; 1 defines the k of neighbours; Weighting decides the way of ensemble prediction; train_dataset is for training KNN; Some(2) is the param of minkowski distance
let model = KNNModel::new(KNNAlg::KdTree, 1, Some(KNNWeighting::Distance), train_dataset, Some(2));
evaluate the model
use mlinrust::utils::evaluate;
let (correct, acc) = evaluate(&valid_dataset, &model);
println!("evaluate results\ncorrect {correct} / total {}, acc = {acc:.5}", test_dataset.len());
The rust community. I received many help from rust-lang Discord.
Under GPL-v3 license. And commercial use is strictly prohibited.