Crates.io | treerustler |
lib.rs | treerustler |
version | 0.1.0 |
source | src |
created_at | 2023-07-04 22:47:32.802723 |
updated_at | 2023-07-04 22:47:32.802723 |
description | A library that implements tree models |
id | 908369 |
size | 23,211 |
TreeRustler is a simple implementation of a Decision Tree Classifier using the Rust programming language. This project serves as a beginner's exploration of Rust while building a decision tree classifier. The main modules in this project are the data module and the tree module.
The data module provides the Data struct, which represents the expected data type for the tree implementation. The Data struct holds the training features required for building the decision tree classifier. This module handles only the necessary data operations for the Decision Tree (i.e. no loading, preprocessing, or splitting yet!).
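To make the role of the data module more concrete, here is a minimal sketch of how a feature container could parse the "1 3; 2 3" style string that appears in the usage example. The name FeatureMatrix, its fields, and the parse and get methods are assumptions made for illustration; they are not the crate's actual Data API, and the row and value separators are inferred from that example string.

```rust
/// Hypothetical stand-in for a feature container such as `Data`:
/// a dense, row-major matrix of f64 values. Not the crate's real type.
#[derive(Debug, Clone, PartialEq)]
pub struct FeatureMatrix {
    pub rows: usize,
    pub cols: usize,
    pub values: Vec<f64>,
}

impl FeatureMatrix {
    /// Parses text such as "1 3; 2 3": rows are separated by ';' and
    /// values within a row by whitespace (an assumed format).
    pub fn parse(text: &str) -> Result<Self, String> {
        let mut values = Vec::new();
        let mut rows = 0;
        let mut cols = 0;
        for row in text.split(';').map(str::trim).filter(|r| !r.is_empty()) {
            let parsed: Result<Vec<f64>, _> =
                row.split_whitespace().map(str::parse::<f64>).collect();
            let parsed = parsed.map_err(|e| format!("bad number in row {}: {}", rows, e))?;
            if rows == 0 {
                // The first row fixes the number of features.
                cols = parsed.len();
            } else if parsed.len() != cols {
                return Err(format!(
                    "row {} has {} values, expected {}",
                    rows,
                    parsed.len(),
                    cols
                ));
            }
            values.extend(parsed);
            rows += 1;
        }
        Ok(Self { rows, cols, values })
    }

    /// Returns the value at (row, col), or None when out of bounds.
    pub fn get(&self, row: usize, col: usize) -> Option<f64> {
        if row < self.rows && col < self.cols {
            Some(self.values[row * self.cols + col])
        } else {
            None
        }
    }
}
```

Calling FeatureMatrix::parse("1 3; 2 3; 3 1; 3 1; 2 3") under these assumptions yields a 5 x 2 matrix. Returning a Result instead of panicking is a design choice of this sketch, so that ragged rows or non-numeric values surface as errors.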
The tree module contains the DecisionTreeClassifier struct. This struct represents the decision tree classifier and provides the following parameters:
max_depth: Specifies the maximum depth of the decision tree. It controls how deep the tree can grow during training. A smaller value can help prevent overfitting, but a value that is too small may result in underfitting.
min_samples_split: Specifies the minimum number of samples required to split an internal node during training. It controls when node splitting stops. A higher value can help prevent overfitting, but a value that is too high may result in underfitting.

The DecisionTreeClassifier struct has the following methods:
fit(x: &Data, y: &Vec<u8>): Fits the decision tree classifier to the provided training data. This method trains the classifier using the features from the Data struct and the labels from a separate vector.
predict_proba(x: &Data) -> Vec<f64>: Predicts the class probabilities for the provided data using the trained decision tree. It returns a vector of probabilities for each class.

To use the TreeRustler project, follow these steps:
1. Clone the repository: git clone https://github.com/EduardoPach/treerustler.git
2. Change into the project directory: cd treerustler
3. Load your features into a Data struct and your labels into a Vec<u8>.
4. Create an instance of DecisionTreeClassifier from the tree module, specifying the desired max_depth and min_samples_split values.
5. Call the fit method on the classifier instance, passing your training data.
6. Use the predict_proba method to predict class probabilities for new data points.

use treerustler::data::Data;
use treerustler::tree::DecisionTreeClassifier;

fn main() {
    // Load your features and labels: five samples, one label per sample
    let x: Data = Data::from_string(&"1 3; 2 3; 3 1; 3 1; 2 3");
    let y: Vec<u8> = vec![0, 0, 1, 1, 2];

    // Example hyperparameter values; adjust them for your data
    let max_depth = 3;
    let min_samples_split = 2;

    // Create an instance of DecisionTreeClassifier
    let mut classifier = DecisionTreeClassifier::new(max_depth, min_samples_split);

    // Fit the classifier to the training data
    classifier.fit(&x, &y);

    // Predict class probabilities for new data points
    let probabilities = classifier.predict_proba(&x);

    // Process the predicted probabilities as needed
}
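
The effect of max_depth and min_samples_split, and the idea of a class probability, can be illustrated with a small standalone sketch. The functions below (may_split, leaf_probabilities, most_likely_class) are hypothetical helpers written for this explanation; they are not part of the treerustler API, and the crate's internal stopping rule and probability layout may differ. The sketch assumes a node is split only while its depth is below max_depth and it holds at least min_samples_split samples, and that a leaf reports the relative frequency of each class among its training labels.

```rust
/// Hypothetical stopping rule: a node may be split only while it sits above
/// the depth limit and still holds enough samples.
pub fn may_split(
    depth: usize,
    n_samples: usize,
    max_depth: usize,
    min_samples_split: usize,
) -> bool {
    depth < max_depth && n_samples >= min_samples_split
}

/// Relative frequency of each class among the labels that reached a leaf.
/// Labels outside 0..n_classes are ignored; an empty leaf yields all zeros.
pub fn leaf_probabilities(labels: &[u8], n_classes: usize) -> Vec<f64> {
    let mut counts = vec![0usize; n_classes];
    for &label in labels {
        if let Some(count) = counts.get_mut(label as usize) {
            *count += 1;
        }
    }
    let total: usize = counts.iter().sum();
    if total == 0 {
        return vec![0.0; n_classes];
    }
    counts.iter().map(|&c| c as f64 / total as f64).collect()
}

/// Index of the most probable class, or None for an empty slice.
pub fn most_likely_class(probabilities: &[f64]) -> Option<usize> {
    probabilities
        .iter()
        .enumerate()
        .max_by(|a, b| a.1.total_cmp(b.1))
        .map(|(index, _)| index)
}
```

Under these assumptions, a leaf holding the labels 0, 0, 1, 1 with three classes reports the probabilities 0.5, 0.5, and 0.0. Raising min_samples_split or lowering max_depth makes may_split return false sooner, which gives a shallower tree with larger, less pure leaves.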