rustreg

Crates.io: rustreg
lib.rs: rustreg
version: 0.1.1
created_at: 2025-06-09 15:21:14.863138+00
updated_at: 2025-06-09 15:39:54.14157+00
description: Simple Rust library for data preprocessing (Normalization and Standardization). The goal for this crate is to become a Rust Machine Learning 101.
id: 1706042
size: 15,656
PromCompute (promcompute)

📊 rustreg

A lightweight Rust library providing basic statistical operations and CSV parsing. The main goal of the crate is to provide a simple and fast way to perform data preprocessing (standardization and normalization).

📘 API Documentation

struct DataSet

/// A collection of numeric (f64) data, with methods for
/// basic statistical analysis.
pub struct DataSet {
    data: Vec<f64>,
}

impl DataSet

/// Creates a new DataSet from a Vec<f64>.
pub fn new(x: Vec<f64>) -> DataSet

/// Returns the number of elements as f64.
pub fn len(&self) -> f64

/// Computes the mean (average) of the data.
pub fn mean(&self) -> f64

/// Computes the population standard deviation.
pub fn standard_deviation(&self) -> f64

/// Normalizes the data in-place to the range [0, 1] using min-max scaling.
pub fn normalized(&mut self)

/// Standardizes the data in-place to Z-scores (mean=0, std=1).
pub fn standardized(&mut self)

/// Prints each data point on a new line.
pub fn print(&self)
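
The two in-place transforms documented above can be sketched independently of the crate. This is a minimal illustration, not rustreg's actual implementation: the free functions `min_max_normalize` and `z_score_standardize` are hypothetical names, and the sketch assumes a non-empty slice with at least two distinct values (otherwise the range or the standard deviation is zero and the division yields NaN).

```rust
/// Rescales values in place to [0, 1]: x' = (x - min) / (max - min).
/// Hypothetical helper, not part of rustreg.
fn min_max_normalize(data: &mut [f64]) {
    let min = data.iter().copied().fold(f64::INFINITY, f64::min);
    let max = data.iter().copied().fold(f64::NEG_INFINITY, f64::max);
    let range = max - min;
    for x in data.iter_mut() {
        *x = (*x - min) / range;
    }
}

/// Rescales values in place to Z-scores: x' = (x - mean) / std,
/// using the population standard deviation (divide by n).
/// Hypothetical helper, not part of rustreg.
fn z_score_standardize(data: &mut [f64]) {
    let n = data.len() as f64;
    let mean = data.iter().sum::<f64>() / n;
    let variance = data.iter().map(|x| (x - mean).powi(2)).sum::<f64>() / n;
    let std = variance.sqrt();
    for x in data.iter_mut() {
        *x = (*x - mean) / std;
    }
}

fn main() {
    // Age column from the sample data.csv.
    let mut ages = vec![19.0, 35.0, 26.0, 27.0, 19.0, 27.0];
    min_max_normalize(&mut ages);
    println!("{:?}", ages); // prints [0.0, 1.0, 0.4375, 0.5, 0.0, 0.5]

    let mut scores = vec![1.0, 2.0, 3.0];
    z_score_standardize(&mut scores);
    println!("{:?}", scores); // mean is now 0, population std is 1
}
```

Both functions overwrite their input, mirroring the in-place behavior of `normalized()` and `standardized()`; clone the data first if the original values are still needed.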

📥 CSV Loading Functions

/// Reads all rows from a CSV file at `path`.
/// Returns `Ok(Vec<Vec<String>>)` or an error.
pub fn read_csv_rows(path: &str) -> Result<Vec<Vec<String>>, Box<dyn Error>>

/// Extracts a numeric column (as Vec<f64>) from CSV rows by `index`.
/// Panics if non-numeric values are encountered.
pub fn red_csv_cols(rows: &[Vec<String>], index: usize) -> Result<Vec<f64>, Box<dyn Error>>
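
For untrusted input, a column can be extracted without panicking. The sketch below uses only the standard library; `parse_column` is a hypothetical helper, not a rustreg function, and it assumes rows have the same `Vec<Vec<String>>` shape that `read_csv_rows` returns.

```rust
use std::error::Error;

/// Hypothetical helper (not part of rustreg): parses column `index` of each
/// row into f64, returning an error instead of panicking on bad input.
fn parse_column(rows: &[Vec<String>], index: usize) -> Result<Vec<f64>, Box<dyn Error>> {
    rows.iter()
        .enumerate()
        .map(|(i, row)| -> Result<f64, Box<dyn Error>> {
            // A short row is reported as an error rather than indexed blindly.
            let cell = row
                .get(index)
                .ok_or_else(|| format!("row {}: no column {}", i, index))?;
            // A non-numeric cell (for example a header) becomes an Err value.
            let value = cell
                .trim()
                .parse::<f64>()
                .map_err(|e| format!("row {}: cannot parse {:?}: {}", i, cell, e))?;
            Ok(value)
        })
        .collect()
}

fn main() {
    let rows = vec![
        vec!["19".to_string(), "19000".to_string()],
        vec!["abc".to_string(), "20000".to_string()],
    ];
    match parse_column(&rows, 0) {
        Ok(col) => println!("{:?}", col),
        Err(e) => eprintln!("bad input: {}", e),
    }
}
```

Collecting an iterator of `Result` values into `Result<Vec<f64>, _>` stops at the first failing row, so the caller sees either the full column or the first error.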

🔧 Example Usage (in main.rs)

use rustreg::{read_csv_rows, red_csv_cols, DataSet};

fn main() {
    println!("Hello, world! 🚀😁🎉\n");

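    let rows: Vec<Vec<String>> = read_csv_rows("src/data.csv").unwrap();

    // read_csv_rows retains the header row (see Notes), so skip it before
    // parsing numeric columns. Assumes the file has at least a header row.
    let data_rows = &rows[1..];

    let col1: Vec<f64> = red_csv_cols(data_rows, 0).unwrap();
    let _col2: Vec<f64> = red_csv_cols(data_rows, 1).unwrap();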

    let mut training_data: DataSet = DataSet::new(col1);

    println!("Data Length: {}", training_data.len());
    println!("Mean: {}", training_data.mean());
    println!("Standard deviation: {}", training_data.standard_deviation());

    training_data.standardized();
    println!("\nStandardized:\n");
    training_data.print();
}

📄 data.csv Example

Age,EstimatedSalary,Purchased
19,19000,0
35,20000,0
26,43000,0
27,57000,0
19,76000,0
27,58000,0
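
As a sanity check on the sample data, the mean and population standard deviation of the Age column can be computed by hand. The sketch below is standalone and does not call rustreg; the helper names `mean` and `population_std` are illustrative, and the values are hard-coded from the rows above.

```rust
/// Arithmetic mean of a non-empty slice.
fn mean(data: &[f64]) -> f64 {
    data.iter().sum::<f64>() / data.len() as f64
}

/// Population standard deviation (divides by n, not n - 1).
fn population_std(data: &[f64]) -> f64 {
    let m = mean(data);
    let variance = data.iter().map(|x| (x - m).powi(2)).sum::<f64>() / data.len() as f64;
    variance.sqrt()
}

fn main() {
    // Age column of the sample data.csv, header excluded.
    let ages = [19.0, 35.0, 26.0, 27.0, 19.0, 27.0];
    println!("mean = {}", mean(&ages)); // prints mean = 25.5
    println!("std  = {:.4}", population_std(&ages)); // prints std  = 5.4696
}
```

The sum of the ages is 153, so the mean is 153 / 6 = 25.5; the squared deviations sum to 179.5, giving a population variance of about 29.9167.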

📝 Notes

  • read_csv_rows retains headers; skip the first row if needed.
  • red_csv_cols calls unwrap() while parsing, so a non-numeric cell causes a panic; use proper error handling for untrusted CSV inputs.
  • normalized() and standardized() mutate the dataset in-place.
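
Since the header row is retained, one way to handle it is to split it off before parsing and use it to look up a column by name. The following is a sketch with hypothetical helpers (`split_header`, `column_index`) that are not part of rustreg; it builds the rows in memory instead of reading a file so that it runs on its own.

```rust
/// Hypothetical helper: splits parsed CSV rows into (header, data rows).
/// Returns None when there are no rows at all.
fn split_header(rows: &[Vec<String>]) -> Option<(&Vec<String>, &[Vec<String>])> {
    rows.split_first()
}

/// Hypothetical helper: finds the position of a column by its header name.
fn column_index(header: &[String], name: &str) -> Option<usize> {
    header.iter().position(|h| h.as_str() == name)
}

fn main() {
    // Same shape as the rows described for read_csv_rows: Vec<Vec<String>>.
    let rows: Vec<Vec<String>> = ["Age,EstimatedSalary,Purchased", "19,19000,0", "35,20000,0"]
        .iter()
        .map(|line| line.split(',').map(String::from).collect())
        .collect();

    if let Some((header, data)) = split_header(&rows) {
        println!("columns: {:?}", header);
        println!("data rows: {}", data.len()); // prints data rows: 2
        println!("salary index: {:?}", column_index(header, "EstimatedSalary"));
    }
}
```

The `data` slice borrows from `rows` without copying, so it can be passed on to any function that accepts `&[Vec<String>]`.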
