| Field | Value |
| --- | --- |
| Crates.io | light-svm |
| lib.rs | light-svm |
| version | 0.1.0 |
| created_at | 2025-11-12 17:27:46.135902+00 |
| updated_at | 2025-11-12 17:27:46.135902+00 |
| description | Lightweight, fast LinearSVC-style crate with Pegasos/DCD solvers, CSR input, OvR/OvO strategies, and optional Platt calibration. |
| homepage | |
| repository | |
| max_upload_size | |
| id | 1929679 |
| size | 10,740,518 |
A lightweight, LinearSVC-style crate for Rust:
- Losses: hinge / squared-hinge
- Strategies: `Binary`, `OneVsRest`, `OneVsOne`
- `Solver::Auto` heuristic

```rust
use light_svm::{CsrMatrix, LinearSVC, ClassStrategy, SvmParams, Loss, Solver, PlattCalibrator, DecisionScores};

// Two samples with two features each, converted from dense rows to CSR.
let x = CsrMatrix::from_dense(&vec![vec![2.0, 1.0], vec![-1.0, -2.0]], 0.0);
let y = vec![1, -1];

let params = SvmParams::builder()
    .c(1.0)
    .loss(Loss::Hinge)
    .fit_intercept(true)
    .tol(1e-3)
    .solver(Solver::Auto)
    .build();

let mut svc = LinearSVC::builder()
    .class_strategy(ClassStrategy::Binary)
    .params(params)
    .build();
svc.fit(&x, &y);

// Decision function
match svc.decision_function(&x) {
    DecisionScores::Binary { classes, scores } => {
        println!("classes={:?}, scores={:?}", classes, &scores[..]);
    }
    _ => unreachable!(),
}

// Calibrated probabilities (Binary)
let scores = match svc.decision_function(&x) {
    DecisionScores::Binary { scores, .. } => scores,
    _ => unreachable!(),
};
// Map labels to {0, 1}, treating the larger label as the positive class.
let pos = *y.iter().max().unwrap();
let y01: Vec<u8> = y.iter().map(|&yy| if yy == pos { 1 } else { 0 }).collect();
let calib = PlattCalibrator::fit(&scores, &y01);
svc.with_calibration(calib);
let proba = svc.predict_proba(&x); // Vec<[P(neg), P(pos)]>
```
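
The quick start builds its input with `CsrMatrix::from_dense(rows, 0.0)`. As a rough illustration of what a dense-to-CSR conversion involves, here is a standalone sketch that does not use the crate. The `Csr` struct, its field names, and the reading of the second argument as a magnitude threshold below which entries are dropped are all assumptions made for illustration, not the crate's actual layout or semantics.

```rust
/// Hypothetical CSR container (not light-svm's `CsrMatrix`).
#[derive(Debug, PartialEq)]
struct Csr {
    indptr: Vec<usize>,  // row i spans indptr[i]..indptr[i + 1]
    indices: Vec<usize>, // column index of each stored value
    data: Vec<f32>,      // stored values, row by row
    n_cols: usize,
}

/// Convert dense rows to CSR, keeping entries with |v| > threshold.
/// The threshold semantics are an assumption for this sketch.
fn csr_from_dense(rows: &[Vec<f32>], threshold: f32) -> Csr {
    let n_cols = rows.iter().map(|r| r.len()).max().unwrap_or(0);
    let mut csr = Csr { indptr: vec![0], indices: Vec::new(), data: Vec::new(), n_cols };
    for row in rows {
        for (j, &v) in row.iter().enumerate() {
            if v.abs() > threshold {
                csr.indices.push(j);
                csr.data.push(v);
            }
        }
        // Close the row: the next row starts where this one ended.
        csr.indptr.push(csr.data.len());
    }
    csr
}

/// Sparse dot product of row `i` with a dense weight vector.
fn row_dot(csr: &Csr, i: usize, w: &[f32]) -> f32 {
    assert_eq!(w.len(), csr.n_cols);
    (csr.indptr[i]..csr.indptr[i + 1])
        .map(|k| csr.data[k] * w[csr.indices[k]])
        .sum()
}
```

Only the stored entries take part in `row_dot`, which is why a linear model's margin `w·x_i` costs time proportional to the nonzeros of the row rather than to the number of features.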
```rust
// Multiclass variant using the one-vs-rest strategy.
let mut svc2 = LinearSVC::builder()
    .class_strategy(ClassStrategy::OneVsRest)
    .c(1.0).loss(Loss::Hinge).fit_intercept(true)
    .tol(1e-3).solver(Solver::Auto)
    .build();
```
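
With a multiclass strategy, the per-class (OvR) or per-pair (OvO) scores still have to be reduced to one label per row. The functions below sketch the usual reductions, argmax for OvR and majority voting for OvO, using the score layouts and sign convention this README describes for `DecisionScores`. `predict_ovr` and `predict_ovo` are hypothetical helpers written for this illustration, and the tie-breaking shown here is an arbitrary choice that may differ from the crate's.

```rust
use std::collections::HashMap;

/// OvR: pick the class with the largest score in each row.
/// `scores` is rows x classes, aligned to `classes`.
fn predict_ovr(classes: &[i32], scores: &[Vec<f32>]) -> Vec<i32> {
    scores
        .iter()
        .map(|row| {
            let mut best = 0;
            for (j, &s) in row.iter().enumerate() {
                if s > row[best] {
                    best = j;
                }
            }
            classes[best]
        })
        .collect()
}

/// OvO: each pair casts one vote; a positive score votes for the first
/// class of the pair, anything else for the second.
/// Ties in the vote count go to the smallest label (arbitrary choice).
fn predict_ovo(pairs: &[(i32, i32)], scores: &[Vec<f32>]) -> Vec<i32> {
    scores
        .iter()
        .map(|row| {
            let mut votes: HashMap<i32, usize> = HashMap::new();
            for (&(a, b), &s) in pairs.iter().zip(row) {
                *votes.entry(if s > 0.0 { a } else { b }).or_insert(0) += 1;
            }
            votes
                .into_iter()
                // Most votes wins; on equal votes the smaller label wins.
                .max_by(|x, y| x.1.cmp(&y.1).then(y.0.cmp(&x.0)))
                .map(|(class, _)| class)
                .expect("at least one pair is required")
        })
        .collect()
}
```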
The decision_function returns a strategy-aligned enumeration:
- `DecisionScores::Binary { classes: [neg, pos], scores: Vec<f32> }`
  - `scores[i]` is the raw margin `w·x_i + b`; positive => `pos`.
- `DecisionScores::OneVsRest { classes: Vec<i32>, scores: Vec<Vec<f32>> }`
  - `scores` is shaped rows × classes, aligned to `classes`.
- `DecisionScores::OneVsOne { pairs: Vec<(i32,i32)>, scores: Vec<Vec<f32>> }`
  - `scores` is rows × pairs; positive means vote for the first class in the pair.

## `predict_proba` (Binary)

- `LinearSVC::predict_proba(&self, x)` returns `Vec<[P(neg), P(pos)]>` for Binary models.
- It uses a `PlattCalibrator`, attached via `svc.with_calibration(calib)`.
- Alternatively, call `svc.predict_proba_with(x, &calib)` to apply a calibrator without storing it.

> [!NOTE]
> Multiclass probability calibration usually applies Platt or isotonic calibration per class and then normalizes (OvR), or combines pairwise probabilities through pairwise coupling (OvO). The crate keeps a light stub for Binary; multiclass calibration can be added later.

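
As background for what a binary calibrator computes: Platt scaling passes the raw margin `s` through a sigmoid, `P(pos | s) = 1 / (1 + exp(A·s + B))`, with `A` and `B` fitted on labeled scores. The sketch below shows only that mapping and returns pairs in the `[P(neg), P(pos)]` order used above. `platt_proba`, `calibrate`, and the parameters `a` and `b` are hypothetical names; the crate's `PlattCalibrator` may store or fit its parameters differently.

```rust
/// Platt-style sigmoid: P(pos | s) = 1 / (1 + exp(a * s + b)).
/// `a` and `b` are illustrative; a fitted calibrator would learn them.
/// Returns [P(neg), P(pos)].
fn platt_proba(score: f32, a: f32, b: f32) -> [f32; 2] {
    let p_pos = 1.0 / (1.0 + (a * score + b).exp());
    [1.0 - p_pos, p_pos]
}

/// Apply the same sigmoid to a slice of raw margins.
fn calibrate(scores: &[f32], a: f32, b: f32) -> Vec<[f32; 2]> {
    scores.iter().map(|&s| platt_proba(s, a, b)).collect()
}
```

With a negative `a`, larger margins map to larger `P(pos)`, and a margin of zero maps to 0.5 when `b` is zero.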
## Tolerance (`tol`): practical defaults

- `tol` controls when DCD stops via the projected-gradient (PG) gap: stop when `PGmax - PGmin ≤ tol`.
- `1e-2` for quick training / rough models.
- `1e-3` (default) for balanced speed/accuracy.
- `1e-4` for tighter convergence (slower).
- Pegasos does not use `tol`; use `max_epochs` to trade accuracy vs time.

## Class weights

- Per-class C: `.c_by_class(neg, pos)` or `.c_neg(v)`, `.c_pos(v)`.
- Class weights: `class_weight_*`.
- Effective C per class: `C_-1 = c * class_weight_neg`, `C_+1 = c * class_weight_pos`.
- DCD uses `C_i` exactly; Pegasos scales updates by `(C_i / c)`.

## Monitoring

- `.eval_every(k).verbose(true)` prints metrics every `k` passes:
  - `pgmax`, `pgmin`, `kkt = pgmax - pgmin`
- `LinearSvm` summary carries `kkt_history` and `gap_history`.

## Solver selection

- `Solver::Auto` picks DCD for sparse, in-memory problems with ≤ 200k features, ≤ 2e8 nonzeros, and density ≤ 1%; otherwise Pegasos.

## Extensibility

- Solvers: `fit_binary_*` in `solver.rs`, with routing via `Solver`.
- Kernels: a `Kernel` trait and a `KernelSvm` model; the `LinearSVC` API remains.

The repo keeps shared CSV fixtures under `tests/data/`.
If you run the Python sandboxes or the Rust examples with --write-data, they will (re)generate the
expected files in that directory, and the integration tests read from the same location. Make sure the
tests/data folder exists before running the examples:
```sh
python sandbox/data_flair.py --write-data      # writes tests/data/train.csv + tests/data/test.csv
python sandbox/iris_multiclass.py --write-data # writes tests/data/iris_train.csv + iris_test.csv
```
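
Separately from the fixtures, the DCD stopping rule stated in the tolerance notes (stop when `PGmax - PGmin ≤ tol`) can be sketched in a few lines. This follows the standard projected-gradient definition for dual coordinate descent with box constraints `0 ≤ α_i ≤ U_i`, where `U_i` would be the effective per-class C under hinge loss. The function names and the tuple layout are hypothetical, and the crate's internals may differ.

```rust
/// Projected gradient of one dual variable under 0 <= alpha <= upper.
/// At a bound, a gradient that would push alpha outside the box is
/// clipped to zero.
fn projected_gradient(grad: f32, alpha: f32, upper: f32) -> f32 {
    if alpha <= 0.0 {
        grad.min(0.0)
    } else if alpha >= upper {
        grad.max(0.0)
    } else {
        grad
    }
}

/// PG gap over one pass: largest minus smallest projected gradient.
/// Each item is (grad, alpha, upper) for one sample.
fn pg_gap(items: &[(f32, f32, f32)]) -> f32 {
    let mut pg_max = f32::NEG_INFINITY;
    let mut pg_min = f32::INFINITY;
    for &(g, a, u) in items {
        let pg = projected_gradient(g, a, u);
        pg_max = pg_max.max(pg);
        pg_min = pg_min.min(pg);
    }
    pg_max - pg_min
}

/// Stopping test: the pass has converged once the gap is within `tol`.
fn converged(items: &[(f32, f32, f32)], tol: f32) -> bool {
    pg_gap(items) <= tol
}
```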
Tip: Rayon-powered helpers are enabled by default (disable with
--no-default-features if desired); you can still add
RUSTFLAGS="-C target-feature=+avx2" to squeeze the most out of SIMD-heavy
sections on capable CPUs.
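
Also on the performance side, the `Solver::Auto` rule quoted earlier (DCD for sparse problems with ≤ 200k features, ≤ 2e8 nonzeros, and density ≤ 1%; Pegasos otherwise) can be written as a small predicate. This is a sketch of the stated thresholds only: `Choice`, `auto_pick`, and the density formula `nnz / (rows × features)` are assumptions, and the "in-memory" condition is not modeled.

```rust
/// Hypothetical stand-in for the solver picked by an Auto-style rule.
#[derive(Debug, PartialEq)]
enum Choice {
    Dcd,
    Pegasos,
}

/// Apply the thresholds stated in this README:
/// <= 200k features, <= 2e8 nonzeros, density <= 1% => DCD.
fn auto_pick(n_rows: usize, n_features: usize, nnz: usize) -> Choice {
    // Density is assumed to be nnz over the full rows x features grid.
    let cells = n_rows as f64 * n_features as f64;
    let density = if cells > 0.0 { nnz as f64 / cells } else { 0.0 };
    if n_features <= 200_000 && nnz <= 200_000_000 && density <= 0.01 {
        Choice::Dcd
    } else {
        Choice::Pegasos
    }
}
```

For example, 100,000 rows × 50,000 features with 10 million nonzeros has density 0.2% and falls on the DCD side, while a small dense problem at 50% density falls through to Pegasos.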