ngboost-rs

Crates.io: ngboost-rs
lib.rs: ngboost-rs
version: 1.1.1
created_at: 2026-01-15 23:39:31.012044+00
updated_at: 2026-01-19 10:07:11.942503+00
description: Natural Gradient Boosting for Probabilistic Prediction - A Rust implementation of NGBoost
repository: https://github.com/agene0001/ngboost-rs
id: 2047342
size: 648,242
owner: PeppaPig (agene0001)
documentation: https://docs.rs/ngboost-rs

README

ngboost-rs

A Rust implementation of NGBoost (Natural Gradient Boosting for Probabilistic Prediction).

NGBoost is a modular boosting algorithm that allows you to obtain full probability distributions for your predictions, not just point estimates. This enables uncertainty quantification, prediction intervals, and probabilistic forecasting.

Features

  • Probabilistic Predictions: Get full probability distributions, not just point estimates

  • Multiple Distributions: Support for Normal, Poisson, Gamma, Exponential, Laplace, Weibull, and more

  • Classification Support: Bernoulli and multi-class Categorical distributions

  • Flexible Scoring Rules: LogScore and CRPScore implementations

  • Natural Gradient Boosting: Uses the natural gradient for efficient optimization on probability distribution manifolds

  • Generic Design: Easily extensible with custom distributions and base learners
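To give a feel for the "natural gradient" bullet above: for a Normal distribution parameterized as (loc, log scale), the ordinary gradient of the negative log-likelihood is rescaled by the inverse Fisher information matrix, which makes step sizes comparable across the distribution's parameter manifold. A minimal, dependency-free sketch of that rescaling (the parameterization follows the usual NGBoost convention, but the function names here are illustrative, not the crate's API):

```rust
/// Ordinary gradient of the Normal negative log-likelihood
/// with respect to (mu, log_sigma) at observation y.
fn nll_grad(mu: f64, log_sigma: f64, y: f64) -> (f64, f64) {
    let sigma = log_sigma.exp();
    let z = (y - mu) / sigma;
    // d/d mu = (mu - y) / sigma^2 ;  d/d log_sigma = 1 - z^2
    ((mu - y) / (sigma * sigma), 1.0 - z * z)
}

/// Natural gradient: inverse Fisher information times the ordinary gradient.
/// For (mu, log_sigma) the Fisher matrix is diag(1/sigma^2, 2).
fn natural_grad(mu: f64, log_sigma: f64, y: f64) -> (f64, f64) {
    let sigma = log_sigma.exp();
    let (g_mu, g_ls) = nll_grad(mu, log_sigma, y);
    (sigma * sigma * g_mu, g_ls / 2.0)
}

fn main() {
    // Note the mu-component of the natural gradient is simply mu - y,
    // independent of the current scale estimate.
    let (n_mu, n_ls) = natural_grad(0.0, 2.0_f64.ln(), 1.0);
    println!("natural gradient: ({n_mu}, {n_ls})");
}
```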

Installation

⚠️ Important: No BLAS backend is enabled by default.

This library relies on a BLAS/LAPACK backend for matrix operations. To ensure cross-platform compatibility (e.g., macOS vs. Windows), no backend is selected by default. You must explicitly enable one of the features below in your Cargo.toml, otherwise the project will fail to link.

Add this to your Cargo.toml:

[dependencies]
# Replace "openblas" with "accelerate" (macOS) or "intel-mkl" (Windows/Linux) as needed
ngboost-rs = { version = "0.1", features = ["openblas"] }
ndarray = "0.15"

Platform-Specific BLAS Backend

Choose a BLAS/LAPACK backend based on your platform:

macOS (Accelerate - recommended)

[dependencies]
ngboost-rs = { version = "0.1", features = ["accelerate"] }

Linux (OpenBLAS)

First install OpenBLAS:

# Ubuntu/Debian
sudo apt-get install libopenblas-dev

# Fedora
sudo dnf install openblas-devel

# Arch
sudo pacman -S openblas

Then in Cargo.toml:

[dependencies]
ngboost-rs = { version = "0.1", features = ["openblas"] }

Windows (OpenBLAS or Intel MKL)

On Windows, you can either build OpenBLAS from source or use Intel MKL, which is usually easier:

[dependencies]
ngboost-rs = { version = "0.1", features = ["intel-mkl"] }

Intel MKL (any platform - high performance)

The same intel-mkl feature also works on Linux and macOS. Note: Intel MKL requires the MKL libraries to be installed.

Quick Start

Regression Example

use ndarray::{Array1, Array2};
use ngboost_rs::dist::Normal;
use ngboost_rs::learners::StumpLearner;
use ngboost_rs::ngboost::NGBoost;
use ngboost_rs::scores::LogScore;

fn main() {
    // Small synthetic training set -- replace with your own data
    let x_train: Array2<f64> =
        Array2::from_shape_vec((4, 1), vec![1.0, 2.0, 3.0, 4.0]).unwrap();
    let y_train: Array1<f64> = Array1::from_vec(vec![1.1, 1.9, 3.2, 3.8]);
    let x_test: Array2<f64> = Array2::from_shape_vec((2, 1), vec![1.5, 3.5]).unwrap();

    // Create and train the model
    let mut model: NGBoost<Normal, LogScore, StumpLearner> =
        NGBoost::new(100, 0.1, StumpLearner);

    model.fit(&x_train, &y_train).expect("Failed to fit");

    // Make point predictions
    let predictions = model.predict(&x_test);
    println!("Point predictions: {:?}", predictions);

    // Get full predicted distributions (with uncertainty)
    let pred_dist = model.pred_dist(&x_test);
    println!("Predicted mean: {:?}", pred_dist.loc);
    println!("Predicted std:  {:?}", pred_dist.scale);
}

Classification Example

use ngboost_rs::dist::{Bernoulli, ClassificationDistn}; // ClassificationDistn provides class_probs()
use ngboost_rs::learners::StumpLearner;
use ngboost_rs::ngboost::NGBoost;
use ngboost_rs::scores::LogScore;

// Binary classification: y_train holds 0.0/1.0 labels,
// with x_train/x_test prepared as in the regression example
let mut model: NGBoost<Bernoulli, LogScore, StumpLearner> =
    NGBoost::new(50, 0.1, StumpLearner);

model.fit(&x_train, &y_train).expect("Failed to fit");

// Get class predictions
let predictions = model.predict(&x_test);

// Get class probabilities
let pred_dist = model.pred_dist(&x_test);
let probabilities = pred_dist.class_probs();  // Shape: (n_samples, n_classes)

Available Distributions

Regression Distributions

| Distribution | Parameters | Use Case |
|---|---|---|
| Normal | loc, scale | General continuous data |
| NormalFixedVar | loc | When variance is known/fixed |
| NormalFixedMean | scale | When mean is known/fixed |
| LogNormal | loc, scale | Positive, right-skewed data |
| Exponential | scale | Waiting times, survival |
| Gamma | shape, rate | Positive continuous data |
| Poisson | rate | Count data |
| Laplace | loc, scale | Heavy-tailed data |
| Weibull | shape, scale | Survival analysis |
| HalfNormal | scale | Positive data near zero |
| StudentT | loc, scale, df | Heavy tails, robust |
| TFixedDf | loc, scale | T with fixed df=3 |
| Cauchy | loc, scale | Very heavy tails |

Classification Distributions

| Distribution | Parameters | Use Case |
|---|---|---|
| Bernoulli | 1 logit | Binary classification |
| `Categorical<K>` | K-1 logits | K-class classification |
| Categorical3 | 2 logits | 3-class classification |
| Categorical10 | 9 logits | 10-class (e.g., digits) |
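The K-1 logit counts in the table come from a common identifiability trick: one reference class is pinned at logit 0, and probabilities are recovered with a softmax over [0, l₁, …, l₍K−1₎]. A dependency-free sketch of that mapping (illustrative only, not the crate's internal code):

```rust
/// Map K-1 free logits to K class probabilities, with class 0 as the
/// reference class pinned at logit 0.
fn class_probs_from_logits(logits: &[f64]) -> Vec<f64> {
    // Prepend the reference logit, then apply a numerically stable softmax.
    let full: Vec<f64> = std::iter::once(0.0).chain(logits.iter().copied()).collect();
    let max = full.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
    let exps: Vec<f64> = full.iter().map(|l| (l - max).exp()).collect();
    let sum: f64 = exps.iter().sum();
    exps.into_iter().map(|e| e / sum).collect()
}

fn main() {
    // Two zero logits -> three equally likely classes.
    println!("{:?}", class_probs_from_logits(&[0.0, 0.0]));
}
```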

Multivariate Distributions

| Distribution | Parameters | Use Case |
|---|---|---|
| `MultivariateNormal<P>` | P*(P+3)/2 | Multi-output regression |

Scoring Rules

NGBoost supports different scoring rules for training:

  • LogScore: Negative log-likelihood (default, most common)
  • CRPScore: Continuous Ranked Probability Score (proper scoring rule)

use ngboost_rs::scores::{LogScore, CRPScore};

// Using LogScore (default)
let model: NGBoost<Normal, LogScore, StumpLearner> = NGBoost::new(100, 0.1, StumpLearner);

// Using CRPScore
let model: NGBoost<Normal, CRPScore, StumpLearner> = NGBoost::new(100, 0.1, StumpLearner);
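For the Normal distribution, the CRPS has a closed form, CRPS(N(μ, σ²), y) = σ[z(2Φ(z) − 1) + 2φ(z) − 1/√π] with z = (y − μ)/σ, which is part of what makes it practical as a training objective. A standalone sketch of that formula (using an Abramowitz-Stegun erf approximation since std Rust has no erf; the function names are illustrative, not the crate's API):

```rust
/// erf approximation (Abramowitz & Stegun 7.1.26), |error| < 1.5e-7.
fn erf(x: f64) -> f64 {
    let sign = x.signum();
    let x = x.abs();
    let t = 1.0 / (1.0 + 0.3275911 * x);
    let poly = t * (0.254829592
        + t * (-0.284496736 + t * (1.421413741 + t * (-1.453152027 + t * 1.061405429))));
    sign * (1.0 - poly * (-x * x).exp())
}

/// CRPS of a Normal(mu, sigma) forecast against observation y (closed form).
fn crps_normal(mu: f64, sigma: f64, y: f64) -> f64 {
    let z = (y - mu) / sigma;
    let pdf = (-0.5 * z * z).exp() / (2.0 * std::f64::consts::PI).sqrt();
    let cdf = 0.5 * (1.0 + erf(z / std::f64::consts::SQRT_2));
    sigma * (z * (2.0 * cdf - 1.0) + 2.0 * pdf - 1.0 / std::f64::consts::PI.sqrt())
}

fn main() {
    // At y == mu the score reduces to sigma * (2/sqrt(2*pi) - 1/sqrt(pi)),
    // roughly 0.2337 * sigma.
    println!("{:.4}", crps_normal(0.0, 1.0, 0.0));
}
```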

Uncertainty Quantification

One of the key advantages of NGBoost is uncertainty estimation:

let pred_dist = model.pred_dist(&x_test);

// For Normal distribution
for i in 0..n_samples {
    let mean = pred_dist.loc[i];
    let std = pred_dist.scale[i];
    
    // 95% confidence interval
    let ci_lower = mean - 1.96 * std;
    let ci_upper = mean + 1.96 * std;
    
    println!("Prediction: {:.2} [{:.2}, {:.2}]", mean, ci_lower, ci_upper);
}
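A quick way to sanity-check such intervals is empirical coverage: on held-out data, the fraction of targets falling inside their predicted interval should be close to the nominal 95%. A dependency-free helper along these lines (hypothetical, not part of the crate's API):

```rust
/// Fraction of observations falling inside mean +/- z * std.
fn empirical_coverage(means: &[f64], stds: &[f64], ys: &[f64], z: f64) -> f64 {
    let n = ys.len();
    let mut inside = 0;
    for i in 0..n {
        if (ys[i] - means[i]).abs() <= z * stds[i] {
            inside += 1;
        }
    }
    inside as f64 / n as f64
}

fn main() {
    let means = [0.0, 1.0, 2.0, 3.0];
    let stds = [1.0, 1.0, 1.0, 1.0];
    let ys = [0.5, 1.2, 4.5, 2.9]; // third point lies outside its interval
    println!("{}", empirical_coverage(&means, &stds, &ys, 1.96));
}
```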

Examples

Run the examples to see NGBoost in action:

# Basic regression
cargo run --example regression

# Binary classification
cargo run --example classification

# Comparing different distributions
cargo run --example distributions

# Uncertainty quantification
cargo run --example uncertainty

API Reference

NGBoost

impl<D, S, B> NGBoost<D, S, B>
where
    D: Distribution + Scorable<S> + Clone,
    S: Score,
    B: BaseLearner + Clone,
{
    /// Create a new NGBoost model
    /// - n_estimators: Number of boosting iterations
    /// - learning_rate: Step size for updates (typically 0.01-0.1)
    /// - base_learner: The base learner to use
    pub fn new(n_estimators: usize, learning_rate: f64, base_learner: B) -> Self;

    /// Fit the model to training data
    pub fn fit(&mut self, x: &Array2<f64>, y: &Array1<f64>) -> Result<(), &'static str>;

    /// Make point predictions
    pub fn predict(&self, x: &Array2<f64>) -> Array1<f64>;

    /// Get predicted probability distributions
    pub fn pred_dist(&self, x: &Array2<f64>) -> D;
}

Distribution Trait

pub trait Distribution: Sized + Clone + Debug {
    /// Create distribution from parameters
    fn from_params(params: &Array2<f64>) -> Self;
    
    /// Fit initial parameters from data
    fn fit(y: &Array1<f64>) -> Array1<f64>;
    
    /// Number of distribution parameters
    fn n_params(&self) -> usize;
    
    /// Point prediction (e.g., mean)
    fn predict(&self) -> Array1<f64>;
}
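To illustrate the extension point, here is a self-contained sketch of implementing a one-parameter distribution against a simplified stand-in for the trait above. To keep the sketch dependency-free, `Vec<f64>` replaces the ndarray types used by the real crate, and the `ConstVarNormal` type is hypothetical (a Normal with variance fixed at 1):

```rust
use std::fmt::Debug;

// Simplified stand-in for the crate's Distribution trait:
// Vec<f64> replaces the Array1/Array2 types of the real API.
trait Distribution: Sized + Clone + Debug {
    fn from_params(params: &[Vec<f64>]) -> Self;
    fn fit(y: &[f64]) -> Vec<f64>;
    fn n_params(&self) -> usize;
    fn predict(&self) -> Vec<f64>;
}

/// Hypothetical Normal with variance fixed at 1: its only parameter is loc.
#[derive(Clone, Debug)]
struct ConstVarNormal {
    loc: Vec<f64>,
}

impl Distribution for ConstVarNormal {
    fn from_params(params: &[Vec<f64>]) -> Self {
        // One parameter row per sample: the location.
        Self { loc: params.iter().map(|row| row[0]).collect() }
    }

    fn fit(y: &[f64]) -> Vec<f64> {
        // Initial parameter: the marginal mean of the targets.
        vec![y.iter().sum::<f64>() / y.len() as f64]
    }

    fn n_params(&self) -> usize {
        1
    }

    fn predict(&self) -> Vec<f64> {
        // Point prediction is the mean, i.e. loc.
        self.loc.clone()
    }
}

fn main() {
    let init = ConstVarNormal::fit(&[1.0, 2.0, 3.0]);
    let dist = ConstVarNormal::from_params(&[vec![1.5], vec![2.5]]);
    println!("{:?} {:?}", init, dist.predict());
}
```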

Performance Tips

  1. Learning Rate: Start with 0.1 and decrease if overfitting
  2. Number of Estimators: More is usually better, but watch for overfitting
  3. Distribution Choice: Match the distribution to your data characteristics
  4. Natural Gradient: Enabled by default, provides faster convergence
  5. Release Mode: Always use cargo build --release for production - it's significantly faster

Building for Performance

For best performance, always compile in release mode:

cargo build --release
cargo run --release --example regression --features accelerate

The release profile includes:

  • Full optimizations (opt-level = 3)
  • Link-time optimization (lto = "thin")

Debug builds are intentionally slower but compile faster during development.
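If your own project does not already set these, the two release-profile bullets above correspond to a Cargo.toml section like the following (a typical sketch; adjust to taste):

```toml
[profile.release]
opt-level = 3   # full optimizations
lto = "thin"    # link-time optimization across crates
```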

Comparison with Python NGBoost

This Rust implementation aims to be compatible with the Python NGBoost library:

| Feature | Python | Rust |
|---|---|---|
| Core Algorithm | Yes | Yes |
| Natural Gradient | Yes | Yes |
| LogScore | Yes | Yes |
| CRPScore | Yes | Yes (Normal, Laplace) |
| Regression Distributions | 16 | 16 |
| Classification | Yes | Yes |
| Survival/Censoring | Yes | Not yet |
| Scikit-learn Integration | Yes | N/A |

Distribution Parity

All Python distributions have been ported:

  • Normal, NormalFixedVar, NormalFixedMean
  • LogNormal, Exponential, Gamma, Poisson
  • Laplace, Weibull, HalfNormal
  • StudentT, TFixedDf, TFixedDfFixedVar
  • Cauchy, CauchyFixedVar
  • Bernoulli, Categorical (k-class)
  • MultivariateNormal

Contributing

Contributions are welcome! Please feel free to submit issues and pull requests.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

This is a Rust port of the excellent NGBoost Python library developed by the Stanford ML Group.

Commit count: 21
