crates.io: wc_fir v0.3.1 | repository: https://github.com/noahbclarkson/wc_fir | author: Noah Clarkson (noahbclarkson)

wc_fir

A pure-Rust library for modeling working capital drivers using Finite Impulse Response (FIR) filters


Overview

wc_fir is a specialized library for financial modeling that applies Finite Impulse Response (FIR) filters to working capital drivers. It supports three complementary approaches:

  • Manual Mode: Apply user-defined FIR profiles with explicit tap weights and scaling factors
  • Auto Mode (OLS): Automatically estimate FIR taps from historical data using Ordinary Least Squares regression
  • Auto Mode (fit_auto): Build a maximal design, run automatic lag selection (Lasso + TS-CV by default), then refit with OLS for interpretable coefficients and diagnostics

The library is designed for analysts and financial engineers who need to model time-lagged relationships between drivers (e.g., revenue, production volume) and working capital components (e.g., accounts receivable, inventory).

Key Features

  • 🦀 Pure Rust: No external dependencies (BLAS, LAPACK, MKL) required
  • 🌍 Cross-platform: Works seamlessly on Windows, Linux, and macOS
  • 📊 Dual modes: Manual profiles for known relationships, OLS for data-driven discovery
  • 🧠 Automatic lag selection: Lasso with time-series CV by default, plus BIC and correlation-screen strategies
  • 🎯 Flexible: Supports multiple drivers, varying lag lengths, and intercept terms
  • 🔧 Regularization: Built-in ridge regression for handling collinearity
  • Constraints: Optional non-negativity enforcement for interpretable coefficients
  • 📈 Metrics: Automatic RMSE and R² calculation for model quality assessment

Installation

Add wc_fir to your Cargo.toml:

[dependencies]
wc_fir = "0.3.1"

Or use cargo add:

cargo add wc_fir

Requirements

  • Rust 1.70 or newer
  • No system dependencies required

Quick Start

Example 1: Manual Mode

When you know the relationship between drivers and working capital (e.g., from industry standards or prior analysis):

use wc_fir::{manual_apply, ManualProfile};

fn main() {
    // Historical driver series
    let revenue = vec![100.0, 110.0, 120.0, 130.0, 140.0, 150.0];
    let production = vec![80.0, 85.0, 90.0, 95.0, 100.0, 105.0];

    // Define FIR profiles:
    // - Revenue impact: 60% immediate, 30% lag-1, 10% lag-2, scaled by 0.9
    // - Production impact: 50% immediate, 50% lag-1, scaled by 0.4
    let profiles = vec![
        ManualProfile {
            percentages: vec![0.6, 0.3, 0.1],
            scale: 0.9,
        },
        ManualProfile {
            percentages: vec![0.5, 0.5],
            scale: 0.4,
        },
    ];

    // Generate synthetic working capital balance
    let balance = manual_apply(&[revenue, production], &profiles).unwrap();

    println!("Working capital balance: {:?}", balance);
}

Example 2: Auto Mode (OLS)

When you want to discover the relationships from historical data:

use wc_fir::{fit_ols, OlsOptions};

fn main() {
    // Historical drivers
    let revenue = vec![120.0, 125.0, 130.0, 128.0, 140.0, 150.0, 160.0, 170.0];
    let production = vec![80.0, 82.0, 85.0, 90.0, 95.0, 98.0, 100.0, 105.0];

    // Historical working capital (target to fit)
    let actual_wc = vec![95.0, 100.0, 108.0, 115.0, 120.0, 130.0, 140.0, 155.0];

    // Fit OLS model with 3 lags for revenue, 2 lags for production
    let fit = fit_ols(
        &[revenue, production],
        &actual_wc,
        &[3, 2],  // Lag lengths per driver
        &OlsOptions::default(),
    ).unwrap();

    // Inspect results
    println!("Model quality:");
    println!("  RMSE: {:.2}", fit.rmse);
    println!("  R²:   {:.4}", fit.r2);

    println!("\nPer-driver impact:");
    for (i, (scale, percentages)) in fit.per_driver.iter().enumerate() {
        println!("  Driver {}: scale = {:.3}, taps = {:?}", i, scale, percentages);
    }
}

Example 3: Automatic Lag Selection (fit_auto)

When you do not know the lag structure up front, fit_auto builds a maximal design matrix, runs a coordinate-descent Lasso path with rolling cross-validation, prunes the selection with built-in guardrails, and then refits with OLS on the selected columns.

use wc_fir::fit_auto;

fn main() {
    let driver_a = vec![1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7];
    let driver_b = vec![0.8, 0.82, 0.85, 0.9, 0.95, 0.98, 1.0, 1.05];
    let target = vec![0.6, 0.68, 0.74, 0.8, 0.86, 0.92, 0.98, 1.04];

    let fit = fit_auto(&[driver_a, driver_b], &target).unwrap();
    println!("RMSE: {:.4}  R²: {:.4}", fit.rmse, fit.r2);
    for (idx, (scale, taps)) in fit.per_driver.iter().enumerate() {
        println!("Driver {idx}: scale={scale:.3}, taps={taps:?}");
    }
}

Why do some percentage vectors look like [1.0, 0.0]? OlsOptions::default() sets nonnegative = true, so negative tap weights are clipped to zero and the remaining weights are renormalised to sum to one. If you prefer the raw signed coefficients, disable the flag and inspect fit.coeffs.

Choosing the Right Strategy

wc_fir provides three complementary approaches to lag selection:

1. Manual (you pick lag lengths)

fit_ols(&drivers, &target, &[3, 2], &OlsOptions::default())
  • ✅ Full control over lag structure
  • ✅ Contiguous blocks [0..L-1] per driver
  • ✅ Interpretable: percentages show how impact spreads over time
  • ❌ Requires domain knowledge to choose L

Use when: You know the lag structure from industry standards or prior analysis.

2. Sparse Auto (Lasso, BIC, Screening)

fit_auto(&drivers, &target)  // Uses Lasso by default
  • ✅ Selects individual lags (e.g., only lag-2)
  • ✅ Good for sparse/noisy relationships
  • ✅ Automatic feature selection
  • ❌ Can produce "spiky" patterns unlike manual OLS
  • ❌ May select only intercept with small data

Use when: You want aggressive feature selection or have many potential drivers.

3. Prefix CV (automatic block lag selection)

fit_auto_prefix(&drivers, &target, None)
  • ✅ Automatically chooses L via rolling CV
  • ✅ Maintains contiguous blocks [0..L-1] (like manual)
  • ✅ Produces models that behave like manual OLS
  • ✅ No hand-tuning required
  • ❌ Slower than manual (searches over L)

Use when: You want "automatic Manual OLS" - interpretable lag spreads without hand-picking lengths.

Quick Comparison

Approach Lag Structure Example Taps Intercept Best For
Manual OLS Contiguous [0..L-1] you pick [0.5, 0.35, 0.15] Small Known lag patterns
Sparse (Lasso) Individual lags selected [0, 0, 1.0] Large Feature selection
Prefix CV Contiguous [0..L-1] auto [0.24, 0, 0.76] Medium Auto + interpretable

See examples/ar_comparison.rs for a detailed comparison on real AR data.

Detailed Usage

Understanding FIR Profiles

A FIR profile defines how a driver influences working capital over time:

$$\text{Balance}(t) = \sum_{k} \text{scale}_k \times \sum_{j} \text{percentage}_{k,j} \times \text{Driver}_k(t - j)$$

  • Scale: Overall magnitude of the driver's impact
  • Percentages: Distribution of impact across time lags (typically sum to 1.0)
    • percentages[0]: Immediate impact (lag 0)
    • percentages[1]: One-period lag
    • percentages[2]: Two-period lag, etc.

Example Interpretation

ManualProfile {
    percentages: vec![0.5, 0.35, 0.15],
    scale: 0.8,
}

This means:

  • Total impact = 80% of driver value (scale = 0.8)
  • 50% appears immediately (lag 0)
  • 35% appears one period later (lag 1)
  • 15% appears two periods later (lag 2)
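The arithmetic above can be checked with a few lines of standalone Rust. `apply_profile` here is an illustrative helper, not part of the wc_fir API:

```rust
// Illustrative helper (not part of the wc_fir API): apply one FIR profile.
fn apply_profile(driver: &[f64], percentages: &[f64], scale: f64) -> Vec<f64> {
    (0..driver.len())
        .map(|t| {
            percentages
                .iter()
                .enumerate()
                .filter(|(j, _)| *j <= t) // lags reaching before t=0 contribute nothing
                .map(|(j, p)| scale * p * driver[t - j])
                .sum()
        })
        .collect()
}

fn main() {
    // Constant driver of 100 with the profile above:
    // t=0 sees only the lag-0 tap (0.8 * 0.5 * 100 = 40),
    // t=1 adds the lag-1 tap (+28), t=2 the lag-2 tap (+12) for the full 80.
    println!("{:?}", apply_profile(&[100.0; 3], &[0.5, 0.35, 0.15], 0.8));
}
```

Once all lags have history (t ≥ 2 here), each output is exactly scale × driver value, because the taps sum to one.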

Interpreting scales versus percentages

  • scale is the gain applied to a driver after selection. Because the tap weights sum to one, a scale of 0.8 means the driver contributes 80% of its value in total, spread across the lags.
  • The percentages vector distributes that gain across time. The entries always sum to one unless the driver’s scale is zero.
  • With OlsOptions { nonnegative: true, .. } (the default), negative tap weights are clipped to zero and the remaining entries are renormalised. That is why you might see [1.0, 0.0] when the second lag would otherwise be negative.
  • Disable the constraint (nonnegative: false) if you need the signed coefficients for further processing.

OLS Fitting Options

The OlsOptions struct provides fine-grained control over the fitting process:

use wc_fir::OlsOptions;

let opts = OlsOptions {
    // Add intercept term (baseline offset independent of drivers)
    intercept: true,

    // Ridge regularization strength (0.0 = none, higher = more shrinkage)
    ridge_lambda: 0.1,

    // Force all tap percentages to be non-negative
    nonnegative: true,
};

When to Use Each Option

Option Use When Example Scenario
intercept: true There's a baseline level independent of drivers Fixed overhead costs
ridge_lambda > 0 Drivers are correlated or data is limited Revenue and units sold move together
nonnegative: true Only positive relationships make sense Inventory can't decrease with sales

Working with Multiple Drivers

use wc_fir::{fit_ols, OlsOptions, manual_apply, ManualProfile};

// Scenario: Model accounts receivable based on:
// - Sales revenue (3-month impact window)
// - Credit sales ratio (2-month impact window)
// - Customer count (2-month impact window)

let sales_revenue = vec![100.0, 105.0, 110.0, 108.0, 115.0, 120.0, 125.0, 130.0];
let credit_ratio = vec![0.65, 0.68, 0.70, 0.72, 0.70, 0.68, 0.71, 0.73];
let customers = vec![450.0, 460.0, 470.0, 465.0, 480.0, 490.0, 500.0, 510.0];

let actual_ar = vec![65.0, 71.0, 77.0, 78.0, 81.0, 82.0, 89.0, 95.0];

// Fit model
let fit = fit_ols(
    &[sales_revenue, credit_ratio, customers],
    &actual_ar,
    &[3, 2, 2],  // Lag structure
    &OlsOptions {
        intercept: true,
        ridge_lambda: 0.05,  // Light regularization
        nonnegative: true,   // AR increases with these drivers
    },
).unwrap();

// Convert to manual profiles for forecasting
let profiles: Vec<ManualProfile> = fit.per_driver
    .into_iter()
    .map(|(scale, percentages)| ManualProfile { percentages, scale })
    .collect();

// Now use profiles for forecasting new periods

Handling Insufficient Data

The library automatically handles burn-in periods based on the maximum lag:

let drivers = vec![vec![1.0, 2.0, 3.0, 4.0, 5.0]];
let target = vec![10.0, 20.0, 30.0, 40.0, 50.0];

// With lag length L=3, the first L-1 = 2 periods are burn-in
// Effective fit window: periods 3-5 (indices 2-4)
let fit = fit_ols(&drivers, &target, &[3], &OlsOptions::default());

match fit {
    Ok(result) => {
        println!("Fitted {} rows after burn-in", result.n_rows);
    }
    Err(e) => {
        eprintln!("Error: {}", e);
    }
}

How It Works

Manual Mode: FIR Application

Manual mode applies a causal FIR filter to each driver:

  1. For each time period t:
    • Sum contributions from current period: scale * percentage[0] * driver[t]
    • Add lagged contributions: scale * percentage[j] * driver[t-j] for j > 0
  2. Aggregate across all drivers to produce final balance

This is a causal convolution: every lag references a historical driver value, never a future one.
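The two steps can be sketched as a standalone loop. `fir_balance` is an illustrative stand-in for what manual mode computes, not the library's actual implementation:

```rust
// Illustrative stand-in for manual mode: sum causal FIR contributions
// from every driver into one balance series.
fn fir_balance(drivers: &[Vec<f64>], profiles: &[(f64, Vec<f64>)]) -> Vec<f64> {
    let t_len = drivers[0].len();
    let mut balance = vec![0.0; t_len];
    for (driver, (scale, taps)) in drivers.iter().zip(profiles) {
        for t in 0..t_len {
            for (j, tap) in taps.iter().enumerate() {
                if j <= t {
                    // Lag j references the driver value j periods ago.
                    balance[t] += scale * tap * driver[t - j];
                }
            }
        }
    }
    balance
}

fn main() {
    let drivers = vec![vec![100.0, 110.0], vec![80.0, 85.0]];
    let profiles = vec![(0.9, vec![0.6, 0.4]), (0.4, vec![1.0])];
    println!("{:?}", fir_balance(&drivers, &profiles));
}
```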

Auto Mode: OLS Estimation

Auto mode constructs a regression problem and solves it using Linfa (pure Rust ML):

Step 1: Design Matrix Construction

Build matrix X where each row represents a time period (after burn-in), and columns are:

[D1(t), D1(t-1), ..., D1(t-L1+1), D2(t), D2(t-1), ..., D2(t-L2+1), ...]

Example with 2 drivers, lags [2, 2]:

       D1(t)  D1(t-1)  D2(t)  D2(t-1)
t=1    100    95       50     48
t=2    105    100      52     50
t=3    110    105      55     52
...
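The layout can be reproduced with a small standalone builder (rows as `Vec<Vec<f64>>` for clarity; the crate itself works with ndarray, so `design_matrix` here is illustrative only):

```rust
// Illustrative builder: lagged design matrix after the burn-in period.
// Columns for driver k are [Dk(t), Dk(t-1), ..., Dk(t-Lk+1)].
fn design_matrix(drivers: &[Vec<f64>], lags: &[usize]) -> Vec<Vec<f64>> {
    let max_lag = *lags.iter().max().unwrap();
    // The first (max_lag - 1) periods lack full lag history: burn-in.
    (max_lag - 1..drivers[0].len())
        .map(|t| {
            drivers
                .iter()
                .zip(lags)
                .flat_map(|(d, &l)| (0..l).map(move |j| d[t - j]))
                .collect()
        })
        .collect()
}

fn main() {
    let d1 = vec![1.0, 2.0, 3.0, 4.0];
    let d2 = vec![10.0, 20.0, 30.0, 40.0];
    // Rows start at index 1 because the maximum lag length is 2.
    for row in design_matrix(&[d1, d2], &[2, 2]) {
        println!("{:?}", row);
    }
}
```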

Step 2: OLS Regression

Solve: $\min_\beta \|X\beta - y\|^2$ where $y$ is the target working capital series.

  • If intercept=true: Linfa adds column of ones automatically

  • If ridge_lambda > 0: Apply data augmentation (Tikhonov trick):

    $$ X_{\text{aug}} = \begin{bmatrix} X \\ \sqrt{\lambda}I \end{bmatrix}, \quad y_{\text{aug}} = \begin{bmatrix} y \\ 0 \end{bmatrix} $$

    This transforms ridge regression into OLS: $\min_\beta \|X_{\text{aug}}\beta - y_{\text{aug}}\|^2$
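The augmentation itself is mechanical; a standalone sketch with plain `Vec` matrices (illustrative, not the crate's internal ndarray code):

```rust
// Illustrative sketch of the Tikhonov trick:
// stack sqrt(lambda) * I under X and zeros under y, then solve ordinary OLS.
fn augment(x: &[Vec<f64>], y: &[f64], lambda: f64) -> (Vec<Vec<f64>>, Vec<f64>) {
    let p = x[0].len();
    let s = lambda.sqrt();
    let mut x_aug = x.to_vec();
    let mut y_aug = y.to_vec();
    for i in 0..p {
        let mut row = vec![0.0; p];
        row[i] = s; // sqrt(lambda) on the diagonal
        x_aug.push(row);
        y_aug.push(0.0); // zero target for each penalty row
    }
    (x_aug, y_aug)
}

fn main() {
    let (x_aug, y_aug) = augment(&[vec![1.0, 2.0]], &[3.0], 4.0);
    println!("{:?}", x_aug); // original row plus penalty rows [2, 0] and [0, 2]
    println!("{:?}", y_aug); // [3.0, 0.0, 0.0]
}
```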

Step 3: Coefficient Mapping

Map raw coefficients β back to per-driver profiles:

  1. Partition β into driver blocks: [β1_0, ..., β1_L1-1, β2_0, ..., β2_L2-1, ...]
  2. For each driver block:
    • Scale = sum of coefficients in block
    • Percentages = each coefficient / scale (normalized tap weights)
  3. If nonnegative=true: Clip negatives to zero and renormalize
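The mapping can be sketched for a single driver block as follows. This is one plausible ordering of the clip-and-renormalise step; the crate's exact implementation may differ:

```rust
// Illustrative mapping for one driver's coefficient block
// (one plausible ordering; the crate's exact clipping order may differ).
fn to_profile(block: &[f64], nonnegative: bool) -> (f64, Vec<f64>) {
    // Scale = sum of the block; percentages = normalised tap weights.
    let scale: f64 = block.iter().sum();
    let mut percentages: Vec<f64> = if scale != 0.0 {
        block.iter().map(|c| c / scale).collect()
    } else {
        vec![0.0; block.len()]
    };
    if nonnegative {
        // Clip negative taps to zero, then renormalise to sum to one.
        for p in percentages.iter_mut() {
            if *p < 0.0 {
                *p = 0.0;
            }
        }
        let total: f64 = percentages.iter().sum();
        if total > 0.0 {
            for p in percentages.iter_mut() {
                *p /= total;
            }
        }
    }
    (scale, percentages)
}

fn main() {
    // A negative lag-1 coefficient collapses to [1.0, 0.0], as noted earlier.
    let (scale, pct) = to_profile(&[0.5, -0.1], true);
    println!("scale={scale:.3}, taps={pct:?}"); // scale=0.400, taps=[1.0, 0.0]
}
```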

Step 4: Quality Metrics

  • RMSE: $\text{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2}$
  • R²: $R^2 = 1 - \frac{SS_{\text{res}}}{SS_{\text{tot}}}$ where:
    • $SS_{\text{res}} = \sum_{i}(y_i - \hat{y}_i)^2$ (sum of squared residuals)
    • $SS_{\text{tot}} = \sum_{i}(y_i - \bar{y})^2$ (total sum of squares)

Ridge Regression Implementation

Instead of using matrix inversion (which requires LAPACK), we use data augmentation:

$$\begin{align*} \text{Original problem:} \quad & \min_\beta \|X\beta - y\|^2 + \lambda\|\beta\|^2 \\ \text{Augmented problem:} \quad & \min_\beta \left\|\begin{bmatrix} X \\ \sqrt{\lambda}I \end{bmatrix}\beta - \begin{bmatrix} y \\ 0 \end{bmatrix}\right\|^2 \end{align*}$$

This allows us to use standard OLS solvers (Linfa's pure Rust QR decomposition) without external dependencies.

API Reference

Core Functions

manual_apply

pub fn manual_apply(
    drivers: &[Vec<f64>],
    profiles: &[ManualProfile],
) -> Result<Vec<f64>, FirError>

Apply manual FIR profiles to drivers.

Arguments:

  • drivers: Slice of driver time series (all must have same length)
  • profiles: Slice of FIR profiles (must match number of drivers)

Returns: Synthetic balance time series (same length as inputs)

Errors:

  • FirError::LengthMismatch if drivers/profiles counts differ or drivers have different lengths
  • FirError::EmptyInput if no drivers provided

fit_ols

pub fn fit_ols(
    drivers: &[Vec<f64>],
    target: &[f64],
    lags: &[Lag],
    opts: &OlsOptions,
) -> Result<OlsFit, FirError>

Estimate FIR taps from historical data using OLS.

Arguments:

  • drivers: Slice of driver time series
  • target: Target working capital series to fit
  • lags: Per-driver lag lengths (e.g., &[3, 2] for 3 lags on driver 1, 2 on driver 2)
  • opts: Fitting options (intercept, ridge, non-negativity)

Returns: OlsFit struct containing:

  • coeffs: Raw FIR coefficients (concatenated)
  • per_driver: Vec of (scale, percentages) tuples
  • rmse: Root mean squared error on fit window
  • r2: R-squared coefficient
  • n_rows: Number of periods used after burn-in

Errors:

  • FirError::LengthMismatch if inputs have mismatched lengths
  • FirError::InsufficientData if series too short for requested lags
  • FirError::Linalg if regression fails (e.g., singular matrix)

Data Types

ManualProfile

pub struct ManualProfile {
    pub percentages: Vec<f64>,  // Tap weights (typically sum to 1.0)
    pub scale: f64,              // Overall scaling factor
}

OlsOptions

pub struct OlsOptions {
    pub intercept: bool,      // Add intercept term (default: false)
    pub ridge_lambda: f64,    // Ridge penalty (default: 0.0)
    pub nonnegative: bool,    // Enforce non-negative taps (default: false)
}

OlsFit

pub struct OlsFit {
    pub coeffs: Vec<f64>,                    // Raw coefficients
    pub per_driver: Vec<(f64, Vec<f64>)>,   // (scale, percentages) per driver
    pub rmse: f64,                           // Root mean squared error
    pub r2: f64,                             // R-squared
    pub n_rows: usize,                       // Rows used in fit
}

FirError

pub enum FirError {
    LengthMismatch,
    InsufficientData { burn_in: usize },
    Linalg(String),
    EmptyInput,
}

Performance Considerations

Computational Complexity

  • Manual mode: O(T × M × L) where T = time periods, M = drivers, L = avg lag length
  • OLS fit: O(T × P² + P³) where P = total parameters (sum of lags)
    • Dominated by QR decomposition: O(T × P²) when T > P (typical case)
    • Matrix formation: O(T × P)

Memory Usage

  • Design matrix: T × P × 8 bytes (f64)
  • Augmented matrix (ridge): (T + P) × P × 8 bytes

For typical working capital models:

  • T = 36-120 periods (3-10 years monthly data)
  • P = 5-20 parameters (2-4 drivers, 2-5 lags each)
  • Memory: < 100 KB
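These estimates follow directly from the matrix shapes; for example (a quick helper, not part of the crate):

```rust
// Quick helper for the design-matrix memory estimate above.
fn design_matrix_bytes(t: usize, p: usize, ridge: bool) -> usize {
    let rows = if ridge { t + p } else { t }; // ridge appends P penalty rows
    rows * p * 8 // f64 = 8 bytes per entry
}

fn main() {
    // 10 years of monthly data (T = 120) with P = 20 parameters:
    println!("{}", design_matrix_bytes(120, 20, false)); // 19200 bytes
    println!("{}", design_matrix_bytes(120, 20, true)); // 22400 bytes
}
```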

Optimization Tips

  1. Use ridge regularization for better numerical stability when:

    • Drivers are highly correlated
    • Limited historical data (T < 2P)
  2. Choose appropriate lag lengths:

    • Too long: Overfitting, higher variance
    • Too short: Underfitting, biased estimates
    • Rule of thumb: L ≤ T/4
  3. Preprocessing:

    • Normalize/standardize drivers if they have very different scales
    • Consider log-transforming drivers with exponential growth

Troubleshooting

Common Errors

"Linalg(NonInvertible)" or singular matrix

Cause: Perfect collinearity in design matrix (e.g., two drivers are identical or linearly dependent)

Solutions:

// Option 1: Use ridge regression
let opts = OlsOptions {
    ridge_lambda: 0.1,
    ..Default::default()
};

// Option 2: Remove or combine collinear drivers
// Option 3: Reduce lag lengths

"InsufficientData"

Cause: Time series too short for requested lags

Solution: Reduce lag lengths or provide more historical data

// Need at least max(lags) periods for burn-in
// For lags=[3, 2], need at least 3 data points

Poor R² despite low RMSE

Cause: Target has low variance or model captures level but not variation

Solution:

  • Check if intercept should be enabled/disabled
  • Verify drivers are actually predictive of target
  • Consider differencing/detrending the series

Examples

Run any of the binaries below with cargo run --example <name> to reproduce the printed diagnostics:

  • manual_mode – apply user-specified taps

    Manual FIR output:
      t=0: 76.400
      t=1: 118.960
      t=2: 141.240
      t=3: 156.930
      t=4: 169.570
      t=5: 180.620
    
  • ols_basic – fit a known lag structure through fit_ols

    Plain OLS coefficients: [0.5927163149316239, 0.20438268126703263, 0.3500195556888653]
    Driver 0: scale=0.797, percentages=[0.7435918471335078, 0.2564081528664922]
    Driver 1: scale=0.350, percentages=[1.0]
    RMSE=0.2074  R2=0.9999
    Intercept=5.5111
    
  • auto_lasso – use fit_auto (Lasso selection + OLS refit)

    Lasso-selected lags: [3, 2]
    Driver 0: scale=0.234, taps=[0.5343219823863578, 0.4656780176136423, 0.0]
    Driver 1: scale=1.074, taps=[0.40716722319196963, 0.5928327768080305]
    RMSE=0.1956  R2=0.9998
    CV RMSE=13.2128
    Intercept=12.2296
    
  • ic_bic – grid search using the BIC criterion

    BIC-selected lags: [0, 1]
    Driver 0: scale=0.000, taps=[]
    Driver 1: scale=1.420, taps=[1.0]
    RMSE=0.021724  R2=0.999478
    Intercept=0.453319
    
  • screening – correlation screen followed by BIC prune

    Screening kept 1 taps
    Driver 0: scale=1.038, taps=[1.0]
    Driver 1: scale=0.000, taps=[]
    Driver 2: scale=0.000, taps=[]
    RMSE=0.004169  R2=0.999986
    CV RMSE=N/A
    Intercept=0.555980
    

Technical Details

Pure Rust Implementation

This library uses Linfa (Rust ML toolkit) for linear regression, which provides:

  • Pure Rust QR decomposition (no BLAS/LAPACK required)
  • Cross-platform compatibility (Windows, Linux, macOS, even WebAssembly)
  • Predictable performance without external library dependencies

Mathematical Foundation

The FIR model is a discrete-time linear time-invariant (LTI) system:

$$y[t] = \sum_{k=1}^{M} \left(\text{scale}_k \times \sum_{j=0}^{L_k-1} h_k[j] \times \text{driver}_k[t-j]\right)$$

Where:

  • $h_k[j]$ are the FIR tap weights (percentages)
  • $\text{scale}_k$ is the driver-specific gain
  • $M$ is the number of drivers
  • $L_k$ is the lag length for driver $k$

This is equivalent to a distributed lag model in econometrics or a convolutional layer in deep learning (without bias or activation).

Contributing

Contributions are welcome! Areas of interest:

  • Additional regularization methods (e.g., elastic net)
  • Time-varying coefficients
  • Seasonal adjustment utilities
  • More comprehensive examples

Please open an issue to discuss significant changes before submitting a PR.

License

Licensed under either of:

at your option.

Acknowledgments

Built with:

  • ndarray - N-dimensional arrays for Rust
  • Linfa - Pure Rust machine learning toolkit
  • thiserror - Error handling

Inspired by classical signal processing FIR filters and econometric distributed lag models.


Made with ❤️ in Rust | Report Issues | Documentation
