Crates.io | ppca |
lib.rs | ppca |
version | 0.5.0 |
source | src |
created_at | 2022-12-16 20:30:42.685834 |
updated_at | 2024-08-10 16:41:52.735373 |
description | Rust implementation of the Probabilistic Principal Component Analysis model |
homepage | https://github.com/findHotel/ppca_rs |
repository | https://github.com/findHotel/ppca_rs |
id | 739170 |
size | 102,543 |
This project provides a PPCA (Probabilistic Principal Component Analysis) model written in Rust, with Python bindings built using pyO3 and maturin.
This package is available on PyPI!
pip install ppca-rs
And you can also use it natively in Rust:
cargo add ppca
Glad you asked!
Why use ppca-rs?
That's an easy one!
It's written in Rust, with only a bit of Python glue on top. You can expect performance in the same league as C code.
It uses rayon to parallelize computations evenly across as many CPUs as you have.
It also uses fancy Linear Algebra Trickery Technology to reduce computational complexity in key bottlenecks.
Battle-tested at Vio.com with some ridiculously huge datasets.
import numpy as np
from ppca_rs import Dataset, PPCATrainer, PPCAModel
samples: np.ndarray
# Create your dataset from a rank 2 np.ndarray, where each row is a sample.
# Use non-finite values (`inf`s and `nan`) to signal masked values
dataset = Dataset(samples)
# Train the model (convenient edition!):
model: PPCAModel = PPCATrainer(dataset).train(state_size=10, n_iters=10)
# And now, here is a free sample of what you can do:
# Extrapolates the missing values with the most probable values:
extrapolated: Dataset = model.extrapolate(dataset)
# Smooths (removes noise from) samples and fills in missing values:
extrapolated: Dataset = model.filter_extrapolate(dataset)
# ... go back to numpy:
extrapolated_np = extrapolated.numpy()
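The usage snippet above expects a rank 2 array in which non-finite values mark the masked entries. As a minimal sketch of how such an array might be prepared with plain NumPy, the hypothetical helper `mask_missing` below (not part of ppca-rs) turns a table containing `None` placeholders into a float array with `nan` in the missing positions:

```python
import numpy as np


def mask_missing(rows, missing=None):
    """Build a rank 2 float array where `missing` entries become NaN.

    Hypothetical helper for illustration only; it is not part of ppca-rs.
    """
    arr = np.array(
        [[np.nan if value is missing else float(value) for value in row] for row in rows],
        dtype=np.float64,
    )
    if arr.ndim != 2:
        raise ValueError("expected a rectangular, rank 2 table of samples")
    return arr


# Three samples (rows) with three features each; `None` marks a missing value.
samples = mask_missing([
    [1.0, 2.0, None],
    [0.5, None, 3.0],
    [None, 1.5, 2.5],
])

# Non-finite entries are the ones that would be treated as masked.
masked = ~np.isfinite(samples)
print(samples.shape, int(masked.sum()))
```

An array built this way should be suitable as the `samples` argument to `Dataset(samples)` in the snippet above, assuming your missing values are encoded as `None` in the raw data.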
Tired of the linear? We have support for PPCA mixture models. Make the most of your data with clustering and dimensionality reduction in a single tool!
Support for adapting DataFrames using either pandas or polars. Never juggle those dfs in your code again.
You will need Rust, which can be installed locally (i.e., without sudo), and you will also need maturin, which can be installed with
pip install maturin
pipenv is also a good idea if you are going to mess around with it locally. At the very least, you need a virtual environment (venv) set up; otherwise, maturin will complain.
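If you are unsure whether a virtual environment is active before building, a quick check from Python can help. The sketch below uses only the standard library; the function name `in_virtualenv` is hypothetical, and this is not necessarily the same check that maturin performs internally:

```python
import os
import sys


def in_virtualenv() -> bool:
    """Return True if this interpreter appears to run inside a virtual environment."""
    # Environments created by venv/virtualenv make sys.prefix differ from sys.base_prefix.
    if sys.prefix != getattr(sys, "base_prefix", sys.prefix):
        return True
    # Activation scripts also export the VIRTUAL_ENV environment variable.
    return bool(os.environ.get("VIRTUAL_ENV"))


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
```

If this reports `False`, create and activate an environment (for example with `python -m venv .venv`) before building.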
Check the Makefile for the available commands (or just type make). To install it locally, run
make install # optional: i=python.version (e.g, `i=3.9`)
To mess around inside a virtual environment (a Pipfile is provided for the pipenv lovers), run
maturin develop # use the flag --release to unlock superspeed!
This will install the package locally, as is, from source.
See the examples in the examples folder. Also, all functions are type hinted and commented. If you are using pylance or mypy, it should be easy to navigate.
You bet!