sklears-feature-extraction

Crates.io	sklears-feature-extraction
lib.rs	sklears-feature-extraction
version	0.1.0-beta.1
created_at	2025-10-13 14:43:15.263709+00
updated_at	2026-01-01 21:36:06.420534+00
description	Feature extraction from raw data (text, images)
homepage	https://github.com/cool-japan/sklears
repository	https://github.com/cool-japan/sklears
max_upload_size
id	1880607
size	1,949,072

KitaSan (cool-japan)

Latest release: 0.1.0-beta.1 (January 1, 2026). See the workspace release notes for highlights and upgrade guidance.

Overview

sklears-feature-extraction contains text, signal, and image feature transformers designed to mirror scikit-learn’s feature extraction API with Rust-first performance.

Key Features

Text Processing: CountVectorizer, TfidfVectorizer, HashingVectorizer, N-gram analyzers, character models.
Image Features: Patch extraction, HOG descriptors, SIFT-like outlines, and GPU pipelines.
Signal Features: Windowed statistics, spectrograms, wavelet transforms, and FFT-based descriptors.
Pipeline Support: Integrates with sklears preprocessing, selection, and model selection crates.
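To make the text-processing entries above more concrete, here is a minimal standard-library sketch of word n-gram analysis feeding a hashing vectorizer. It does not use this crate's API: `word_ngrams` and `hashed_counts` are hypothetical helper names, and `DefaultHasher` stands in for whatever hash function the crate actually uses (scikit-learn's HashingVectorizer uses a signed MurmurHash3).

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Lowercases and splits on whitespace, then emits word n-grams for every
/// length in `min_n..=max_n` (hypothetical helper, not the crate's analyzer).
fn word_ngrams(text: &str, min_n: usize, max_n: usize) -> Vec<String> {
    let tokens: Vec<String> = text
        .split_whitespace()
        .map(|t| t.to_lowercase())
        .collect();
    let mut grams = Vec::new();
    for n in min_n..=max_n {
        // `windows(0)` would panic, and lengths beyond the token count yield nothing.
        if n == 0 || n > tokens.len() {
            continue;
        }
        for window in tokens.windows(n) {
            grams.push(window.join(" "));
        }
    }
    grams
}

/// Hashing trick: each n-gram is hashed into one of `n_features` buckets and
/// counted there, so no vocabulary has to be stored.
fn hashed_counts(text: &str, min_n: usize, max_n: usize, n_features: usize) -> Vec<f64> {
    assert!(n_features > 0, "n_features must be positive");
    let mut row = vec![0.0; n_features];
    for gram in word_ngrams(text, min_n, max_n) {
        let mut hasher = DefaultHasher::new();
        gram.hash(&mut hasher);
        let bucket = (hasher.finish() % n_features as u64) as usize;
        row[bucket] += 1.0;
    }
    row
}
```

Calling `hashed_counts("Rust brings fearless concurrency", 1, 2, 16)` returns a 16-element row whose entries sum to 7 (four unigrams plus three bigrams). Which buckets are hit depends on the hasher, and distinct n-grams may collide in the same bucket; that is the trade-off the hashing trick accepts in exchange for a fixed output width.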

Quick Start

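use sklears_feature_extraction::text::TfidfVectorizer;

// `?` needs a function that returns Result; this assumes the crate's error
// type implements std::error::Error.
fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Two-document toy corpus.
    let docs = vec![
        "Rust brings fearless concurrency",
        "Machine learning in Rust is fast",
    ];

    // Unigrams and bigrams, terms must appear in at least one document,
    // and the vocabulary is capped at 4096 features.
    let vectorizer = TfidfVectorizer::builder()
        .ngram_range((1, 2))
        .min_df(1)
        .max_features(Some(4096))
        .build();

    // Learn the vocabulary and IDF weights, then vectorize the corpus.
    let tfidf = vectorizer.fit_transform(&docs)?;
    Ok(())
}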
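For readers who want to see what a TF-IDF transform computes, the following standard-library sketch reproduces the weighting on unigrams only. It follows scikit-learn's documented defaults (smoothed IDF, `ln((1 + n) / (1 + df)) + 1`, followed by L2 row normalization); whether this crate matches those defaults exactly is an assumption. The function name `tfidf` and its `min_df` parameter are illustrative and are not the crate's API.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Returns the sorted vocabulary and one L2-normalized TF-IDF row per document.
/// Terms that appear in fewer than `min_df` documents are dropped.
fn tfidf(docs: &[&str], min_df: usize) -> (Vec<String>, Vec<Vec<f64>>) {
    // Lowercase whitespace tokenization, one token list per document.
    let tokenized: Vec<Vec<String>> = docs
        .iter()
        .map(|d| d.split_whitespace().map(|t| t.to_lowercase()).collect())
        .collect();

    // Document frequency: in how many documents each term occurs.
    let mut df: BTreeMap<String, usize> = BTreeMap::new();
    for tokens in &tokenized {
        let unique: BTreeSet<&String> = tokens.iter().collect();
        for term in unique {
            *df.entry(term.clone()).or_insert(0) += 1;
        }
    }

    // Vocabulary in sorted order, filtered by `min_df`.
    let vocab: Vec<String> = df
        .iter()
        .filter(|(_, count)| **count >= min_df)
        .map(|(term, _)| term.clone())
        .collect();

    let n = docs.len() as f64;
    let rows = tokenized
        .iter()
        .map(|tokens| {
            let mut row: Vec<f64> = vocab
                .iter()
                .map(|term| {
                    let tf = tokens.iter().filter(|t| *t == term).count() as f64;
                    // Smoothed inverse document frequency.
                    let idf = ((1.0 + n) / (1.0 + df[term] as f64)).ln() + 1.0;
                    tf * idf
                })
                .collect();
            // Scale the row to unit Euclidean length, leaving all-zero rows alone.
            let norm = row.iter().map(|v| v * v).sum::<f64>().sqrt();
            if norm > 0.0 {
                for v in &mut row {
                    *v /= norm;
                }
            }
            row
        })
        .collect();

    (vocab, rows)
}
```

On the two Quick Start documents with `min_df = 1`, the vocabulary has nine terms, and "rust" receives a lower weight than "brings" in the first row because it occurs in both documents. Raising `min_df` to 2 leaves "rust" as the only surviving term.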

Status

Extensively tested via the 11,292 passing workspace suites shipped in 0.1.0-beta.1.
Offers >99% parity with scikit-learn’s feature extraction module, plus GPU paths.
Remaining work (streaming text ingestion, audio-specific transforms) is tracked in TODO.md.
