sklears-feature-extraction

version: 0.1.0-beta.1
created_at: 2025-10-13 14:43:15.263709+00
updated_at: 2026-01-01 21:36:06.420534+00
description: Feature extraction from raw data (text, images)
homepage: https://github.com/cool-japan/sklears
repository: https://github.com/cool-japan/sklears
max_upload_size:
id: 1880607
size: 1,949,072
KitaSan (cool-japan)

Latest release: 0.1.0-beta.1 (January 1, 2026). See the workspace release notes for highlights and upgrade guidance.

Overview

sklears-feature-extraction contains text, signal, and image feature transformers designed to mirror scikit-learn’s feature extraction API with Rust-first performance.

Key Features

  • Text Processing: CountVectorizer, TfidfVectorizer, HashingVectorizer, N-gram analyzers, character models.
  • Image Features: Patch extraction, HOG descriptors, SIFT-like outlines, and GPU pipelines.
  • Signal Features: Windowed statistics, spectrograms, wavelet transforms, and FFT-based descriptors.
  • Pipeline Support: Integrates with sklears preprocessing, selection, and model selection crates.
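The text-processing items above (N-gram analyzers feeding a HashingVectorizer-style transformer) can be illustrated with a small standard-library sketch. This is not the crate's implementation or API: the function names `word_ngrams` and `hashed_counts`, the whitespace tokenizer, and the use of `DefaultHasher` are all assumptions made for illustration.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Lowercase whitespace tokens, joined into word n-grams for every n in
/// `min_n..=max_n`. (Hypothetical helper, not part of sklears.)
fn word_ngrams(text: &str, min_n: usize, max_n: usize) -> Vec<String> {
    let tokens: Vec<String> = text.split_whitespace().map(|t| t.to_lowercase()).collect();
    let mut grams = Vec::new();
    for n in min_n..=max_n {
        // `windows(0)` panics, and n larger than the token count yields nothing.
        if n == 0 || n > tokens.len() {
            continue;
        }
        for window in tokens.windows(n) {
            grams.push(window.join(" "));
        }
    }
    grams
}

/// Hashing trick: each n-gram is hashed into one of `n_buckets` columns and
/// counted there, so no vocabulary has to be stored. Collisions are possible.
fn hashed_counts(text: &str, n_buckets: usize, min_n: usize, max_n: usize) -> Vec<u32> {
    assert!(n_buckets > 0, "need at least one bucket");
    let mut counts = vec![0u32; n_buckets];
    for gram in word_ngrams(text, min_n, max_n) {
        // DefaultHasher::new() uses fixed keys, so bucket assignment is
        // repeatable within one build, but not guaranteed across Rust versions.
        let mut hasher = DefaultHasher::new();
        gram.hash(&mut hasher);
        let bucket = (hasher.finish() % n_buckets as u64) as usize;
        counts[bucket] += 1;
    }
    counts
}

fn main() {
    let text = "Rust brings fearless concurrency";
    println!("{:?}", word_ngrams(text, 1, 2));
    println!("{:?}", hashed_counts(text, 16, 1, 2));
}
```

The trade-off this shows is the usual one for hashing vectorizers: fixed memory and no fitting step, at the cost of collisions and no way to map a column back to its term.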

Quick Start

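use sklears_feature_extraction::text::TfidfVectorizer;

// `?` needs an enclosing function that returns `Result`. This wrapper assumes
// the crate's error type implements `std::error::Error`.
fn main() -> Result<(), Box<dyn std::error::Error>> {
    let docs = vec![
        "Rust brings fearless concurrency",
        "Machine learning in Rust is fast",
    ];

    // Unigrams and bigrams, terms kept from a document frequency of 1,
    // vocabulary capped at 4096 features.
    let vectorizer = TfidfVectorizer::builder()
        .ngram_range((1, 2))
        .min_df(1)
        .max_features(Some(4096))
        .build();

    // Fit on the documents and transform them in one step.
    let tfidf = vectorizer.fit_transform(&docs)?;

    Ok(())
}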
Status

  • Extensively tested via the 11,292 passing workspace suites shipped in 0.1.0-beta.1.
  • Offers >99% parity with scikit-learn’s feature extraction module, plus GPU paths.
  • Additional work (streaming text ingestion, audio-specific transforms) is documented in TODO.md.