stylometry-analyzer

Crates.io: stylometry-analyzer
lib.rs: stylometry-analyzer
version: 0.1.1
created_at: 2026-01-03 21:30:09.497627+00
updated_at: 2026-01-04 02:15:46.963212+00
description: Minimal CLI tool that combines one or more `.txt` files, extracts user-authored text, and enforces a minimum size. Hash-embeds text chunks and queries a local vector DB to classify writing style. Not semantic/tone based search for now. Prints results to stdout and writes `results.json`.
homepage: https://github.com/thegreatbey/stylometry
repository: https://github.com/thegreatbey/stylometry
id: 2020828
size: 83,100
(thegreatbey)

Stylometry Analyzer

Minimal CLI tool that:

  • Combines one or more .txt files, extracts user-authored text, and enforces a minimum size.
  • Hash-embeds text chunks and queries a local vector DB to classify writing style.
  • Prints results to stdout and writes results.json.
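
As a rough illustration of the hash-embed and classify steps listed above, here is a minimal sketch using only the Rust standard library. It is not taken from the crate's source: the dimension `DIM`, the whitespace tokenization, and the names `chunk_text`, `hash_embed`, and `nearest_label` are all assumptions made for this example, and the in-memory slice of references merely stands in for a query against the vector DB.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical embedding size; the README does not state the real one.
const DIM: usize = 64;

/// Split text into chunks of at most `chunk_chars` characters.
fn chunk_text(text: &str, chunk_chars: usize) -> Vec<String> {
    let chars: Vec<char> = text.chars().collect();
    chars
        .chunks(chunk_chars.max(1)) // guard against a zero chunk size
        .map(|c| c.iter().collect::<String>())
        .collect()
}

/// Feature hashing: count lowercase whitespace-separated tokens into
/// DIM buckets, then L2-normalize. An empty chunk yields a zero vector.
fn hash_embed(chunk: &str) -> Vec<f32> {
    let mut v = vec![0.0f32; DIM];
    for token in chunk.split_whitespace() {
        let mut h = DefaultHasher::new();
        token.to_lowercase().hash(&mut h);
        let bucket = (h.finish() % DIM as u64) as usize;
        v[bucket] += 1.0;
    }
    let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm > 0.0 {
        for x in v.iter_mut() {
            *x /= norm;
        }
    }
    v
}

/// Dot product; equals cosine similarity when both inputs are unit length.
fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// In-memory stand-in for the vector DB query: return the label of the
/// reference vector most similar to `query`, or None if there are none.
fn nearest_label<'a>(query: &[f32], refs: &'a [(String, Vec<f32>)]) -> Option<&'a str> {
    refs.iter()
        .max_by(|a, b| dot(query, &a.1).total_cmp(&dot(query, &b.1)))
        .map(|(label, _)| label.as_str())
}
```

Under these assumptions, each chunk from `chunk_text` would be passed through `hash_embed`, and the resulting vector compared against labeled reference vectors. Because the buckets come from token hashes rather than a learned model, similarity reflects word usage, not meaning, which matches the README's remark that this is not semantic or tone-based search.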

Prerequisites

  • Rust toolchain (stable)
  • yvdb running locally at http://127.0.0.1:8080
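
Before running the tool, it can help to confirm that something is listening at the yvdb address given above. The sketch below is one way to do that with the standard library; `yvdb_reachable` is a hypothetical helper written for this example, and it only checks that a TCP connection is accepted, not that the listener is actually yvdb.

```rust
use std::net::{SocketAddr, TcpStream};
use std::time::Duration;

/// Host and port from the prerequisites above.
const YVDB_ADDR: &str = "127.0.0.1:8080";

/// Return true if a TCP connection to `addr` succeeds within `timeout`.
/// An unparsable address counts as unreachable.
fn yvdb_reachable(addr: &str, timeout: Duration) -> bool {
    match addr.parse::<SocketAddr>() {
        Ok(sock) => TcpStream::connect_timeout(&sock, timeout).is_ok(),
        Err(_) => false,
    }
}
```

A caller might run `yvdb_reachable(YVDB_ADDR, Duration::from_millis(500))` and print a hint to start yvdb when it returns false.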

Usage

cargo run -- --file path\to\file1.txt path\to\file2.txt

Spaces in paths are fine if quoted.

Notes

  • The analyzer requires at least 70 KB of text. Ideally I'd like to raise this to 750 KB.
  • Multiple text files can be used. main.rs applies minimal delimiters to slice out the text so that only the user's own voice remains, which mitigates noise.
  • On first run, the tool seeds reference embeddings into yvdb; later runs skip seeding.
  • Progress prints show the current step and pre-seed status.
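
The combine-and-check step described in these notes could look roughly like the sketch below. It is an illustration, not the code in main.rs: the function names are hypothetical, 70 KB is assumed to mean 70 * 1024 bytes, and the delimiter-based extraction of user-authored text is omitted because the README does not specify the delimiters.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Minimum combined input size from the notes above
/// (assumed here to be 70 * 1024 bytes).
const MIN_BYTES: usize = 70 * 1024;

/// Read every file in order and join the contents, adding a newline
/// after each file so texts from different files do not run together.
fn combine_files<P: AsRef<Path>>(paths: &[P]) -> io::Result<String> {
    let mut combined = String::new();
    for p in paths {
        combined.push_str(&fs::read_to_string(p)?);
        combined.push('\n');
    }
    Ok(combined)
}

/// Reject input whose UTF-8 byte length is below `min_bytes`.
fn enforce_min_size(text: &str, min_bytes: usize) -> Result<(), String> {
    if text.len() < min_bytes {
        Err(format!(
            "need at least {} bytes of text, got {}",
            min_bytes,
            text.len()
        ))
    } else {
        Ok(())
    }
}
```

Checking the size after combining, rather than per file, lets several small files together satisfy the minimum, which fits the note that multiple text files can be used.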