| Crates.io | tktax-transaction-category |
| lib.rs | tktax-transaction-category |
| version | 0.2.2 |
| created_at | 2025-02-01 01:43:37.863272+00 |
| updated_at | 2025-02-01 01:43:37.863272+00 |
| description | A Rust library for categorizing financial transactions using Porter stemming, CSV-driven classification, and advanced trait-based extensibility. |
| homepage | |
| repository | https://github.com/klebs6/tktax |
| max_upload_size | |
| id | 1537994 |
| size | 109,162 |
tktax-transaction-category provides robust categorization (Lat. Categorĭa, Gr. Κατηγορία) for financial transactions. It leverages a Porter stemmer to reduce vendor descriptions to root forms, then maps tokens onto enumerated categories. This approach supports composable classification, large reference tables (CSV-based), and advanced trait constraints for production usage.
Porter-Stem-Based Tokenization
Stemmer from rust-stemmers is invoked to canonicalize words.CategoryMap
HashMap<StemmedToken, HashSet<TxCat>> mapping tokens to one or more categories.Predictive Scoring
predict_category(desc, &CategoryMap) computes a distribution of categories.1 / N if it maps to N categories.Trait-Based Extension
TransactionCategory trait constraints unify advanced classification methods.Flexible CSV Ingestion
GetCategoryGoldenCsv::category_golden_csv() supplies a reference dataset.// Use your own category enum implementing `TransactionCategory`
use tktax_transaction_category::{
CategoryMap,
predict_category,
TransactionCategory,
// ... other necessary imports
};
// Construct the map from a "golden" CSV reference
let cat_map = CategoryMap::<MockTransactionCategory>::new();
// Some unknown transaction description
let transaction_description = "AMZN MKTPLACE MED - FIRST AID KITS and extras";
// Predict the categories (in descending score order)
let predictions = predict_category(transaction_description, &cat_map);
// Evaluate the top prediction or inspect the full distribution
if let Some(best) = predictions.first() {
println!("Likely category: {:?} with score {:?}", best.category(), best.score());
}
cat1;cat2,desc) unify them for each recognized token in desc.medical_and_insurance_categories).Extensive unit tests can be found under #[cfg(test)] within each module:
To run tests:
cargo test --package tktax-transaction-category
This project is licensed under the MIT License.
git checkout -b my-new-feature).git commit -am 'Add new feature').git push origin my-new-feature).Happy categorizing!