| Crates.io | tiniestsegmenter |
| lib.rs | tiniestsegmenter |
| version | 0.3.0 |
| created_at | 2024-05-11 16:08:44.078953+00 |
| updated_at | 2024-09-24 13:27:02.203242+00 |
| description | Compact Japanese segmenter |
| homepage | https://github.com/jwnz/tiniestsegmenter |
| repository | https://github.com/jwnz/tiniestsegmenter |
| max_upload_size | |
| id | 1236896 |
| size | 60,236 |
A port of TinySegmenter written in pure, safe Rust with no dependencies. Bindings are available for both Rust and Python.
TinySegmenter is an n-gram word tokenizer for Japanese text originally built by Taku Kudo (2008).
## Usage
Add the crate to your project: `cargo add tiniestsegmenter`.
```rust
use tiniestsegmenter as ts;

fn main() {
    // Segment a Japanese sentence into word-level tokens.
    let tokens: Vec<&str> = ts::tokenize("ジャガイモが好きです。");
    println!("{:?}", tokens);
}
```
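For the sample sentence above, the printed tokens should look something like `["ジャガイモ", "が", "好き", "です", "。"]`; the exact boundaries depend on TinySegmenter's learned n-gram scores. Note that the returned tokens borrow from the input string, so the input must outlive the `Vec<&str>`.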