spm_precompiled

Crates.io: spm_precompiled
lib.rs: spm_precompiled
version: 0.1.4
source: src
created_at: 2020-09-15 19:26:09.868555
updated_at: 2022-05-30 12:08:06.151975
description: This crate aims to emulate https://github.com/google/sentencepiece Dart::DoubleArray struct and its Normalizer. This crate is highly specialized and not intended for general use.
homepage: https://github.com/huggingface/spm_precompiled
repository: https://github.com/huggingface/spm_precompiled
max_upload_size
id: 289138
size: 2,211,609
Nicolas Patry (Narsil)

documentation

https://docs.rs/spm_precompiled/

README

spm_precompiled

This crate aims to emulate https://github.com/google/sentencepiece Dart::DoubleArray struct and its Normalizer. Its main intent is to be used with tokenizers, a Rust library that provides facilities for tokenizing strings for use with HuggingFace's transformers library.

This crate is highly specialized and not intended for general use.

The core of the algorithm is to read spm's binary precompiled_charsmap.
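
As a rough illustration of what reading that blob involves, here is a minimal sketch that uses only the standard library. It assumes the layout sentencepiece is understood to use: a little-endian u32 giving the trie size in bytes, then the double-array trie units as little-endian u32 values, then the remaining bytes holding the normalized replacement strings. The function name `split_charsmap` is hypothetical and is not part of this crate's API.

```rust
/// Hypothetical helper for illustration only; not the spm_precompiled API.
///
/// Assumed layout of `precompiled_charsmap`:
///   [trie size in bytes: u32 LE][trie units: u32 LE ...][normalized string bytes]
///
/// Returns the trie units and the remaining normalized bytes,
/// or `None` if the blob is too short or the trie size is not a multiple of 4.
fn split_charsmap(blob: &[u8]) -> Option<(Vec<u32>, &[u8])> {
    // Read the 4-byte header that stores the trie size in bytes.
    let header = blob.get(..4)?;
    let trie_size =
        u32::from_le_bytes([header[0], header[1], header[2], header[3]]) as usize;

    // Each trie unit is one u32, so the size must be a multiple of 4.
    if trie_size % 4 != 0 {
        return None;
    }

    // Slice out the trie bytes, rejecting truncated input.
    let end = 4usize.checked_add(trie_size)?;
    let trie_bytes = blob.get(4..end)?;

    // Decode the trie units as little-endian u32 values.
    let trie: Vec<u32> = trie_bytes
        .chunks_exact(4)
        .map(|c| u32::from_le_bytes([c[0], c[1], c[2], c[3]]))
        .collect();

    // Everything after the trie is treated as the normalized string data.
    let normalized = &blob[end..];
    Some((trie, normalized))
}
```

The sketch only separates the two regions of the blob. Walking the trie and looking up replacements in the normalized bytes is the job of the crate itself.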

Commit count: 12
