tokengeex

Crates.io: tokengeex
lib.rs: tokengeex
version: 1.1.0
source: src
created_at: 2024-02-18 01:49:41.790073
updated_at: 2024-06-03 03:08:24.274661
description: TokenGeeX is an efficient tokenizer for code based on UnigramLM and TokenMonster.
homepage: https://codegeex.cn
repository: https://github.com/rojas-diego/tokengeex/
max_upload_size:
id: 1143684
size: 208,803
Diego ROJAS (rojas-diego)

documentation

https://docs.rs/tokengeex/

README

TokenGeeX - Efficient Tokenizer for CodeGeeX

This repository holds the code for the TokenGeeX Rust crate and Python package. TokenGeeX is a tokenizer for CodeGeeX, designed for source code and Chinese text. It is based on the UnigramLM algorithm (Taku Kudo, 2018).
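
To illustrate the general idea behind a UnigramLM tokenizer, here is a minimal sketch of Viterbi segmentation under a unigram model: each vocabulary entry has a log-probability, and the tokenizer picks the split of the input whose token log-probabilities sum to the highest score. This is not TokenGeeX's actual API or implementation; the names `segment` and `toy_vocab`, and every vocabulary entry and score, are hypothetical and chosen only for illustration.

```rust
use std::collections::HashMap;

/// Hypothetical toy vocabulary: token -> log-probability (made-up values).
fn toy_vocab() -> HashMap<&'static str, f64> {
    HashMap::from([
        ("fn", -1.0),
        ("f", -3.0),
        ("n", -3.0),
        (" ", -1.5),
        (" main", -2.0),
        ("main", -2.5),
    ])
}

/// Returns the highest-scoring segmentation of `text` under a unigram model,
/// or `None` if `text` cannot be covered by tokens from `vocab`.
fn segment(text: &str, vocab: &HashMap<&str, f64>) -> Option<Vec<String>> {
    let n = text.len();
    // best[i] = (best total log-probability for text[..i], start of the last token)
    let mut best: Vec<Option<(f64, usize)>> = vec![None; n + 1];
    best[0] = Some((0.0, 0));

    for end in 1..=n {
        if !text.is_char_boundary(end) {
            continue;
        }
        for start in 0..end {
            if !text.is_char_boundary(start) {
                continue;
            }
            // Skip prefixes that have no valid segmentation.
            let Some((prev_score, _)) = best[start] else {
                continue;
            };
            if let Some(&logp) = vocab.get(&text[start..end]) {
                let score = prev_score + logp;
                if best[end].map_or(true, |(s, _)| score > s) {
                    best[end] = Some((score, start));
                }
            }
        }
    }

    // Follow the back-pointers from the end of the text to recover the tokens.
    let mut tokens = Vec::new();
    let mut end = n;
    while end > 0 {
        let (_, start) = best[end]?;
        tokens.push(text[start..end].to_string());
        end = start;
    }
    tokens.reverse();
    Some(tokens)
}
```

With this toy vocabulary, `segment("fn main", &toy_vocab())` yields the tokens `"fn"` and `" main"` (total score -3.0), because that split outscores alternatives such as `"fn"`, `" "`, `"main"` (-5.0). A production tokenizer would typically replace the quadratic substring scan with a trie or similar lookup structure.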

Commit count: 175
