| Crates.io | tokengeex |
| lib.rs | tokengeex |
| version | 1.1.0 |
| created_at | 2024-02-18 01:49:41.790073+00 |
| updated_at | 2024-06-03 03:08:24.274661+00 |
| description | TokenGeeX is an efficient tokenizer for code based on UnigramLM and TokenMonster. |
| homepage | https://codegeex.cn |
| repository | https://github.com/rojas-diego/tokengeex/ |
| max_upload_size | |
| id | 1143684 |
| size | 208,803 |
This repository contains the code for the TokenGeeX Rust crate and Python package. TokenGeeX is a tokenizer for CodeGeeX that targets source code and Chinese text. It is based on UnigramLM (Kudo, 2018).
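To give a feel for what a UnigramLM-style tokenizer does at encoding time, here is a minimal sketch of Viterbi segmentation under a unigram model: each vocabulary entry has a log-probability, and the tokenizer picks the split of the input whose token log-probabilities sum highest. This is not TokenGeeX's implementation or API. The function name `viterbi_segment`, the `TOY_VOCAB` table, and its scores are all made up for illustration.

```python
import math


def viterbi_segment(text, log_probs):
    """Return the highest-scoring segmentation of `text` under a unigram model.

    `log_probs` maps each vocabulary token to its log-probability.
    Hypothetical helper for illustration; not part of the tokengeex package.
    """
    n = len(text)
    max_len = max(len(tok) for tok in log_probs)
    best = [-math.inf] * (n + 1)  # best[i]: best score for text[:i]
    back = [0] * (n + 1)          # back[i]: start index of the last token in text[:i]
    best[0] = 0.0

    for end in range(1, n + 1):
        # Only consider candidate tokens no longer than the longest vocabulary entry.
        for start in range(max(0, end - max_len), end):
            piece = text[start:end]
            if piece in log_probs and best[start] + log_probs[piece] > best[end]:
                best[end] = best[start] + log_probs[piece]
                back[end] = start

    if best[n] == -math.inf:
        raise ValueError("text cannot be segmented with this vocabulary")

    # Walk the back-pointers from the end of the text to recover the tokens.
    tokens = []
    end = n
    while end > 0:
        start = back[end]
        tokens.append(text[start:end])
        end = start
    return tokens[::-1]


# Toy vocabulary with invented scores (assumption, not taken from TokenGeeX).
TOY_VOCAB = {
    "def": math.log(0.2),
    " ": math.log(0.2),
    "main": math.log(0.15),
    **{ch: math.log(0.05) for ch in "defmain"},
}

print(viterbi_segment("def main", TOY_VOCAB))
```

With these toy scores the multi-character tokens win over character-by-character splits, because one token with probability 0.2 scores higher than three tokens with probability 0.05 each. A real vocabulary would be learned from a corpus rather than written by hand.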