| Field | Value |
|-------|-------|
| Crates.io | wordpieces |
| lib.rs | wordpieces |
| version | 0.6.1 |
| source | src |
| created_at | 2019-11-29 07:56:17.387456 |
| updated_at | 2022-10-10 14:57:06.314498 |
| description | Split tokens into word pieces |
| homepage | https://github.com/danieldk/wordpieces |
| repository | |
| max_upload_size | |
| id | 185271 |
| size | 27,220 |
This crate provides a subword tokenizer. A subword tokenizer splits a token into several pieces, so-called word pieces. Word pieces were popularized by their use in the BERT natural language encoder.
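
To illustrate what such a splitter does, here is a minimal sketch of the greedy longest-prefix matching that BERT-style WordPiece tokenizers commonly use. It is an illustrative, self-contained example, not this crate's API: `split_token` is a hypothetical helper written for this description, and the `##` continuation marker follows the BERT convention.

```rust
use std::collections::HashSet;

/// Split `token` into word pieces by greedy longest-prefix matching.
/// Continuation pieces (everything after the first piece) are looked up
/// with the conventional "##" prefix. Returns `None` when the token
/// cannot be fully covered by the vocabulary.
fn split_token(vocab: &HashSet<String>, token: &str) -> Option<Vec<String>> {
    let mut pieces = Vec::new();
    let mut start = 0;

    while start < token.len() {
        let mut end = token.len();
        let mut found = None;

        // Try the longest candidate first, shrinking one byte at a time.
        while end > start {
            // Only slice at valid UTF-8 character boundaries.
            if token.is_char_boundary(end) {
                let candidate = if start == 0 {
                    token[start..end].to_string()
                } else {
                    format!("##{}", &token[start..end])
                };
                if vocab.contains(&candidate) {
                    found = Some((candidate, end));
                    break;
                }
            }
            end -= 1;
        }

        match found {
            Some((piece, next)) => {
                pieces.push(piece);
                start = next;
            }
            // No prefix matched: the token is unknown to the vocabulary.
            None => return None,
        }
    }

    Some(pieces)
}

fn main() {
    // A toy vocabulary; a real one would come from a trained model.
    let vocab: HashSet<String> = ["un", "##believ", "##able"]
        .iter()
        .map(|s| s.to_string())
        .collect();

    // "unbelievable" -> ["un", "##believ", "##able"]
    println!("{:?}", split_token(&vocab, "unbelievable"));
}
```

Greedy longest-match is deterministic and fast; returning `None` for uncoverable tokens mirrors the situation where a BERT pipeline would emit an unknown-token marker instead.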