| Crates.io | cesu8-str |
| lib.rs | cesu8-str |
| version | 1.2.1 |
| created_at | 2024-05-09 04:36:27.352029+00 |
| updated_at | 2024-06-05 22:29:40.270683+00 |
| description | CESU-8 and Java CESU-8 string validation and manipulation. |
| homepage | |
| repository | https://github.com/WillemEvanson/cesu8 |
| max_upload_size | |
| id | 1234841 |
| size | 152,021 |
Converts between normal UTF-8 and CESU-8 encodings.
CESU-8 encodes each character outside the Basic Multilingual Plane as a UTF-16 surrogate pair, and then encodes each surrogate as a 3-byte UTF-8 sequence. As a result, every 4-byte UTF-8 sequence becomes a 6-byte CESU-8 sequence.
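
To make the byte layout concrete, here is a minimal sketch of the encoding step that uses only the Rust standard library. The function name `encode_cesu8_char` is hypothetical and is not part of this crate's API; it only illustrates the transformation described above.

```rust
/// Hypothetical helper (not part of this crate): encode one `char` as CESU-8.
fn encode_cesu8_char(c: char) -> Vec<u8> {
    let mut out = Vec::new();
    if (c as u32) < 0x10000 {
        // Characters in the Basic Multilingual Plane are encoded exactly as in UTF-8.
        let mut buf = [0u8; 4];
        out.extend_from_slice(c.encode_utf8(&mut buf).as_bytes());
    } else {
        // Supplementary characters are first split into a UTF-16 surrogate pair...
        let mut units = [0u16; 2];
        for &unit in c.encode_utf16(&mut units).iter() {
            // ...and each surrogate (0xD800..=0xDFFF) is written in 3-byte UTF-8 form.
            out.push(0xE0 | (unit >> 12) as u8);
            out.push(0x80 | ((unit >> 6) & 0x3F) as u8);
            out.push(0x80 | (unit & 0x3F) as u8);
        }
    }
    out
}
```

For example, U+1F600 occupies 4 bytes in UTF-8 (`F0 9F 98 80`), while the sketch above yields the 6 bytes `ED A0 BD ED B8 80`.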
We also support Java's Modified UTF-8 encoding, a variant of CESU-8 that
encodes the null character (`\0`) as the two-byte sequence `0xC0 0x80`.
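
The difference introduced by the Java variant can be sketched in the same way. The function `encode_java_cesu8` below is a hypothetical, standard-library-only illustration and does not reflect this crate's actual API; it assumes the well-known Modified UTF-8 rule that U+0000 is written as `0xC0 0x80`.

```rust
/// Hypothetical helper (not part of this crate): encode a `&str` as
/// Java's Modified UTF-8 (CESU-8 with a two-byte null).
fn encode_java_cesu8(s: &str) -> Vec<u8> {
    let mut out = Vec::with_capacity(s.len());
    for c in s.chars() {
        if c == '\0' {
            // Java variant: the null character becomes an overlong two-byte sequence,
            // so the encoded output never contains a raw 0x00 byte.
            out.extend_from_slice(&[0xC0, 0x80]);
        } else if (c as u32) < 0x10000 {
            // Other BMP characters are identical to UTF-8.
            let mut buf = [0u8; 4];
            out.extend_from_slice(c.encode_utf8(&mut buf).as_bytes());
        } else {
            // Supplementary characters: surrogate pair, each half as 3 bytes.
            let mut units = [0u16; 2];
            for &unit in c.encode_utf16(&mut units).iter() {
                out.push(0xE0 | (unit >> 12) as u8);
                out.push(0x80 | ((unit >> 6) & 0x3F) as u8);
                out.push(0x80 | (unit & 0x3F) as u8);
            }
        }
    }
    out
}
```

With this sketch, the string `"a\0b"` encodes to `61 C0 80 62`, whereas plain CESU-8 (like UTF-8) would keep the null as a single `00` byte.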