Crates.io | cesu8-str |
lib.rs | cesu8-str |
version | 1.2.1 |
source | src |
created_at | 2024-05-09 04:36:27.352029 |
updated_at | 2024-06-05 22:29:40.270683 |
description | CESU-8 and Java CESU-8 string validation and manipulation. |
homepage | |
repository | https://github.com/WillemEvanson/cesu8 |
max_upload_size | |
id | 1234841 |
size | 152,021 |
Converts between normal UTF-8 and CESU-8 encodings.
CESU-8 encodes each character outside the Basic Multilingual Plane as a UTF-16 surrogate pair, and then encodes each of the two surrogates as a 3-byte UTF-8 sequence. As a result, every 4-byte UTF-8 sequence becomes a 6-byte CESU-8 sequence, while characters inside the Basic Multilingual Plane are encoded exactly as in UTF-8.
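To make the surrogate-pair step concrete, here is a minimal hand-written sketch of the encoding using only the Rust standard library. The function name `to_cesu8` is hypothetical and chosen for illustration; it is not this crate's API, which should be preferred in real code.

```rust
/// Encode a `&str` as CESU-8 bytes.
/// Hypothetical helper for illustration only, not part of the cesu8-str API.
fn to_cesu8(s: &str) -> Vec<u8> {
    let mut out = Vec::with_capacity(s.len());
    for ch in s.chars() {
        if (ch as u32) < 0x1_0000 {
            // Characters in the Basic Multilingual Plane are unchanged from UTF-8.
            let mut buf = [0u8; 4];
            out.extend_from_slice(ch.encode_utf8(&mut buf).as_bytes());
        } else {
            // Supplementary characters are first split into a UTF-16 surrogate pair.
            let mut units = [0u16; 2];
            for &unit in ch.encode_utf16(&mut units).iter() {
                // Each surrogate is then written with the 3-byte UTF-8 bit pattern.
                out.push(0xE0 | (unit >> 12) as u8);
                out.push(0x80 | ((unit >> 6) & 0x3F) as u8);
                out.push(0x80 | (unit & 0x3F) as u8);
            }
        }
    }
    out
}
```

For example, `to_cesu8("\u{1F600}")` yields the six bytes `ED A0 BD ED B8 80`, whereas the same character takes four bytes (`F0 9F 98 80`) in UTF-8.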
We also support Java's Modified UTF-8 encoding, a variant of CESU-8 that encodes the null character (`\0`) as a two-byte sequence (`0xC0 0x80`) instead of a single zero byte.
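The difference from plain CESU-8 can be sketched in a few lines of standard-library Rust. The function name `to_java_cesu8` is an assumption made for this example and is not taken from the crate; the only behavioral difference it shows is the special case for U+0000.

```rust
/// Encode a `&str` as Java Modified UTF-8 bytes.
/// Hypothetical helper for illustration only, not part of the cesu8-str API.
fn to_java_cesu8(s: &str) -> Vec<u8> {
    let mut out = Vec::with_capacity(s.len());
    for ch in s.chars() {
        match ch as u32 {
            // The null character becomes the overlong two-byte form C0 80,
            // so the encoded output never contains a zero byte.
            0 => out.extend_from_slice(&[0xC0, 0x80]),
            // All other BMP characters are unchanged from UTF-8.
            0x0001..=0xFFFF => {
                let mut buf = [0u8; 4];
                out.extend_from_slice(ch.encode_utf8(&mut buf).as_bytes());
            }
            // Supplementary characters use the CESU-8 surrogate-pair form.
            _ => {
                let mut units = [0u16; 2];
                for &unit in ch.encode_utf16(&mut units).iter() {
                    out.push(0xE0 | (unit >> 12) as u8);
                    out.push(0x80 | ((unit >> 6) & 0x3F) as u8);
                    out.push(0x80 | (unit & 0x3F) as u8);
                }
            }
        }
    }
    out
}
```

With this sketch, `to_java_cesu8("a\0b")` produces `61 C0 80 62`. Because no zero byte appears in the output, the result can be passed through interfaces that treat a zero byte as a string terminator.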