cesu8-str

version: 1.2.1
source: src
created_at: 2024-05-09 04:36:27.352029
updated_at: 2024-06-05 22:29:40.270683
description: CESU-8 and Java CESU-8 string validation and manipulation.
repository: https://github.com/WillemEvanson/cesu8
id: 1234841
size: 152,021
Willem Evanson (WillemEvanson)


README

CESU-8 Encoder & Decoder

Converts between normal UTF-8 and CESU-8 encodings.

CESU-8 encodes each character outside the Basic Multilingual Plane as a UTF-16 surrogate pair, and then encodes each of the two surrogates as a 3-byte UTF-8-style sequence. As a result, every 4-byte UTF-8 sequence becomes a 6-byte CESU-8 sequence, while characters inside the Basic Multilingual Plane are encoded exactly as in UTF-8.
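
To make the byte layout concrete, here is a minimal hand-rolled sketch of the encoding direction using only the Rust standard library. The function name `to_cesu8` is hypothetical and is not taken from this crate's API; it only illustrates the surrogate-pair step described above.

```rust
/// Encode a UTF-8 string as CESU-8 bytes.
/// Illustrative sketch only: `to_cesu8` is a hypothetical helper,
/// not a function provided by the cesu8-str crate.
fn to_cesu8(s: &str) -> Vec<u8> {
    let mut out = Vec::with_capacity(s.len());
    for ch in s.chars() {
        if (ch as u32) < 0x1_0000 {
            // BMP characters are encoded identically in UTF-8 and CESU-8.
            let mut buf = [0u8; 4];
            out.extend_from_slice(ch.encode_utf8(&mut buf).as_bytes());
        } else {
            // Supplementary characters: split into a UTF-16 surrogate pair,
            // then write each surrogate using the 3-byte UTF-8 bit pattern
            // (1110xxxx 10xxxxxx 10xxxxxx).
            let mut units = [0u16; 2];
            for &unit in ch.encode_utf16(&mut units).iter() {
                out.push(0xE0 | (unit >> 12) as u8);
                out.push(0x80 | ((unit >> 6) & 0x3F) as u8);
                out.push(0x80 | (unit & 0x3F) as u8);
            }
        }
    }
    out
}
```

For example, U+10400 is `F0 90 90 80` in UTF-8 (4 bytes). Its surrogate pair is `D801 DC00`, so the sketch produces `ED A0 81 ED B0 80` (6 bytes).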

We also support Java's Modified UTF-8 encoding, a variant of CESU-8 that encodes the null character (\0) as the two-byte sequence 0xC0 0x80 instead of a single zero byte.
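
The null-character difference can be sketched as a post-processing step over bytes that are already valid CESU-8. The helper name `escape_nulls` is hypothetical and not part of this crate's API. It relies on the fact that a zero byte in UTF-8 or CESU-8 can only be the encoding of U+0000.

```rust
/// Rewrite CESU-8 bytes into Java's Modified UTF-8 form by replacing
/// each zero byte with the two-byte sequence 0xC0 0x80.
/// Illustrative sketch only: `escape_nulls` is a hypothetical helper,
/// not a function provided by the cesu8-str crate.
fn escape_nulls(cesu8: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(cesu8.len());
    for &b in cesu8 {
        if b == 0x00 {
            // U+0000 is the only character whose encoding contains 0x00.
            out.extend_from_slice(&[0xC0, 0x80]);
        } else {
            out.push(b);
        }
    }
    out
}
```

With this encoding the output never contains an embedded zero byte, so it can be passed through APIs that treat 0x00 as a string terminator.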

Commit count: 24
