Crates.io | ef80escape |
lib.rs | ef80escape |
version | 0.1.0 |
source | src |
created_at | 2023-02-04 18:29:56.699289 |
updated_at | 2023-02-04 18:29:56.699289 |
description | A domain-specific-language to select commits in a git repo. Similar to Mercurial's revset. |
repository | https://github.com/quark-zju/ef80escape |
id | 776549 |
size | 15,549 |
Lossless conversion between UTF-8 and bytes in Rust. Optimized for UTF-8 content.
Non-UTF-8 bytes (>= 128) are encoded in a subset of the Unicode Private Use Area, U+EF80..U+EFFF. Unicode characters that conflict with that range are escaped by prefixing U+EF00.
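The scheme described above can be sketched with the standard library alone. This is an illustration, not the crate's actual API: the function names `encode` and `decode` are hypothetical, the mapping of byte `b` to code point `0xEF00 + b` is an assumption that merely fits the stated U+EF80..U+EFFF range, and treating U+EF00 itself as a character that needs escaping is likewise an assumption.

```rust
/// Escape prefix for characters that collide with the reserved range.
const ESCAPE: char = '\u{EF00}';

/// True for characters that would be ambiguous if copied through unchanged.
/// Assumption: the escape character itself is reserved too.
fn is_reserved(c: char) -> bool {
    c == ESCAPE || ('\u{EF80}'..='\u{EFFF}').contains(&c)
}

/// Copy valid text to `out`, prefixing reserved characters with ESCAPE.
fn push_escaped(out: &mut String, valid: &str) {
    for c in valid.chars() {
        if is_reserved(c) {
            out.push(ESCAPE);
        }
        out.push(c);
    }
}

/// Hypothetical encoder: arbitrary bytes in, valid UTF-8 `String` out.
fn encode(mut bytes: &[u8]) -> String {
    let mut out = String::with_capacity(bytes.len());
    loop {
        match std::str::from_utf8(bytes) {
            Ok(valid) => {
                push_escaped(&mut out, valid);
                return out;
            }
            Err(e) => {
                // Everything before `valid_up_to` is guaranteed valid UTF-8.
                let (valid, rest) = bytes.split_at(e.valid_up_to());
                push_escaped(&mut out, std::str::from_utf8(valid).unwrap());
                // The first offending byte is always >= 0x80, so it maps
                // into U+EF80..U+EFFF (assumed offset: 0xEF00 + byte).
                let bad = rest[0];
                out.push(char::from_u32(0xEF00 + bad as u32).unwrap());
                bytes = &rest[1..];
            }
        }
    }
}

/// Hypothetical decoder: inverse of `encode`.
fn decode(text: &str) -> Vec<u8> {
    let mut out = Vec::with_capacity(text.len());
    let mut buf = [0u8; 4];
    let mut chars = text.chars();
    while let Some(c) = chars.next() {
        if c == ESCAPE {
            // The next character is literal; a trailing lone escape
            // (never produced by `encode`) is kept as-is.
            let literal = chars.next().unwrap_or(ESCAPE);
            out.extend_from_slice(literal.encode_utf8(&mut buf).as_bytes());
        } else if ('\u{EF80}'..='\u{EFFF}').contains(&c) {
            // Reserved range: recover the original raw byte.
            out.push((c as u32 - 0xEF00) as u8);
        } else {
            out.extend_from_slice(c.encode_utf8(&mut buf).as_bytes());
        }
    }
    out
}
```

Under these assumptions, `decode(&encode(bytes))` returns the original bytes for any input, and input that is already valid UTF-8 without reserved characters passes through `encode` unchanged.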
This is useful for passing data that is mostly UTF-8 but occasionally invalid through a text-only format such as JSON: once the processed text comes back, the original data can be reconstructed losslessly.
Unlike PEP 383, wtf8, and cesu8, this library produces standard-conformant UTF-8 that is compatible with Rust's str.
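A small standard-library check illustrates why a Private Use Area range fits Rust's str where surrogates do not. PEP 383 maps undecodable bytes to lone surrogates in U+DC80..U+DCFF, which are not Unicode scalar values and so cannot be a Rust `char`; U+EF80 can. The helper name `is_representable_in_str` is hypothetical and only serves this sketch.

```rust
/// True if the code point is a Unicode scalar value, i.e. it can appear
/// as a `char` inside a Rust `str`. (Hypothetical helper for illustration.)
fn is_representable_in_str(code_point: u32) -> bool {
    char::from_u32(code_point).is_some()
}

/// The standard lossy conversion, for contrast: every invalid sequence
/// becomes U+FFFD, so the original bytes cannot be recovered.
fn lossy(bytes: &[u8]) -> String {
    String::from_utf8_lossy(bytes).into_owned()
}
```

Here `is_representable_in_str(0xDC80)` is false and `is_representable_in_str(0xEF80)` is true, while `lossy(&[0xFF])` and `lossy(&[0xFE])` both yield the same replacement character.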
The name ef80escape was chosen for its similarity to Python's surrogateescape: instead of surrogates, it uses a different range starting at U+EF80.
Refer to the documentation for examples.