| Crates.io | packed-char |
| lib.rs | packed-char |
| version | 0.1.2 |
| created_at | 2024-03-19 01:28:54.987797+00 |
| updated_at | 2024-09-05 21:25:51.90401+00 |
| description | Stores a char or a 22-bit integer in 32 bits |
| homepage | |
| repository | https://github.com/tim-harding/packed-char |
| max_upload_size | |
| id | 1178657 |
| size | 12,151 |
Allows either a char or a 22-bit integer to be stored in 32 bits, the same
size as a char.
packed-char takes advantage of the valid ranges for a char to determine what
type of data is stored. These ranges are 0..=0xD7FF and 0xE000..=0x10FFFF (see
the documentation for char). The range 0xD800..=0xDFFF contains surrogate code
points, which are not Unicode scalar values and so are never valid chars.
chars are stored unmodified. To store a u22 without overlapping valid char
ranges, it is first split into two 11-bit chunks. The left chunk is stored in
the leading bits, which chars never overlap with. The right chunk is stored in
the trailing bits, which do overlap the bits used by chars. To make this work,
take note of the bit pattern in the surrogate range:
1101100000000000 // Start
1101111111111111 // End
^^^^^
The leading 5 bits are constant in this range. Referred to here as the surrogate
mask, they serve as a signature for u22 values. They are set along with the
left and right 11-bit chunks:
11111111111 00000 11011 11111111111
left chunk | unused | surrogate mask | right chunk
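The layout above can be sketched in a few lines of Rust. This is an illustration of the bit arithmetic only, not the crate's actual source: the names `SURROGATE_MASK`, `U22_MAX`, and `pack_u22` are hypothetical, and rejecting out-of-range input with `None` is an assumption.

```rust
/// Bits 11..=15 set to the constant surrogate prefix `11011` (hypothetical name).
const SURROGATE_MASK: u32 = 0xD800;

/// Largest value that fits in 22 bits (hypothetical name).
const U22_MAX: u32 = (1 << 22) - 1;

/// Packs a 22-bit integer into the 32-bit layout described above.
/// Returns `None` when `n` does not fit in 22 bits (an assumed policy).
fn pack_u22(n: u32) -> Option<u32> {
    if n > U22_MAX {
        return None;
    }
    let left = n >> 11; // upper 11 bits of the u22
    let right = n & 0x7FF; // lower 11 bits of the u22
    // Left chunk goes to bits 21..=31, the mask to bits 11..=15,
    // the right chunk to bits 0..=10. Bits 16..=20 stay zero.
    Some((left << 21) | SURROGATE_MASK | right)
}
```

Packing the largest u22, 0x3FFFFF, sets every bit shown in the diagram and yields 0xFFE0DFFF, while packing 0 yields 0xD800, the start of the surrogate range.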
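Now we have two cases. If the left chunk is zero, the value lies in the
surrogate range 0xD800..=0xDFFF, which is never a valid char. If the left chunk
is nonzero, the value is at least 1 << 21, which is greater than char::MAX.
Thus, char and u22 values are disambiguated.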
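Decoding can follow the same reasoning. The sketch below is a minimal illustration, not the crate's API: the `Unpacked` enum and the `unpack` function are hypothetical names, and returning `None` for bit patterns that are neither a char nor a packed u22 is an assumption. It relies on the standard `char::from_u32`, which returns `None` for surrogates and for values above `char::MAX`.

```rust
/// What a 32-bit pattern decodes to (hypothetical type for illustration).
#[derive(Debug, PartialEq)]
enum Unpacked {
    Char(char),
    U22(u32),
}

/// Decodes a 32-bit pattern as either a char or a packed 22-bit integer.
fn unpack(bits: u32) -> Option<Unpacked> {
    // Valid chars are stored unmodified, so try that interpretation first.
    if let Some(c) = char::from_u32(bits) {
        return Some(Unpacked::Char(c));
    }
    // Signature check: bits 11..=15 must be `11011` and the unused
    // bits 16..=20 must be zero.
    if bits & 0x001F_F800 != 0xD800 {
        return None;
    }
    let left = bits >> 21; // upper 11 bits of the u22
    let right = bits & 0x7FF; // lower 11 bits of the u22
    Some(Unpacked::U22((left << 11) | right))
}
```

A pattern such as 0xD800 fails the char check because it is a surrogate, passes the signature check, and decodes to the integer 0. A pattern such as 0x110000 is neither a char nor a packed u22 and is rejected.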