Crates.io | mstr |
lib.rs | mstr |
version | 0.1.4 |
source | src |
created_at | 2023-11-23 03:55:29.720666 |
updated_at | 2023-11-24 21:20:16.966296 |
description | MStr is a 2-word, immutable Cow |
homepage | |
repository | https://github.com/Sky9x/mstr |
max_upload_size | |
id | 1045790 |
size | 41,921 |
MStr
is a 2-word, immutable version of Cow<str>
.
Cow<str>
is 4 words large, and not all applications need mutability.
Thus, MStr
was born. It is just like Cow<str>
, but is immutable and half the size.
The name is short for "maybe string", because it may be borrowed or owned (silly, I know).
You can think of it like an enum storing either a &'a str
or a Box<str>
.
However, such an enum would be 3 words (ptr/usize size) large,
because the single bit discriminant has to take up an entire word due to alignment.
But we can go smaller. If we can find somewhere to fold the discriminant bit (indicating if we are borrowed or owned) into the other fields (ptr and len), we can cut the size of this type by another word.
There are 3 potential places to put it:
The top bit of the pointer:
Some architectures reserve the top bit for pointer tagging
(ARM actually reserves the top 8 bits).
However, not all architectures do this, so it would significantly reduce the portability of this library
(and I don't want to write any platform specific code), making this is a no-go.
The bottom bit of the pointer:
This doesn't suffer from the portability issues of high bit tagging,
but requires that the pointer is suitably aligned
(because if the address is a multiple of eg. 16, the bottom 4 bits must always be zero because Mathâ„¢).
Unfortunately, strings in Rust are 1-byte aligned, so this won't work.
The top bit of len:
This is the last viable option, so it better work (spoiler: it does).
Rust limits all allocations to a maximum of isize::MAX
bytes,
which means that the high bit is always zero, so we can use it for tagging. Yay!
So, this crate folds the discriminant bit into the high bit of the length field,
with 1
representing owned and 0
representing borrowed.
However, this is completely transparent to users of this crate,
so you don't need to worry it.
Happy smaller string-ing!
This crate has 1 feature (off by default):
serde
: Implement's Serialize
& Deserialize
for MStr
.
Deserialization always returns an owned MStr
(same behavior as Cow
).
This crate does not require the standard library (it is marked #![no_std]
),
but it does require alloc
(obviously).
Contributions on GitHub are welcome! Feel free to open a PR or an issue for anything!
This project is licensed under the MIT license OR the Apache 2.0 license, at your choice.