created_at2024-04-02 21:18:15.152462
updated_at2024-04-04 00:40:11.96398
descriptionProvides a safe interface for interacting with multi-byte strings in Rust, namely IndexedStr, IndexedString, and IndexedSlice
Sam Johnson



# safe-string [![](]( [![](]( [![Build Status](]( [![MIT License](]( This crate provides replacement types for `String` and `&str` that allow for safe indexing by character to avoid panics and the usual pitfalls of working with multi-byte UTF-8 characters, namely the scenario where the _byte length_ of a string and the _character length_ of that same string are not the same. Specifically, `IndexedString` (replaces `String`) and `IndexedSlice` (replaces `&str`) allow for O(1) slicing and indexing by character, and they will never panic when indexing or slicing. This is accomplished by storing the character offsets of each character in the string, along with the original `String`, and using this information to calculate the byte offsets of each character on the fly. Thus `IndexedString` uses ~2x the memory of a normal `String`, but `IndexedSlice` and other types implementing `IndexedStr` have only one `usize` extra in overhead over that of a regular `&str` slice / fat pointer. In theory this could be reduced down to the same size as a fat pointer using unsafe rust, but this way we get to have completely safe code and the difference is negligible. ## Examples ```rust use safe_string::{IndexedString, IndexedStr, IndexedSlice}; let message = IndexedString::from("Hello, δΈ–η•Œ! πŸ‘‹πŸ˜Š"); assert_eq!(message.as_str(), "Hello, δΈ–η•Œ! πŸ‘‹πŸ˜Š"); assert_eq!(message, "Hello, δΈ–η•Œ! πŸ‘‹πŸ˜Š"); // handy PartialEq impls // Access characters by index assert_eq!(message.char_at(7), Some('δΈ–')); assert_eq!(message.char_at(100), None); // Out of bounds access returns None // Slice the IndexedString let slice = message.slice(7..9); assert_eq!(slice.as_str(), "δΈ–η•Œ"); // Convert slice back to IndexedString let sliced_message = slice.to_indexed_string(); assert_eq!(sliced_message.as_str(), "δΈ–η•Œ"); // Nested slicing let slice = message.slice(0..10); let nested_slice = slice.slice(3..6); assert_eq!(nested_slice.as_str(), "lo,"); // Display byte length and character length assert_eq!(IndexedString::from_str("δΈ–η•Œ").byte_len(), 6); // "δΈ–η•Œ" is 6 bytes in UTF-8 assert_eq!(IndexedString::from_str("δΈ–η•Œ").len(), 2); // "δΈ–η•Œ" has 2 characters // Demonstrate clamped slicing (no panic) let clamped_slice = message.slice(20..30); assert_eq!(clamped_slice.as_str(), ""); // Using `as_str` to interface with standard Rust string handling let slice = message.slice(0..5); let standard_str_slice = slice.as_str(); assert_eq!(standard_str_slice, "Hello"); ```
Commit count: 29

cargo fmt