| Crates.io | dec_from_char |
| lib.rs | dec_from_char |
| version | 0.2.0 |
| created_at | 2025-07-06 15:56:51.157691+00 |
| updated_at | 2025-07-10 11:34:58.328147+00 |
| description | Small library for converting unicode decimal into numbers |
| homepage | |
| repository | https://github.com/vloldik/dec_from_char |
| max_upload_size | |
| id | 1740214 |
| size | 32,051 |
A tiny, zero-cost Rust library to correctly parse any Unicode decimal digit.
Ever needed to parse a number from a string that might contain digits from other scripts, like ९ (Devanagari nine) or ٣ (Arabic-Indic three)? The standard char::to_digit in Rust only recognizes ASCII digits. This crate extends that to all Unicode characters in the "Decimal Number (Nd)" category.
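To make the gap concrete, here is a small check that uses only the standard library (no calls into this crate). It shows that `char::to_digit` returns `None` for non-ASCII decimal digits, while `char::is_numeric` recognizes them but cannot report their value.

```rust
fn main() {
    // ASCII digits are handled as expected.
    assert_eq!('7'.to_digit(10), Some(7));

    // Devanagari nine (U+096F) and Arabic-Indic three (U+0663) are decimal
    // digits in Unicode, but `to_digit` does not recognize them.
    assert_eq!('\u{096F}'.to_digit(10), None);
    assert_eq!('\u{0663}'.to_digit(10), None);

    // `is_numeric` does recognize them, but it only answers yes or no;
    // it does not return the digit's value.
    assert!('\u{096F}'.is_numeric());
    assert!('\u{0663}'.is_numeric());
}
```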
The lookup is generated as a match statement. This means converting a character at runtime is a zero-cost abstraction with no overhead.
The API is a single extension trait, DecimalExtended, for the char type. If you know how to use Rust, you already know how to use this.
Add dec_from_char to your Cargo.toml:
[dependencies]
dec_from_char = "0.2.0" # Replace with the latest version
Use the DecimalExtended trait to convert characters.
use dec_from_char::DecimalExtended;

fn main() {
    // Works for common ASCII digits
    assert_eq!('7'.to_decimal_utf8(), Some(7));
    // And for a wide range of other Unicode digits!
    assert_eq!('९'.to_decimal_utf8(), Some(9)); // Devanagari
    assert_eq!('०'.to_decimal_utf8(), Some(0)); // Devanagari
    assert_eq!('７'.to_decimal_utf8(), Some(7)); // Fullwidth
    assert_eq!('٣'.to_decimal_utf8(), Some(3)); // Arabic-Indic
    // It gracefully returns None for non-digit characters
    assert_eq!('a'.to_decimal_utf8(), None);
    assert_eq!('🎉'.to_decimal_utf8(), None);
    // Normalization to the matching ASCII digit
    assert_eq!('٣'.normalize_decimal(), Some('3'));
    assert_eq!('7'.normalize_decimal(), Some('7'));
    assert_eq!('🎉'.normalize_decimal(), None);
}
This crate makes it trivial to extract numbers from text, no matter how they are formatted.
use dec_from_char::DecimalExtended;
let messy_string = "Phone number: (0)𝟗𝟖-𝟳𝟲𝟱 and pin: ٣-١-٤-١";
let digits: String = messy_string
    .chars()
    .filter_map(|c| c.normalize_decimal()) // Keep only decimal digits, normalized to ASCII
    .collect();
assert_eq!(digits, "0987653141");
// You can do the same with `normalize_decimals_filtering` (import it from the crate first)
assert_eq!(normalize_decimals_filtering(messy_string), "0987653141");
// Or you can normalize the digits while keeping all other characters
assert_eq!(normalize_decimals(messy_string), "Phone number: (0)98-765 and pin: 3-1-4-1");
println!("Extracted digits: {}", digits); // "0987653141"
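Once the digits are normalized to ASCII, the standard library's integer parsing applies directly. The sketch below starts from the already-normalized string from the example above and uses only `str::parse`, so it does not depend on this crate.

```rust
fn main() {
    // The normalized output from the example above: plain ASCII digits.
    let digits = "0987653141";
    let value: u64 = digits.parse().expect("string contains only ASCII digits");
    assert_eq!(value, 987_653_141); // the leading zero is dropped

    // Without normalization, the standard parser rejects non-ASCII digits.
    assert!("\u{0663}\u{0661}\u{0664}\u{0661}".parse::<u32>().is_err());
}
```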
This crate contains two main parts: the library that provides the DecimalExtended trait, and a macro that reads the UnicodeData.txt file at compile time.
When you compile your project, the macro scans the Unicode data file for every character that is a decimal digit (category Nd). It then generates a large but efficient match statement that maps each of these characters to its u8 value (0-9).
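As an illustration of the scanning step, the following sketch parses single records in the `UnicodeData.txt` format and keeps only decimal digits. The function name `parse_decimal_record` is hypothetical, and this is not the crate's actual macro code, only an assumption about what such a scan could look like. Records are semicolon-separated: field 0 is the code point in hex, field 2 is the general category, and field 6 is the decimal digit value.

```rust
/// Hypothetical helper: returns `(char, value)` when the record describes
/// a decimal digit (general category `Nd`), and `None` otherwise.
fn parse_decimal_record(line: &str) -> Option<(char, u8)> {
    let fields: Vec<&str> = line.split(';').collect();
    // Field 2 holds the general category; skip everything that is not `Nd`.
    if *fields.get(2)? != "Nd" {
        return None;
    }
    // Field 0 is the code point in hexadecimal.
    let code = u32::from_str_radix(fields.first()?, 16).ok()?;
    // Field 6 is the decimal digit value (empty for non-digits).
    let value = fields.get(6)?.parse::<u8>().ok()?;
    Some((char::from_u32(code)?, value))
}

fn main() {
    // Sample records written in the UnicodeData.txt layout.
    let digit = "0663;ARABIC-INDIC DIGIT THREE;Nd;0;AN;;3;3;3;N;;;;;";
    let letter = "0061;LATIN SMALL LETTER A;Ll;0;L;;;;;N;;;;0041;;0041";

    assert_eq!(parse_decimal_record(digit), Some(('\u{0663}', 3)));
    assert_eq!(parse_decimal_record(letter), None);
}
```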
This generated code is then compiled directly into your binary. As a result, calling .to_decimal_utf8() at runtime is a single compiled match, with no searching, parsing, or hash maps involved.
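To give a feel for the generated lookup, here is a hand-written stand-in that covers only four digit blocks. The name `to_decimal_sketch` is hypothetical, and the real generated code covers every Nd character and may be shaped differently (for example, one arm per character rather than per range).

```rust
/// Hand-written stand-in for the generated lookup (four scripts only).
fn to_decimal_sketch(c: char) -> Option<u8> {
    // Each arm is a contiguous block of ten digits, so the digit's value
    // is its offset from the block's zero character.
    let zero = match c {
        '0'..='9' => '0',                      // ASCII
        '\u{0660}'..='\u{0669}' => '\u{0660}', // Arabic-Indic
        '\u{0966}'..='\u{096F}' => '\u{0966}', // Devanagari
        '\u{FF10}'..='\u{FF19}' => '\u{FF10}', // Fullwidth
        _ => return None,
    };
    Some((c as u32 - zero as u32) as u8)
}

fn main() {
    assert_eq!(to_decimal_sketch('7'), Some(7));
    assert_eq!(to_decimal_sketch('\u{096F}'), Some(9)); // Devanagari nine
    assert_eq!(to_decimal_sketch('\u{0663}'), Some(3)); // Arabic-Indic three
    assert_eq!(to_decimal_sketch('\u{FF17}'), Some(7)); // Fullwidth seven
    assert_eq!(to_decimal_sketch('a'), None);
}
```

Because the match is resolved by the compiler, no table has to be built or searched when the program runs.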
The crate exposes a single trait:
pub trait DecimalExtended
fn to_decimal_utf8(&self) -> Option<u8>: Converts any Unicode decimal digit in the Nd category to a u8. Returns None if the character is not a decimal digit.
fn is_decimal_utf8(&self) -> bool: A convenience method that returns true if the character is a decimal digit.
This project is licensed under either of
at your option.
Contributions, issues, and feature requests are welcome! Feel free to check the issues page.