icu_uniset

Crates.io: icu_uniset
lib.rs: icu_uniset
version: 0.5.0
source: src
created_at: 2020-10-15 15:26:27.681454
updated_at: 2022-05-10 18:43:42.384942
description: API for highly efficient querying of sets of Unicode characters
repository: https://github.com/unicode-org/icu4x
id: 300035
size: 126,406
icu4x-release (github:unicode-org:icu4x-release)

README

icu_uniset

icu_uniset is a utility crate of the ICU4X project.

This API provides the functionality needed for highly efficient querying of sets of Unicode characters.

It is an implementation of the existing ICU4C UnicodeSet API.

Architecture

ICU4X [UnicodeSet] is split into independent levels: [UnicodeSet] provides the membership/query API, and [UnicodeSetBuilder] provides the builder API. A Properties API is planned as future work.

Examples:

Creating a UnicodeSet

A UnicodeSet can be created in one of three ways: from a serialized UnicodeSet (represented as an inversion list), with the [UnicodeSetBuilder], or from the Properties API, which is still to be announced.

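use icu_uniset::{UnicodeSet, UnicodeSetBuilder};

// Collect ranges in the builder, then build the set from them.
let mut builder = UnicodeSetBuilder::new();
// `'A'..'Z'` is a half-open Rust range: it ends before 'Z'.
builder.add_range(&('A'..'Z'));
let set: UnicodeSet = builder.build();

assert!(set.contains('A'));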
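
The inversion-list representation mentioned above can be illustrated with a small standalone sketch. It uses only the standard library; the `InversionList` type and its `bounds` field are hypothetical names chosen for illustration and are not the crate's actual internals. The idea is that a sorted list of boundaries alternates between "range starts here" and "range ends here (exclusive)", so membership reduces to a binary search plus a parity check.

```rust
/// Hypothetical illustration type; NOT the crate's `UnicodeSet`.
struct InversionList {
    /// Sorted code points. Even indices start an included range,
    /// odd indices mark the exclusive end of that range.
    bounds: Vec<u32>,
}

impl InversionList {
    /// Returns true if `c` lies inside one of the encoded ranges.
    fn contains(&self, c: char) -> bool {
        let cp = c as u32;
        // Binary search: count the boundaries that are <= cp.
        // An odd count means cp sits after a start and before its end.
        let crossed = self.bounds.partition_point(|&b| b <= cp);
        crossed % 2 == 1
    }
}

fn main() {
    // Encodes 'A' through 'Y' and 'a' through 'c':
    // [start 'A', end before 'Z', start 'a', end before 'd'].
    let set = InversionList {
        bounds: vec![0x41, 0x5A, 0x61, 0x64],
    };

    assert!(set.contains('A'));
    assert!(set.contains('Y'));
    assert!(!set.contains('Z'));
    assert!(set.contains('b'));
    assert!(!set.contains('d'));
}
```

Because only range boundaries are stored, a set covering thousands of contiguous characters needs just two entries, and a lookup costs a logarithmic number of comparisons in the number of boundaries.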

Querying a UnicodeSet

Currently, you can check whether a character or a range of characters is in the [UnicodeSet], or iterate over its characters.

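use icu_uniset::{UnicodeSet, UnicodeSetBuilder};

let mut builder = UnicodeSetBuilder::new();
builder.add_range(&('A'..'Z'));
let set: UnicodeSet = builder.build();

// Single-character membership.
assert!(set.contains('A'));
// Range membership: 'A' through 'C' inclusive must all be in the set.
assert!(set.contains_range(&('A'..='C')));
// The first character yielded by the iterator is 'A'.
assert_eq!(set.iter_chars().next(), Some('A'));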
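
The examples above mix two kinds of Rust range: the builder receives `'A'..'Z'`, while the query uses `'A'..='C'`. The following standalone sketch shows how the standard library treats the two forms. It does not call icu_uniset at all, and `in_range` is a hypothetical helper written for illustration. Whether the builder follows exactly these bounds is an assumption here, not something the text above states.

```rust
use std::ops::RangeBounds;

/// Hypothetical helper: reports whether `c` falls within `range`,
/// using the standard `RangeBounds::contains` method.
fn in_range(range: &impl RangeBounds<char>, c: char) -> bool {
    range.contains(&c)
}

fn main() {
    let half_open = 'A'..'Z'; // excludes the end bound 'Z'
    let inclusive = 'A'..='Z'; // includes the end bound 'Z'

    assert!(in_range(&half_open, 'Y'));
    assert!(!in_range(&half_open, 'Z'));
    assert!(in_range(&inclusive, 'Z'));
}
```

If a set is meant to cover the whole uppercase Latin alphabet, the inclusive form `'A'..='Z'` is the one that expresses that intent in standard Rust.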

More Information

For more information on development, authorship, contributing, etc., please visit the ICU4X home page (https://github.com/unicode-org/icu4x).

Commit count: 3247
