unicode-intervals

Crates.io: unicode-intervals
lib.rs: unicode-intervals
version: 0.2.0
source: src
created_at: 2023-04-23 17:41:54.17069
updated_at: 2023-04-24 22:32:46.474807
description: Search for Unicode code point intervals by including/excluding categories, ranges, and custom character sets.
homepage: https://github.com/Stranger6667/unicode-intervals
repository: https://github.com/Stranger6667/unicode-intervals
max_upload_size
id: 846706
size: 679,045
Dmitry Dygalo (Stranger6667)

documentation: https://github.com/Stranger6667/unicode-intervals

README

unicode-intervals


This library provides a way to search for Unicode code point intervals by categories, ranges, and custom character sets.

The main purpose of unicode-intervals is to simplify generating strings that match specific criteria.

[dependencies]
unicode-intervals = "0.2"

Examples

The example below produces code point intervals for uppercase and lowercase letters with code points up to 128, and additionally includes the ☃ character.

use unicode_intervals::UnicodeCategory;

let intervals = unicode_intervals::query()
    .include_categories(
        UnicodeCategory::UPPERCASE_LETTER | 
        UnicodeCategory::LOWERCASE_LETTER
    )
    .max_codepoint(128)
    .include_characters("☃")
    .intervals()
    .expect("Invalid query input");
assert_eq!(intervals, &[(65, 90), (97, 122), (9731, 9731)]);
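
Since the stated goal is string generation, here is a minimal std-only sketch of how such intervals could be turned into characters and then into a string. The `alphabet` and `sample_string` helpers are hypothetical names introduced for illustration and are not part of unicode-intervals; the fixed `stride` is a deterministic stand-in for the random index a real generator would draw.

```rust
/// Hypothetical helper (not part of unicode-intervals): expands inclusive
/// codepoint intervals into the characters they cover. Values that are not
/// valid `char`s (such as surrogates) are skipped.
fn alphabet(intervals: &[(u32, u32)]) -> Vec<char> {
    intervals
        .iter()
        .flat_map(|&(start, end)| start..=end)
        .filter_map(char::from_u32)
        .collect()
}

/// Hypothetical helper: builds a string of `len` characters by stepping
/// through the alphabet with a fixed stride, wrapping around at the end.
/// Returns an empty string when the intervals cover no valid characters.
fn sample_string(intervals: &[(u32, u32)], len: usize, stride: usize) -> String {
    let chars = alphabet(intervals);
    if chars.is_empty() {
        return String::new();
    }
    (0..len).map(|i| chars[(i * stride) % chars.len()]).collect()
}
```

Passing the intervals from the example above to `alphabet` would yield the 26 uppercase letters, the 26 lowercase letters, and the snowman character.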

Use IntervalSet for index-like access to the underlying codepoints:

use unicode_intervals::UnicodeCategory;

let interval_set = unicode_intervals::query()
    .include_categories(UnicodeCategory::UPPERCASE_LETTER)
    .max_codepoint(128)
    .interval_set()
    .expect("Invalid query input");
// Get the codepoint at index 10 in this interval set
assert_eq!(interval_set.codepoint_at(10), Some('K' as u32));
assert_eq!(interval_set.index_of('K'), Some(10));
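
To illustrate what index-like access over intervals means, the following std-only sketch treats a sorted list of inclusive intervals as one flat sequence of codepoints. It is an assumption about the general technique, not the crate's actual implementation; the free functions below only borrow the method names for readability.

```rust
/// Hypothetical stand-in (not the crate's code): returns the codepoint at
/// position `index` when the inclusive intervals are laid end to end.
fn codepoint_at(intervals: &[(u32, u32)], index: u32) -> Option<u32> {
    let mut remaining = index;
    for &(start, end) in intervals {
        let size = end - start + 1;
        if remaining < size {
            return Some(start + remaining);
        }
        // Skip this interval and keep counting into the next one.
        remaining -= size;
    }
    None
}

/// Hypothetical inverse: returns the position of `codepoint` in the flat
/// sequence, or `None` if no interval contains it.
fn index_of(intervals: &[(u32, u32)], codepoint: u32) -> Option<u32> {
    let mut offset = 0;
    for &(start, end) in intervals {
        if (start..=end).contains(&codepoint) {
            return Some(offset + (codepoint - start));
        }
        offset += end - start + 1;
    }
    None
}
```

A linear scan keeps the sketch short; a real implementation could precompute cumulative sizes and use binary search instead.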

Query specific Unicode version:

use unicode_intervals::UnicodeVersion;

let intervals = UnicodeVersion::V11_0_0.query()
    .max_codepoint(128)
    .include_characters("☃")
    .intervals()
    .expect("Invalid query input");
assert_eq!(intervals, &[(0, 128), (9731, 9731)]);

Restrict the output to code points within a certain range:

let intervals = unicode_intervals::query()
    .min_codepoint(65)
    .max_codepoint(128)
    .intervals()
    .expect("Invalid query input");
assert_eq!(intervals, &[(65, 128)]);
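
Conceptually, a range restriction clips every interval to the requested bounds and drops intervals that fall entirely outside them. The sketch below shows that idea with std only; `clip` is a hypothetical helper and says nothing about how the crate applies `min_codepoint` and `max_codepoint` internally.

```rust
/// Hypothetical helper (not part of unicode-intervals): restricts inclusive
/// intervals to the inclusive range `min..=max`.
fn clip(intervals: &[(u32, u32)], min: u32, max: u32) -> Vec<(u32, u32)> {
    intervals
        .iter()
        // Drop intervals that do not overlap the requested range at all.
        .filter(|&&(start, end)| end >= min && start <= max)
        // Trim the remaining intervals to the requested bounds.
        .map(|&(start, end)| (start.max(min), end.min(max)))
        .collect()
}
```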

Include or exclude specific characters:

use unicode_intervals::UnicodeCategory;

let intervals = unicode_intervals::query()
    .include_categories(UnicodeCategory::PARAGRAPH_SEPARATOR)
    .include_characters("-123")
    .intervals()
    .expect("Invalid query input");
assert_eq!(intervals, &[(45, 45), (49, 51), (8233, 8233)]);
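
The result above shows the string "-123" contributing the intervals (45, 45) and (49, 51): adjacent codepoints collapse into a single interval. A minimal std-only sketch of that collapsing step follows; `chars_to_intervals` is a hypothetical helper for illustration, not a function provided by the crate.

```rust
/// Hypothetical helper (not part of unicode-intervals): converts the
/// characters of `s` into sorted, inclusive codepoint intervals, merging
/// runs of consecutive codepoints into one interval.
fn chars_to_intervals(s: &str) -> Vec<(u32, u32)> {
    let mut codepoints: Vec<u32> = s.chars().map(|c| c as u32).collect();
    codepoints.sort_unstable();
    codepoints.dedup();

    let mut intervals: Vec<(u32, u32)> = Vec::new();
    for cp in codepoints {
        if let Some(last) = intervals.last_mut() {
            // Extend the current interval when `cp` directly follows it.
            if last.1 + 1 == cp {
                last.1 = cp;
                continue;
            }
        }
        intervals.push((cp, cp));
    }
    intervals
}
```

Sorting and deduplicating first means the input characters may appear in any order and may repeat.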

Unicode version support

unicode-intervals supports Unicode 9.0.0 - 15.0.0.

License

Licensed under either of Apache License, Version 2.0 or MIT license at your option.
Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in this crate by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.