This library provides a way to search for Unicode code point intervals by categories, ranges, and custom character sets.
The main purpose of `unicode-intervals` is to simplify generating strings that match specific criteria.
Add the dependency to your `Cargo.toml`:

```toml
[dependencies]
unicode-intervals = "0.1"
```
The example below produces the code point intervals for uppercase and lowercase letters below code point 128, plus the ☃ character.
```rust
use unicode_intervals::UnicodeCategory;

let intervals = unicode_intervals::query()
    .include_categories(UnicodeCategory::UPPERCASE_LETTER | UnicodeCategory::LOWERCASE_LETTER)
    .max_codepoint(128)
    .include_characters("☃")
    .intervals()
    .expect("Invalid query input");
assert_eq!(intervals, &[(65, 90), (97, 122), (9731, 9731)]);
```
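Since the stated goal is string generation, here is a minimal sketch of how intervals like the ones above might be turned into a string. It uses only the standard library. `generate_string` is a hypothetical helper, not part of `unicode-intervals`, and the small linear congruential generator is only a stand-in for a real random number generator.

```rust
/// Build a string of up to `len` characters drawn from inclusive
/// `(start, end)` code point intervals.
///
/// Hypothetical helper for illustration; not part of `unicode-intervals`.
fn generate_string(intervals: &[(u32, u32)], len: usize, seed: u64) -> String {
    // Total number of code points covered by all intervals.
    let total: u32 = intervals.iter().map(|&(start, end)| end - start + 1).sum();
    let mut out = String::new();
    if total == 0 {
        return out;
    }
    let mut state = seed;
    for _ in 0..len {
        // Advance the LCG and reduce its high bits to an index below `total`.
        state = state
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        let mut remaining = ((state >> 33) as u32) % total;
        // Walk the intervals to find the code point at that index.
        for &(start, end) in intervals {
            let size = end - start + 1;
            if remaining < size {
                // Surrogates are not valid `char`s, so they are skipped,
                // which can make the result shorter than `len`.
                if let Some(c) = char::from_u32(start + remaining) {
                    out.push(c);
                }
                break;
            }
            remaining -= size;
        }
    }
    out
}
```

Calling `generate_string(&[(65, 90), (97, 122), (9731, 9731)], 16, 42)` yields 16 characters, each an ASCII letter or ☃. The same seed always gives the same string.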
Use `IntervalSet` for index-like access to the underlying code points:
```rust
use unicode_intervals::UnicodeCategory;

let interval_set = unicode_intervals::query()
    // Restrict to uppercase letters so that index 0 is 'A' and index 10 is 'K'
    .include_categories(UnicodeCategory::UPPERCASE_LETTER)
    .max_codepoint(128)
    .interval_set()
    .expect("Invalid query input");
// Get the code point at index 10 in this interval set
assert_eq!(interval_set.codepoint_at(10), Some('K' as u32));
assert_eq!(interval_set.index_of('K'), Some(10));
```
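To make "index-like access" concrete, the sketch below shows one way such lookups could work over sorted, non-overlapping inclusive intervals. The free functions `nth_codepoint` and `position_of` are hypothetical names chosen for this illustration. They are not the crate's implementation, which may use a different strategy.

```rust
/// Return the code point at `index`, counting across sorted,
/// non-overlapping inclusive `(start, end)` intervals.
///
/// Hypothetical helper for illustration; not the crate's implementation.
fn nth_codepoint(intervals: &[(u32, u32)], index: u32) -> Option<u32> {
    let mut remaining = index;
    for &(start, end) in intervals {
        let size = end - start + 1;
        if remaining < size {
            return Some(start + remaining);
        }
        // Skip this interval and keep counting in the next one.
        remaining -= size;
    }
    None
}

/// Return the position of `codepoint` within the intervals, if present.
///
/// Hypothetical helper for illustration; not the crate's implementation.
fn position_of(intervals: &[(u32, u32)], codepoint: u32) -> Option<u32> {
    let mut offset = 0;
    for &(start, end) in intervals {
        if (start..=end).contains(&codepoint) {
            return Some(offset + (codepoint - start));
        }
        // Account for every code point in the intervals already passed.
        offset += end - start + 1;
    }
    None
}
```

With the intervals `[(65, 90), (97, 122), (9731, 9731)]`, `nth_codepoint(&intervals, 10)` returns `Some(75)`, the code point of `K`, and `position_of(&intervals, 75)` returns `Some(10)`.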
Query a specific Unicode version:
```rust
use unicode_intervals::UnicodeVersion;

let intervals = UnicodeVersion::V11_0_0.query()
    .max_codepoint(128)
    .include_characters("☃")
    .intervals()
    .expect("Invalid query input");
assert_eq!(intervals, &[(0, 128), (9731, 9731)]);
```
Restrict the output to code points within a certain range:
```rust
let intervals = unicode_intervals::query()
    .min_codepoint(65)
    .max_codepoint(128)
    .intervals()
    .expect("Invalid query input");
assert_eq!(intervals, &[(65, 128)]);
```
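Conceptually, a range restriction clips each interval to the requested bounds and drops any interval that falls entirely outside them. Here is a minimal standard-library sketch of that idea. `clamp_intervals` is a hypothetical helper, and the crate may apply the bounds differently internally.

```rust
/// Clip inclusive `(start, end)` intervals to the inclusive range `min..=max`,
/// dropping intervals that fall entirely outside it.
///
/// Hypothetical helper for illustration; not part of `unicode-intervals`.
fn clamp_intervals(intervals: &[(u32, u32)], min: u32, max: u32) -> Vec<(u32, u32)> {
    intervals
        .iter()
        .filter_map(|&(start, end)| {
            let lo = start.max(min);
            let hi = end.min(max);
            // An interval survives only if some part of it lies within the bounds.
            (lo <= hi).then_some((lo, hi))
        })
        .collect()
}
```

For example, `clamp_intervals(&[(0, 128)], 65, 128)` returns `[(65, 128)]`.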
Include or exclude specific characters:
```rust
use unicode_intervals::UnicodeCategory;

let intervals = unicode_intervals::query()
    .include_categories(UnicodeCategory::PARAGRAPH_SEPARATOR)
    .include_characters("☃-123")
    .intervals()
    .expect("Invalid query input");
assert_eq!(intervals, &[(45, 45), (49, 51), (8233, 8233), (9731, 9731)]);
```
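The output above shows the characters `1`, `2`, and `3` collapsed into the single interval `(49, 51)`. The sketch below illustrates one way a character string could be folded into sorted, merged intervals. `intervals_from_chars` is a hypothetical helper written for this example and is not the crate's own code.

```rust
/// Collapse the characters of `chars` into sorted, merged inclusive intervals.
///
/// Hypothetical helper for illustration; not part of `unicode-intervals`.
fn intervals_from_chars(chars: &str) -> Vec<(u32, u32)> {
    let mut codepoints: Vec<u32> = chars.chars().map(|c| c as u32).collect();
    codepoints.sort_unstable();
    codepoints.dedup();

    let mut intervals: Vec<(u32, u32)> = Vec::new();
    for cp in codepoints {
        if let Some(last) = intervals.last_mut() {
            // Extend the previous interval when `cp` directly follows it.
            if last.1 + 1 == cp {
                last.1 = cp;
                continue;
            }
        }
        // Otherwise start a new single-point interval.
        intervals.push((cp, cp));
    }
    intervals
}
```

Applied to `"☃-123"`, this returns `[(45, 45), (49, 51), (9731, 9731)]`, matching the character-derived intervals in the query result above.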
`unicode-intervals` supports Unicode 9.0.0 through 15.0.0.
Licensed under either of Apache License, Version 2.0 or MIT license at your option.
Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in this crate by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.