icu_uniset

icu_uniset is a utility crate of the [ICU4X] project.

This API provides the functionality needed for highly efficient querying of sets of Unicode characters.

It is an implementation of the existing ICU4C UnicodeSet API.

Architecture

ICU4X [UnicodeSet] is split into independent levels: [UnicodeSet] provides the membership/query API, and [UnicodeSetBuilder] provides the builder API. A Properties API is planned as future work.

Examples:

Creating a UnicodeSet

A UnicodeSet can be created from a serialized UnicodeSet (represented by an inversion list), from the [UnicodeSetBuilder], or from the forthcoming Properties API.

```rust
use icu_uniset::{UnicodeSet, UnicodeSetBuilder};

let mut builder = UnicodeSetBuilder::new();
// Half-open range: adds 'A' through 'Y' ('Z' is excluded).
builder.add_range(&('A'..'Z'));
let set: UnicodeSet = builder.build();

assert!(set.contains('A'));
```
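To illustrate the inversion-list representation mentioned above, here is a minimal, std-only sketch of a membership test. It assumes the conventional layout: a sorted list of code points in which each even index starts a range (inclusive) and each odd index ends it (exclusive). The function `inv_list_contains` is a hypothetical helper written for this example; it is not part of `icu_uniset`, and the crate's internal layout may differ.

```rust
/// Hypothetical helper (not part of icu_uniset): tests membership in an
/// inversion list, assumed to be sorted, with even indices holding
/// inclusive range starts and odd indices holding exclusive range ends.
fn inv_list_contains(inv_list: &[u32], c: char) -> bool {
    let cp = c as u32;
    // Binary search: count how many boundaries are <= cp.
    let idx = inv_list.partition_point(|&boundary| boundary <= cp);
    // An odd count means cp lies at or after a range start and before its end.
    idx % 2 == 1
}
```

Under this assumption, the half-open range `'A'..'Z'` corresponds to the two-element list `[0x41, 0x5A]`, and a lookup costs one binary search regardless of how many characters the set covers.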

Querying a UnicodeSet

Currently, you can check whether a character or a range of characters is contained in the [UnicodeSet], or iterate over its characters.

```rust
use icu_uniset::{UnicodeSet, UnicodeSetBuilder};

let mut builder = UnicodeSetBuilder::new();
builder.add_range(&('A'..'Z'));
let set: UnicodeSet = builder.build();

// Single-character membership.
assert!(set.contains('A'));
// Every character in the range must be in the set.
assert!(set.contains_range(&('A'..='C')));
// Iteration yields the characters in order.
assert_eq!(set.iter_chars().next(), Some('A'));
```
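As a rough picture of how range queries and iteration can work over an inversion list, the following std-only sketch may help. Both functions are hypothetical helpers written for this example, not the `icu_uniset` implementation, and they assume a sorted list whose even indices hold inclusive range starts and whose odd indices hold exclusive range ends.

```rust
use std::ops::RangeInclusive;

/// Hypothetical helper: true if every code point in `range` falls inside
/// one range of the inversion list. Empty ranges count as not contained
/// in this sketch.
fn inv_list_contains_range(inv_list: &[u32], range: RangeInclusive<char>) -> bool {
    let (start, end) = (*range.start() as u32, *range.end() as u32);
    if start > end {
        return false;
    }
    // Count the boundaries <= start; an odd count means start is inside a range.
    let idx = inv_list.partition_point(|&boundary| boundary <= start);
    // The boundary at `idx` is that range's exclusive end; `end` must precede it.
    idx % 2 == 1 && inv_list.get(idx).map_or(false, |&limit| end < limit)
}

/// Hypothetical helper: iterates over the characters of the set in order.
fn inv_list_chars(inv_list: &[u32]) -> impl Iterator<Item = char> + '_ {
    inv_list
        .chunks_exact(2) // each pair is [start, end)
        .flat_map(|pair| pair[0]..pair[1])
        .filter_map(char::from_u32) // skip values that are not valid chars
}
```

The range check needs only one binary search plus one comparison, because a contiguous run of characters can only be contained in a single range of the list.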

More Information

For more information on development, authorship, contributing, etc., please visit the ICU4X home page.