🦞 Crawdad: ChaRActer-Wise Double-Array Dictionary

Crates.io Documentation Build Status

Overview

Crawdad is a library of natural language dictionaries using character-wise double-array tries. The implementation is optimized for strings of multibyte-characters, and you can enjoy fast text processing on strings such as Japanese or Chinese.

For example, on a large Japanese dictionary of IPADIC+Neologd, Crawdad has a better time-space tradeoff than other Rust libraries.

The detailed experimental settings and other results are available on Wiki.

What can do

Data structures

Crawdad contains the two trie implementations:

License

Licensed under either of

at your option.

For softwares under bench/data, follow the license terms of each software.

Contribution

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.