Unicode-JP (Rust)

Build Status crates.io MIT licensed

Converters of troublesome characters included in Japanese texts.

- Half-width-kana[半角カナ;HANKAKU KANA] -> normal Katakana
- Wide-alphanumeric[全角英数;ZENKAKU EISU] <-> normal ASCII

If you need canonicalization of texts that include Japanese, consider using the unicode_normalization crate first; it provides NFD, NFKD, NFC and NFKC. This crate, however, is for niche needs, such as delicate control of Japanese characters on a restrictive character terminal.

Japanese has two syllabary systems, Hiragana and Katakana, and Half-width-kana is another notation for them. These systems have two combinable diacritical marks, the Voiced-sound-mark and the Semi-voiced-sound-mark, and Unicode has three independent code points for each mark. In addition, Japanese texts often use special-style Latin letters and Arabic numerals called Wide-alphanumeric. This small utility converts between these code points.
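As a rough illustration of the three code points per mark, the sketch below lists them and folds the two stand-alone forms into the combining form. It uses only the standard library and is not this crate's implementation; the names `VOICED`, `SEMI_VOICED`, `to_combining` and `marks_to_combining` are hypothetical.

```rust
// Each tuple holds (combining, full-width stand-alone, half-width stand-alone).
// These names are illustrative only and are not part of the kana crate.
const VOICED: (char, char, char) = ('\u{3099}', '\u{309B}', '\u{FF9E}');
const SEMI_VOICED: (char, char, char) = ('\u{309A}', '\u{309C}', '\u{FF9F}');

/// Maps a stand-alone sound mark to its combining form.
/// Any other character is returned unchanged.
fn to_combining(c: char) -> char {
    if c == VOICED.1 || c == VOICED.2 {
        VOICED.0
    } else if c == SEMI_VOICED.1 || c == SEMI_VOICED.2 {
        SEMI_VOICED.0
    } else {
        c
    }
}

/// Applies `to_combining` to every character of a string.
fn marks_to_combining(s: &str) -> String {
    s.chars().map(to_combining).collect()
}
```

The base character is left as it is; only the mark changes code point, so the result still consists of two code points per marked syllable.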

API Reference

Example

Cargo.toml
```toml
[dependencies]
unicode-jp = "0.2.0"
```

src/main.rs
```rust
extern crate kana;
use kana::Kana;

fn main() {
    let k = Kana::init();

    // Half-width kana input
    let s1 = "ﾏﾂｵ ﾊﾞｼｮｳ ｱﾟ";
    assert_eq!("マツオ バショウ ア ゚", k.half2kana(s1));
    assert_eq!("マツオ バショウ ア゚", k.half2full(s1));

    // Hiragana followed by stand-alone sound marks
    let s2 = "ひ゜ひ゛んは゛";
    assert_eq!("ぴびんば", k.combine(s2));
    assert_eq!("ひ ゚ひ ゙んは ゙", kana::vsmark2combi(s2));

    // Wide-alphanumeric input
    let s3 = "＃＆Ｒｕｓｔ－１．６！";
    assert_eq!("#&Rust-1.6!", kana::wide2ascii(s3));
}
```
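To show what a Wide-alphanumeric <-> ASCII conversion involves, here is a minimal standard-library sketch. It relies on the fact that the wide forms U+FF01..U+FF5E sit at a fixed offset of 0xFEE0 from ASCII `!`..`~`. The function names `wide_to_ascii` and `ascii_to_wide` are hypothetical, and this is not necessarily how the crate implements `wide2ascii`; for instance, the sketch leaves spaces untouched.

```rust
// Distance between a Wide-alphanumeric character and its ASCII counterpart.
const WIDE_OFFSET: u32 = 0xFEE0;

/// Converts wide forms (U+FF01..=U+FF5E) to ASCII; other characters pass through.
fn wide_to_ascii(s: &str) -> String {
    s.chars()
        .map(|c| match c {
            '\u{FF01}'..='\u{FF5E}' => char::from_u32(c as u32 - WIDE_OFFSET).unwrap_or(c),
            _ => c,
        })
        .collect()
}

/// Converts printable ASCII ('!'..='~') to wide forms; other characters pass through.
fn ascii_to_wide(s: &str) -> String {
    s.chars()
        .map(|c| match c {
            '!'..='~' => char::from_u32(c as u32 + WIDE_OFFSET).unwrap_or(c),
            _ => c,
        })
        .collect()
}
```

Because both directions use the same offset, converting a string one way and then back returns the original text as long as it contains no characters outside the two ranges that already belong to the target range.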

Functions of kana crate:

Methods of kana::Kana struct:

TODO or NOT TODO