pinyin-parser-rs

Parses a string of pinyin syllables. Covers marginal cases such as ŋ and ê.

Since pinyin strings in the wild do not necessarily conform to the standard, this parser offers two modes: strict and loose.

Strict mode:

* forbids the use of breve instead of hacek to represent the third tone
* forbids the use of IPA ɡ (U+0261) instead of g, and other such lookalike characters
* allows apostrophes only before an a, an e or an o
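To make the three strict-mode rules concrete, here is a minimal standalone sketch that checks a string against them. `check_strict_rules` is a hypothetical helper written for illustration; it is not part of `pinyin_parser` and does not reflect how the crate implements strict mode. The set of vowels accepted after an apostrophe (a, e, o with the four tone marks) and the decision to accept the typographic apostrophe U+2019 are assumptions, and only U+0261 is checked among the lookalike characters.

```rust
/// Hypothetical helper, NOT part of pinyin_parser: returns a description
/// of the first strict-mode rule that `input` breaks, or Ok(()) if none.
fn check_strict_rules(input: &str) -> Result<(), &'static str> {
    // Assumption: an apostrophe may be followed by a, e or o, toned or not.
    const AFTER_APOSTROPHE: &str = "aāáǎàeēéěèoōóǒò";

    // Lowercase first so that the checks below only need lowercase forms.
    let chars: Vec<char> = input.to_lowercase().chars().collect();

    for (i, &c) in chars.iter().enumerate() {
        // Precomposed breve vowels (ă ĕ ĭ ŏ ŭ) and the combining breve.
        let is_breve = matches!(
            c,
            '\u{0103}' | '\u{0115}' | '\u{012D}' | '\u{014F}' | '\u{016D}' | '\u{0306}'
        );
        if is_breve {
            return Err("breve used instead of hacek");
        }
        // IPA ɡ (U+0261) looks like an ordinary g but is a different character.
        if c == '\u{0261}' {
            return Err("IPA ɡ (U+0261) used instead of g");
        }
        // ASCII apostrophe, or (by assumption) the typographic one.
        if c == '\'' || c == '\u{2019}' {
            let next_ok = chars
                .get(i + 1)
                .map_or(false, |n| AFTER_APOSTROPHE.contains(*n));
            if !next_ok {
                return Err("apostrophe not followed by a, e or o");
            }
        }
    }
    Ok(())
}
```

With this sketch, `check_strict_rules("Xī'ān")` is `Ok(())`, while `check_strict_rules("Yīng'guó")` reports the apostrophe rule, because the apostrophe is followed by a g.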

Examples

```rust
use pinyin_parser::PinyinParser;

assert_eq!(
    PinyinParser::strict("jīntiān")
        .into_iter()
        .collect::<Vec<_>>(),
    vec!["jīn", "tiān"]
);
```

This parser supports the use of ẑ, ĉ, ŝ and ŋ, though I have never seen anyone use them.

```rust
use pinyin_parser::PinyinParser;

assert_eq!(
    PinyinParser::strict("Ẑāŋ").into_iter().collect::<Vec<_>>(),
    vec!["zhāng"]
);
```
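The shorthand letters ẑ, ĉ, ŝ and ŋ stand for the digraphs zh, ch, sh and ng. As a rough illustration of that mapping, the sketch below expands them in a standalone function. `expand_shorthand` is a hypothetical name, not part of `pinyin_parser`, and this is not how the crate does it internally. Lowercasing the input is an assumption, made only to match the lowercase output shown in the example above. Decomposed forms (a plain letter followed by a combining circumflex) are not handled.

```rust
/// Hypothetical helper, NOT part of pinyin_parser: lowercases `input`
/// and replaces each shorthand letter with the digraph it abbreviates.
fn expand_shorthand(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for c in input.to_lowercase().chars() {
        match c {
            '\u{1E91}' => out.push_str("zh"), // ẑ
            '\u{0109}' => out.push_str("ch"), // ĉ
            '\u{015D}' => out.push_str("sh"), // ŝ
            '\u{014B}' => out.push_str("ng"), // ŋ
            other => out.push(other),         // everything else is kept as-is
        }
    }
    out
}
```

Calling `expand_shorthand("Ẑāŋ")` gives `"zhāng"`, the same spelling as the syllable in the parser example.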

```rust
use pinyin_parser::PinyinParser;

assert_eq!(
    // An apostrophe can come only before an `a`, an `e` or an `o` in strict mode,
    // but it is allowed here because the parser is in loose mode
    PinyinParser::loose("Yīng'guó")
        .into_iter()
        .collect::<Vec<_>>(),
    vec!["yīng", "guó"]
);
```