Vaporetto

Vaporetto is a fast and lightweight tokenizer based on pointwise prediction.

Examples

```rust
use std::fs::File;
use std::io::Read;

use vaporetto::{Model, Predictor, Sentence};

// Read the serialized model into memory.
let mut f = File::open("model.bin").unwrap();
let mut model_data = vec![];
f.read_to_end(&mut model_data).unwrap();
let (model, _) = Model::read_slice(&model_data).unwrap();
let predictor = Predictor::new(model, false).unwrap();

// Tokenize a sentence.
let s = Sentence::from_raw("火星猫の生態").unwrap();
let s = predictor.predict(s);

println!("{:?}", s.to_tokenized_vec().unwrap());
// ["火星", "猫", "の", "生態"]
```
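For intuition about what "pointwise prediction" means, the following is a toy sketch, not Vaporetto's implementation or API. It scores each character boundary independently with a linear model over nearby character n-grams and splits wherever the score is positive. `ToyModel`, `toy_model`, the window size, and the hand-picked weights are all hypothetical and exist only for this illustration; a real model learns its weights from annotated data and uses a richer feature set.

```rust
use std::collections::HashMap;

/// Hypothetical toy model (not part of the vaporetto crate).
/// A weight is keyed by (n-gram start offset relative to the boundary, n-gram).
struct ToyModel {
    weights: HashMap<(isize, String), i32>,
    bias: i32,
    window: usize,    // characters inspected on each side of a boundary
    max_ngram: usize, // longest n-gram used as a feature
}

impl ToyModel {
    /// Score the boundary between chars[i - 1] and chars[i].
    /// Each boundary is scored on its own, without looking at other decisions.
    fn score(&self, chars: &[char], i: usize) -> i32 {
        let start = i.saturating_sub(self.window);
        let end = (i + self.window).min(chars.len());
        let mut score = self.bias;
        for s in start..end {
            for n in 1..=self.max_ngram {
                if s + n > end {
                    break;
                }
                let ngram: String = chars[s..s + n].iter().collect();
                let offset = s as isize - i as isize;
                if let Some(w) = self.weights.get(&(offset, ngram)) {
                    score += *w;
                }
            }
        }
        score
    }

    /// Split `text` at every boundary whose score is positive.
    fn tokenize(&self, text: &str) -> Vec<String> {
        let chars: Vec<char> = text.chars().collect();
        let mut tokens = Vec::new();
        let mut current = String::new();
        for (i, &c) in chars.iter().enumerate() {
            if i > 0 && self.score(&chars, i) > 0 {
                tokens.push(std::mem::take(&mut current));
            }
            current.push(c);
        }
        if !current.is_empty() {
            tokens.push(current);
        }
        tokens
    }
}

/// Hand-picked weights, chosen only so the example sentence splits as expected.
fn toy_model() -> ToyModel {
    let mut weights = HashMap::new();
    weights.insert((-2, "火星".to_string()), 2); // "火星" ends at the boundary
    weights.insert((0, "の".to_string()), 2); // "の" starts right after the boundary
    weights.insert((-1, "の".to_string()), 2); // "の" ends right before the boundary
    ToyModel { weights, bias: -1, window: 2, max_ngram: 2 }
}

fn main() {
    let model = toy_model();
    println!("{:?}", model.tokenize("火星猫の生態"));
    // ["火星", "猫", "の", "生態"]
}
```

Because every boundary is decided independently, there is no lattice search or sequence decoding step in this scheme; the work per boundary is a bounded number of feature lookups.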

Feature flags

The following features are disabled by default:

The following features are enabled by default:

License

Licensed under either of

* Apache License, Version 2.0
* MIT license

at your option.

Contribution

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.