Vaporetto

Vaporetto is a fast and lightweight tokenizer based on pointwise prediction.
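To illustrate what pointwise prediction means, the following is a toy sketch and not Vaporetto's actual implementation. It assumes a small hand-written table of character n-gram weights (`toy_weights`, `score_gap`, and `tokenize` are hypothetical names introduced for this example) and decides each character boundary independently by summing the weights of the n-grams found in a fixed window around it.

```rust
use std::collections::HashMap;

/// Hypothetical feature table: (character n-gram, offset of its first
/// character relative to the candidate boundary) -> weight.
type Weights = HashMap<(String, isize), i32>;

/// Scores the gap between chars[gap] and chars[gap + 1] using only the
/// n-grams that lie within `window` characters on either side of it.
fn score_gap(chars: &[char], gap: usize, weights: &Weights, window: usize, bias: i32) -> i32 {
    let boundary = gap + 1; // index of the first character after the gap
    let lo = boundary.saturating_sub(window);
    let hi = (boundary + window).min(chars.len());
    let mut score = bias;
    for start in lo..hi {
        for end in (start + 1)..=hi {
            let ngram: String = chars[start..end].iter().collect();
            let offset = start as isize - boundary as isize;
            if let Some(w) = weights.get(&(ngram, offset)) {
                score += w;
            }
        }
    }
    score
}

/// Splits `text` wherever a gap scores above zero. Each gap is decided
/// independently of the others, which is what "pointwise" refers to here.
fn tokenize(text: &str, weights: &Weights, window: usize, bias: i32) -> Vec<String> {
    let chars: Vec<char> = text.chars().collect();
    let mut tokens = Vec::new();
    let mut current = String::new();
    for (i, &c) in chars.iter().enumerate() {
        current.push(c);
        let is_last = i + 1 == chars.len();
        if is_last || score_gap(&chars, i, weights, window, bias) > 0 {
            tokens.push(std::mem::take(&mut current));
        }
    }
    tokens
}

/// Hand-written weights for illustration only; a real model learns them.
fn toy_weights() -> Weights {
    let mut w = Weights::new();
    w.insert(("が".to_string(), -1), 2); // "が" ends right before the boundary
    w.insert(("が".to_string(), 0), 2); // "が" starts right after the boundary
    w.insert(("好き".to_string(), -1), -3); // "好き" straddles the boundary
    w
}

fn main() {
    let weights = toy_weights();
    let tokens = tokenize("猫が好き", &weights, 2, -1);
    println!("{}", tokens.join(" ")); // prints "猫 が 好き"
}
```

Because every gap is scored on its own from local context, the work per character is bounded by the window size, which is one reason this family of tokenizers can be made fast.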

Examples

```rust
use std::fs::File;
use std::io::{prelude::*, stdin, BufReader};

use vaporetto::{Model, Predictor, Sentence};

// Load a trained model from disk and build a predictor from it.
let mut f = BufReader::new(File::open("model.bin").unwrap());
let model = Model::read(&mut f).unwrap();
let mut predictor = Predictor::new(model).dict_overwrap_size(3);

// Tokenize each line of standard input and print the result.
for line in stdin().lock().lines() {
    let s = Sentence::from_raw(line.unwrap()).unwrap();
    let s = predictor.predict(s);
    let toks = s.to_tokenized_string().unwrap();
    println!("{}", toks);
}
```

License

Licensed under either of the Apache License, Version 2.0, or the MIT license, at your option.

Contribution

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.