tiktoken-rs
Ready-made tokenizer library for working with GPT and tiktoken
Install the crate with cargo:
cargo add tiktoken-rs
Then, in your Rust code, call the API:
use tiktoken_rs::tiktoken::p50k_base;
fn main() {
    // Load the p50k_base encoding, then count the tokens in a string.
    let bpe = p50k_base().unwrap();
    let tokens = bpe.encode_with_special_tokens("This is an example");
    println!("Token count: {}", tokens.len());
}
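A token count is often used to check whether a prompt fits within a model's limit. The sketch below shows one way that check might look. `fits_budget` and `whitespace_count` are hypothetical helpers, not part of tiktoken-rs, and the whitespace counter is only a stand-in so the example runs without the crate. A real BPE tokenizer will report different counts.

```rust
/// Returns true when `text` fits within `max_tokens`, measured by `count_tokens`.
/// Hypothetical helper for illustration; not part of tiktoken-rs.
fn fits_budget<F>(text: &str, max_tokens: usize, count_tokens: F) -> bool
where
    F: Fn(&str) -> usize,
{
    count_tokens(text) <= max_tokens
}

/// Stand-in counter: one "token" per whitespace-separated word.
/// This is an assumption made so the sketch is self-contained.
fn whitespace_count(text: &str) -> usize {
    text.split_whitespace().count()
}

fn main() {
    let prompt = "This is an example";
    // With tiktoken-rs, the counter could instead be a closure such as
    // |s| bpe.encode_with_special_tokens(s).len()
    println!("fits in 4: {}", fits_budget(prompt, 4, whitespace_count));
    println!("fits in 3: {}", fits_budget(prompt, 3, whitespace_count));
}
```

Passing the counter as a closure keeps the budget check independent of any particular encoding, so the same helper could be reused with a different tokenizer.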
See the examples in the repo for more use cases.
If you encounter any bugs or have any suggestions for improvements, please open an issue on the repository.
This project is licensed under the MIT License.