Rust bindings of Mozilla's DeepSpeech library.
The library is open source and performs Speech-To-Text completely offline. They provide pretrained models for English.
Preparation:
git clone https://github.com/mozilla/DeepSpeech
git checkout v0.1.0
util/taskcluster.py
to download precompiled components. See the native_client README for further options for that script and how to compile DeepSpeech yourself.https://github.com/mozilla/DeepSpeech/releases/download/v0.1.0/deepspeech-0.1.0-models.tar.gz
and extract the zip file to some location.LD_LIBRARY_PATH
and LIBRARY_PATH
environment variables.You can now invoke the example via:
cargo run --release --example client <path-to-model-dir> <path-to-audio-file>
It will print out the recognized sequence on stdout. The format of the audio files is important: only mono files are supported for now. The 0.1.0 release announcement has a detailed list of requirements. All codecs that the awesome audrey library supports are supported.
Note: Right now, there are no Linux x64 binaries for older CPU platforms without AVX2 support like Ivy Bridge. See DeepSpeech's 0.1.0 release announcement for a list of supported platforms.
As of writing this, there has been only been one release of the DeepSpeech library yet, version 0.1.0
. We only claim compatibility with that release.
We will try to provide compatibility with the most recent usable release possible.
Licensed under Apache 2 or MIT (at your option). For details, see the LICENSE file.
All examples inside the examples/
folder are licensed under the
CC-0 license.
Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed / CC-0 licensed as above, without any additional terms or conditions.