This is a naive implementation of Word2Vec, written in Rust.
The goal is to learn the basic principles and formulas behind Word2Vec. BTW, it's slow ;)
This crate is available both as a library and as a binary.
```
A naive Word2Vec implementation

USAGE:
    sloword2vec [SUBCOMMAND]

FLAGS:
    -h, --help       Prints help information
    -V, --version    Prints version information

SUBCOMMANDS:
    add-subtract    Given a number of words to add and to subtract, returns a list of words in that area.
    help            Prints this message or the help of the given subcommand(s)
    similar         Given a path to a saved Word2Vec model and a target word, finds words in the model's vocab that are similar.
    train           Given a corpus and a path to save a trained model, trains Word2Vec encodings for the vocabulary in the corpus and saves it.
```
```
Given a corpus and a path to save a trained model, trains Word2Vec encodings for the vocabulary in the corpus and saves it.

USAGE:
    sloword2vec train [OPTIONS] --corpus

FLAGS:
    -h, --help       Prints help information
    -V, --version    Prints version information

OPTIONS:
    -A, --acceptable-error
```
```
Given a path to a saved Word2Vec model and a target word, finds words in the model's vocab that are similar.

USAGE:
    sloword2vec similar --limit

FLAGS:
    -h, --help       Prints help information
    -V, --version    Prints version information

OPTIONS:
    -L, --limit
```
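To make the idea behind `similar` concrete, here is a minimal sketch of ranking vocabulary words by cosine similarity to a target word. It is not taken from this crate: the `HashMap`-based model and the names `cosine_similarity` and `most_similar` are hypothetical, and cosine similarity is an assumed (though common) choice of metric.

```rust
use std::collections::HashMap;

/// Cosine similarity between two vectors of equal length.
/// Returns 0.0 when either vector has zero norm.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

/// Returns up to `limit` words closest to `target`, best match first.
/// Returns `None` when `target` is not in the vocabulary.
/// (Hypothetical model layout: one embedding vector per word.)
fn most_similar(
    embeddings: &HashMap<String, Vec<f32>>,
    target: &str,
    limit: usize,
) -> Option<Vec<(String, f32)>> {
    let target_vec = embeddings.get(target)?;
    let mut scored: Vec<(String, f32)> = embeddings
        .iter()
        // The target itself would trivially score 1.0, so skip it.
        .filter(|(word, _)| word.as_str() != target)
        .map(|(word, v)| (word.clone(), cosine_similarity(target_vec, v)))
        .collect();
    // Sort by descending similarity.
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.truncate(limit);
    Some(scored)
}
```

The `limit` parameter plays the same role as the `--limit` option above: it caps how many neighbours are returned.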
The classic demo of Word2Vec.
```
Given a number of words to add and to subtract, returns a list of words in that area.

USAGE:
    sloword2vec add-subtract --add

FLAGS:
    -h, --help       Prints help information
    -V, --version    Prints version information

OPTIONS:
    -A, --add
```
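The arithmetic behind a demo like this can be sketched roughly as follows: sum the vectors of the words to add, subtract the vectors of the words to subtract, then look for the vocabulary words nearest to the result. This is an illustrative sketch, not this crate's code. The `HashMap` model layout and the names `combine` and `nearest` are hypothetical, and cosine similarity is an assumed metric.

```rust
use std::collections::HashMap;

/// Cosine similarity; returns 0.0 when either vector has zero norm.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

/// Adds the vectors for `add` and subtracts the vectors for `subtract`.
/// Returns `None` if the model is empty or any word is unknown.
fn combine(
    embeddings: &HashMap<String, Vec<f32>>,
    add: &[&str],
    subtract: &[&str],
) -> Option<Vec<f32>> {
    // Assumes every embedding has the same dimensionality.
    let dim = embeddings.values().next()?.len();
    let mut result = vec![0.0f32; dim];
    for word in add {
        for (r, x) in result.iter_mut().zip(embeddings.get(*word)?) {
            *r += x;
        }
    }
    for word in subtract {
        for (r, x) in result.iter_mut().zip(embeddings.get(*word)?) {
            *r -= x;
        }
    }
    Some(result)
}

/// Returns up to `limit` words nearest to `query`, skipping `exclude`
/// (typically the input words, which would otherwise dominate).
fn nearest(
    embeddings: &HashMap<String, Vec<f32>>,
    query: &[f32],
    exclude: &[&str],
    limit: usize,
) -> Vec<(String, f32)> {
    let mut scored: Vec<(String, f32)> = embeddings
        .iter()
        .filter(|(word, _)| !exclude.contains(&word.as_str()))
        .map(|(word, v)| (word.clone(), cosine_similarity(query, v)))
        .collect();
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.truncate(limit);
    scored
}
```

For the usual example, one would call `combine` with `["king", "woman"]` to add and `["man"]` to subtract, then pass the result to `nearest` with those three words excluded.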
Pretty much the most naive implementation of Word2Vec, the only special thing being the use of matrix/vector maths to speed things up.
The linear algebra library behind this lib is ndarray, with OpenBLAS enabled (Fortran and transparent multithreading FTW!).
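As a rough illustration of what "matrix/vector maths" buys, the sketch below scores every vocabulary word at once with a single matrix-vector product followed by a softmax, instead of handling words one by one. It rests on assumptions: the README does not say which Word2Vec variant or output layer this crate uses, so a full-softmax output layer is assumed here, and `mat_vec`, `softmax`, and `predict` are hypothetical names. Plain `Vec`s are used to keep the example dependency-free; with ndarray the product would typically be a single `dot` call that a BLAS backend can accelerate.

```rust
/// Multiplies a row-major matrix (one row per vocabulary word)
/// by a vector whose length equals the number of columns.
fn mat_vec(matrix: &[Vec<f32>], vector: &[f32]) -> Vec<f32> {
    matrix
        .iter()
        .map(|row| row.iter().zip(vector).map(|(w, x)| w * x).sum::<f32>())
        .collect()
}

/// Numerically stable softmax: subtracts the maximum score
/// before exponentiating to avoid overflow.
fn softmax(scores: &[f32]) -> Vec<f32> {
    let max = scores.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
    let exps: Vec<f32> = scores.iter().map(|s| (s - max).exp()).collect();
    let total: f32 = exps.iter().sum();
    exps.iter().map(|e| e / total).collect()
}

/// Assumed forward pass: one probability per vocabulary word,
/// given the hidden (embedding) vector of the input word.
fn predict(output_weights: &[Vec<f32>], hidden: &[f32]) -> Vec<f32> {
    softmax(&mat_vec(output_weights, hidden))
}
```

The point of the formulation is that the per-word dot products collapse into one library call, which is where a tuned BLAS and its multithreading can help.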