`finalfusion-utils` is a Rust crate offering various functionalities to
process and query embeddings. `finalfusion-utils` supports conversion
between different formats, quantization of embedding matrices, similarity
and analogy queries, as well as evaluation on analogy datasets.

The following precompiled binaries can be found on the releases page:

* `x86_64-unknown-linux-gnu-mkl`: glibc Linux build, statically linked
  against Intel MKL. This is the recommended build for Intel (non-AMD)
  CPUs.
* `x86_64-unknown-linux-musl`: static Linux build using the MUSL C
  library. This binary does not link against a BLAS/LAPACK implementation
  and therefore does not support optimized product quantization.
* `universal-macos`: dynamic macOS build. Supports both the x86_64 and
  ARM64 architectures. Linked against the Accelerate framework for
  BLAS/LAPACK.

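
The choice between the builds listed above can be sketched as a small shell function. This is only an illustration: `pick_build` is a hypothetical helper, not part of `finalfusion-utils`, and falling back to the MUSL build on non-Intel Linux machines is an assumption, since the list only states that the MKL build is recommended for Intel CPUs.

```shell
# Hypothetical helper (not shipped with finalfusion-utils): print the name
# of a precompiled build for a platform.
# Arguments: OS name (as from `uname -s`), machine (as from `uname -m`),
# and the CPU vendor string (e.g. GenuineIntel or AuthenticAMD).
pick_build() {
  os="${1:-}"
  machine="${2:-}"
  vendor="${3:-}"
  case "$os" in
    Darwin)
      # The macOS build is universal (x86_64 and ARM64).
      echo "universal-macos"
      ;;
    Linux)
      if [ "$machine" != "x86_64" ]; then
        echo "no precompiled Linux build for $machine" >&2
        return 1
      fi
      if [ "$vendor" = "GenuineIntel" ]; then
        # Recommended build for Intel (non-AMD) CPUs.
        echo "x86_64-unknown-linux-gnu-mkl"
      else
        # Assumption: use the static MUSL build on other x86_64 CPUs.
        echo "x86_64-unknown-linux-musl"
      fi
      ;;
    *)
      echo "no precompiled build for $os" >&2
      return 1
      ;;
  esac
}
```

On a Linux machine, the function could be called as `pick_build "$(uname -s)" "$(uname -m)" "$(grep -m1 vendor_id /proc/cpuinfo | awk '{print $3}')"`.
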
`finalfusion-utils` can be installed using an up-to-date Rust
toolchain, which can be installed with rustup.
With a valid Rust toolchain, the crate is most easily installed through
`cargo`:

~~~shell
$ cargo install finalfusion-utils
~~~

Typically, you will want to enable support for a BLAS/LAPACK library to speed up matrix multiplication and enable optimized product quantization support. In order to do so, run

~~~shell
$ cargo install finalfusion-utils --features implementation
~~~

where `implementation` is one of the following:

* `accelerate`: the macOS Accelerate framework.
* `intel-mkl`: Intel MKL (downloaded and statically linked).
* `intel-mkl-amd`: Intel MKL; preinstalled MKL libraries are expected, and
  CPU detection is overridden for AMD CPUs.
* `netlib`: any compatible system BLAS/LAPACK implementation(s).
* `openblas`: system-installed OpenBLAS. This option is discouraged
  unless the system OpenBLAS library is a single-threaded build with
  locking. Otherwise, OpenBLAS' threading interacts badly with application
  threads.

`finalfusion-utils` can also be built from source. After cloning this
repository, run the following command in the repository directory; the
executable is then written to `target/release/finalfusion`:

~~~shell
$ cargo build --release
~~~

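
When installing through `cargo` with BLAS/LAPACK support, the feature name has to be one of the options listed above. The following sketch checks a name against that list and prints the matching `cargo install` command without running it. `ff_install_cmd` is a hypothetical helper introduced here for illustration only.

```shell
# Hypothetical helper: print the cargo command for a given BLAS/LAPACK
# feature, or the plain install command when no feature is given.
ff_install_cmd() {
  feature="${1:-}"
  case "$feature" in
    "")
      echo "cargo install finalfusion-utils"
      ;;
    accelerate|intel-mkl|intel-mkl-amd|netlib|openblas)
      echo "cargo install finalfusion-utils --features $feature"
      ;;
    *)
      # Reject names that are not in the list of supported features.
      echo "unknown BLAS/LAPACK feature: $feature" >&2
      return 1
      ;;
  esac
}
```

For example, `ff_install_cmd netlib` prints the install command with the `netlib` feature enabled, which can then be reviewed and run by hand.
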
`finalfusion-utils` is built as a single binary; the
different functionality is invoked through subcommands:

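~~~shell
# Convert embeddings from fastText format to finalfusion format
$ finalfusion convert -f fasttext -t finalfusion \
  embeddings.bin embeddings.fifu

# Convert embeddings from word2vec format to finalfusion format
$ finalfusion convert -f word2vec -t finalfusion \
  embeddings.w2v embeddings.fifu

# Print the help text of the convert subcommand
$ finalfusion convert --help
~~~
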
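
To convert several fastText files in one go, the `convert` invocation shown above can be generated per file. The sketch below is a dry run that only prints the commands. `ff_convert_cmds` is a hypothetical helper, and deriving the output name by replacing the `.bin` suffix with `.fifu` is an assumption made for this example.

```shell
# Hypothetical helper: print one convert command per fastText input file.
# The output name replaces a trailing ".bin" with ".fifu".
ff_convert_cmds() {
  for input in "$@"; do
    output="${input%.bin}.fifu"
    printf 'finalfusion convert -f fasttext -t finalfusion %s %s\n' \
      "$input" "$output"
  done
}
```

Calling `ff_convert_cmds de.bin nl.bin` prints two commands, one per input file. File names containing spaces would need additional quoting before the printed commands are executed.
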
~~~shell
# Quantize finalfusion embeddings with product quantization (pq)
$ finalfusion quantize -f finalfusion -q pq -a 1 \
  embeddings.pq
~~~

~~~shell
# Query the 15 nearest neighbors of words
$ finalfusion similar -f finalfusion -k 15 \
  embeddings.fifu

# Query the 5 best answers to an analogy
$ finalfusion analogy -f finalfusion -k 5 \
  Berlin Deutschland Amsterdam embeddings.fifu
~~~

~~~shell
# Evaluate embeddings on an analogy dataset
$ finalfusion compute-accuracy embeddings.fifu \
  analogies.txt
~~~

~~~shell
# Extract metadata from embeddings.fifu
$ finalfusion metadata embeddings.fifu \
  metadata.txt
~~~

~~~shell
# Convert bucket-based embeddings to explicitly indexed embeddings
$ finalfusion bucket-to-explicit buckets.fifu \
  explicit.fifu
~~~

~~~shell
# Print a completion script for zsh
$ finalfusion completions zsh
~~~