[Homepage] [Document] [Examples]
Hora Search Everywhere!
Hora is an approximate nearest neighbor search algorithm library. All code is implemented in Rust 🦀 for reliability, high-level abstraction, and speed comparable to C++.
Hora, 「ほら」 in Japanese, sounds like [hōlə] and means "Wow", "You see!" or "Look at that!". The name is inspired by a famous Japanese song, 「小さな恋のうた」.
* Performant ⚡️
* Multiple Languages Support ☄️
  * Python
  * Javascript
  * Java
  * Go (WIP)
  * Ruby (WIP)
  * Swift (WIP)
  * R (WIP)
  * Julia (WIP)
* Multiple Indexes Support 🚀
* Portable 💼
  * no_std (WIP, partial)
  * Windows, Linux and OS X
  * iOS and Android (WIP)
  * No heavy dependencies, such as BLAS
* Reliability 🔒
  * Rust compiler secures all code
  * Rust for all language libs, such as the Python lib
* Multiple Distances Support 🧮
  * Dot Product Distance
  * Euclidean Distance
  * Manhattan Distance
  * Cosine Similarity
* Productive ✨
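To make the distance options listed above concrete, here is a minimal sketch of the four measures using only the Rust standard library. The function names are hypothetical and this is not Hora's internal implementation; Hora may differ in details such as sign conventions or whether the square root is taken.

```rust
// Illustrative textbook definitions only; names are hypothetical, not Hora's API.
// All functions assume `a` and `b` have the same length.

/// Dot product: sum of element-wise products.
fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Euclidean (L2) distance: square root of the summed squared differences.
fn euclidean(a: &[f32], b: &[f32]) -> f32 {
    a.iter()
        .zip(b)
        .map(|(x, y)| (x - y) * (x - y))
        .sum::<f32>()
        .sqrt()
}

/// Manhattan (L1) distance: sum of absolute differences.
fn manhattan(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y).abs()).sum()
}

/// Cosine similarity: dot product divided by the product of the norms.
/// Returns NaN if either vector has zero length.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let norm = |v: &[f32]| dot(v, v).sqrt();
    dot(a, b) / (norm(a) * norm(b))
}
```

For example, with `a = [1.0, 2.0, 3.0]` and `b = [4.0, 6.0, 3.0]`, the differences are 3, 4 and 0, so `euclidean(&a, &b)` is 5 and `manhattan(&a, &b)` is 7.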
Rust

in Cargo.toml

```toml
[dependencies]
hora = "0.1.0"
```

Python

```Bash
$ pip install hora
```

Building from source

```bash
$ git clone https://github.com/hora-search/hora
$ cargo build
```
Benchmarks were run on an AWS t2.medium instance (CPU: Intel(R) Xeon(R) CPU E5-2686 v4 @ 2.30GHz).
Rust
example [[more info](https://github.com/hora-search/hora/tree/main/examples)]
```Rust
use hora::core::ann_index::ANNIndex;
use rand::{thread_rng, Rng};
use rand_distr::{Distribution, Normal};

pub fn demo() {
    let n = 1000;
    let dimension = 64;

    // make sample points
    let mut samples = Vec::with_capacity(n);
    let normal = Normal::new(0.0, 10.0).unwrap();
    for _i in 0..n {
        let mut sample = Vec::with_capacity(dimension);
        for _j in 0..dimension {
            sample.push(normal.sample(&mut rand::thread_rng()));
        }
        samples.push(sample);
    }

    // init index
    let mut index = hora::index::hnsw_idx::HNSWIndex::<f32, usize>::new(
        dimension,
        &hora::index::hnsw_params::HNSWParams::<f32>::default(),
    );
    for (i, sample) in samples.iter().enumerate().take(n) {
        // add point
        index.add(sample, i).unwrap();
    }
    // build the index with the Euclidean metric
    index.build(hora::core::metrics::Metric::Euclidean).unwrap();

    // pick a random sample as the query
    let mut rng = thread_rng();
    let target: usize = rng.gen_range(0..n);
    // 523 has neighbors: [523, 762, 364, 268, 561, 231, 380, 817, 331, 246]
    println!(
        "{:?} has neighbors: {:?}",
        target,
        index.search(&samples[target], 10) // search for k nearest neighbors
    );
}
```
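Because HNSW is an approximate index, its results can differ from the true nearest neighbors. A minimal sketch of an exact baseline to compare against is shown below. The functions `brute_force_knn` and `recall` are hypothetical helpers written for illustration with the standard library only; they are not part of Hora's API.

```rust
/// Exact k-nearest-neighbor search by scanning every point (hypothetical helper).
/// Returns the indices of the `k` points closest to `query`, nearest first.
fn brute_force_knn(points: &[Vec<f32>], query: &[f32], k: usize) -> Vec<usize> {
    // Squared Euclidean distance preserves the ordering, so the sqrt is skipped.
    let mut scored: Vec<(f32, usize)> = points
        .iter()
        .enumerate()
        .map(|(i, p)| {
            let d: f32 = p.iter().zip(query).map(|(x, y)| (x - y) * (x - y)).sum();
            (d, i)
        })
        .collect();
    // Stable sort by distance; ties keep their original index order.
    scored.sort_by(|a, b| a.0.total_cmp(&b.0));
    scored.into_iter().take(k).map(|(_, i)| i).collect()
}

/// Fraction of the exact neighbors that also appear in the approximate result.
fn recall(exact: &[usize], approx: &[usize]) -> f32 {
    if exact.is_empty() {
        return 1.0;
    }
    let hits = exact.iter().filter(|i| approx.contains(*i)).count();
    hits as f32 / exact.len() as f32
}
```

One possible use is to run `brute_force_knn(&samples, &samples[target], 10)` next to the index search in the demo and pass both index lists to `recall` to estimate how many true neighbors the approximate search found.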
Python
example [[more info](https://github.com/hora-search/hora-python)]
```Python
import numpy as np
from hora import HNSWIndex

dimension = 50
n = 1000
index = HNSWIndex(dimension, "usize")  # init index instance
samples = np.float32(np.random.rand(n, dimension))
for i in range(0, len(samples)):
    index.add(np.float32(samples[i]), i)  # add node
index.build("euclidean")  # build index
target = np.random.randint(0, n)
print("{} has neighbors: {}".format(
    target, index.search(samples[target], 10)))  # search
```
Javascript
example [[more info](https://github.com/hora-search/hora-wasm)]
```JavaScript
const demo = () => {
    const dimension = 50;
    // `horawasm` is assumed to be the already-loaded hora-wasm module
    var bfidx = horawasm.BruteForceIndexUsize.new(dimension);
    for (var i = 0; i < 1000; i++) {
        var feature = [];
        for (var j = 0; j < dimension; j++) {
            feature.push(Math.random());
        }
        bfidx.add(feature, i); // add point
    }
    bfidx.build("euclidean"); // build index
    var feature = [];
    for (var j = 0; j < dimension; j++) {
        feature.push(Math.random());
    }
    // bf result Uint32Array(10) [704, 113, 358, 835, 408, 379, 117, 414, 808, 826]
    console.log("bf result", bfidx.search(feature, 10));
}
```
Java
examples [[more info](https://github.com/hora-search/hora-java)]
```Java
public void demo() {
    final int dimension = 2;
    final float variance = 2.0f;
    Random fRandom = new Random();

    BruteForceIndex bruteforce_idx = new BruteForceIndex(dimension); // init index instance

    List<float[]> tmp = new ArrayList<>();
    for (int i = 0; i < 5; i++) {
        for (int p = 0; p < 10; p++) {
            float[] features = new float[dimension];
            for (int j = 0; j < dimension; j++) {
                features[j] = getGaussian(fRandom, (float) (i * 10), variance);
            }
            bruteforce_idx.add("bf", features, i * 10 + p); // add point
            tmp.add(features);
        }
    }
    bruteforce_idx.build("bf", "euclidean"); // build index

    int search_index = fRandom.nextInt(tmp.size());
    // nearest neighbor search
    int[] result = bruteforce_idx.search("bf", 10, tmp.get(search_index));
    // [main] INFO com.hora.app.ANNIndexTest - demo bruteforce_idx[7, 8, 0, 5, 3, 9, 1, 6, 4, 2]
    log.info("demo bruteforce_idx" + Arrays.toString(result));
}

private static float getGaussian(Random fRandom, float aMean, float variance) {
    float r = (float) fRandom.nextGaussian();
    return aMean + r * variance;
}
```
R
mmap
* Hora's implementation is strongly inspired by these libraries.
* Faiss focuses more on the GPU scenario, and Hora is lighter than Faiss, with no heavy dependencies.
* Hora expects to support more languages, and everything related to performance shall be implemented in Rust 🦀.
* Annoy only supports the LSH (Random Projection) algorithm.
* ScaNN and Faiss are less user-friendly, for example lacking documentation.
* Milvus and Vald also support multiple languages, but serve as a service instead of a library.
* Milvus is built upon libraries such as Faiss, while Hora is an algorithm library with all algorithms implemented by itself.

We appreciate your help!
We are glad to have you participate. Any contributions are welcome, including documentation and tests.
You can open Pull Requests or Issues on GitHub, and we will review them as soon as possible.
We use GitHub issues for tracking suggestions and bugs.
```bash
git clone https://github.com/hora-search/hora
cargo build
cargo test --lib
```

```bash
cd examples
cargo run
```
The entire repo is under Apache License.