MinHash-rs


A Rust implementation of MinHash that aims to be parsimonious with memory.

What is MinHash?

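MinHash is a probabilistic data structure used to estimate the similarity between two sets. It is based on the observation that if we apply the same random hash function to every element of two sets, the probability that both sets produce the same minimum hash value is equal to their Jaccard similarity, that is, the size of their intersection divided by the size of their union. Repeating this with many independent hash functions and counting the fraction of matching minima yields an estimate of that similarity.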
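
To make the idea concrete, here is a minimal sketch of MinHash using only the Rust standard library. It is not this crate's API: the function names (`seeded_hash`, `signature`, `estimate_jaccard`, `exact_jaccard`) are hypothetical, and seeding `DefaultHasher` with an integer is just one simple way to simulate a family of hash functions. It also makes no attempt at the memory savings this crate aims for.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashSet;
use std::hash::{Hash, Hasher};

/// Hash `item` with the hash function identified by `seed`.
/// Assumption: feeding the seed into the hasher first is good enough
/// to simulate independent hash functions for illustration purposes.
fn seeded_hash<T: Hash>(item: &T, seed: u64) -> u64 {
    let mut hasher = DefaultHasher::new();
    seed.hash(&mut hasher);
    item.hash(&mut hasher);
    hasher.finish()
}

/// Build a MinHash signature: for each of the `k` seeded hash functions,
/// keep only the minimum hash value over all elements of the set.
/// An empty set yields `u64::MAX` in every position.
fn signature<T: Hash>(set: &HashSet<T>, k: usize) -> Vec<u64> {
    (0..k as u64)
        .map(|seed| {
            set.iter()
                .map(|item| seeded_hash(item, seed))
                .min()
                .unwrap_or(u64::MAX)
        })
        .collect()
}

/// Estimate the Jaccard similarity as the fraction of positions
/// where the two signatures hold the same minimum.
fn estimate_jaccard(a: &[u64], b: &[u64]) -> f64 {
    assert_eq!(a.len(), b.len(), "signatures must have the same length");
    let matches = a.iter().zip(b).filter(|(x, y)| x == y).count();
    matches as f64 / a.len() as f64
}

/// Exact Jaccard similarity, for comparison with the estimate.
fn exact_jaccard<T: Hash + Eq>(a: &HashSet<T>, b: &HashSet<T>) -> f64 {
    let intersection = a.intersection(b).count();
    let union = a.union(b).count();
    intersection as f64 / union as f64
}
```

Calling `signature` on two sets with the same `k` and passing both results to `estimate_jaccard` gives the estimate. Its standard error shrinks roughly as `1 / sqrt(k)`, so a larger `k` trades memory for accuracy.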