An implementation of the image matching algorithm described in this paper. The matching algorithm is designed to detect nearly identical images, not images with the same conceptual content.
By default, the library offers two primary functions: `get_buffer_signature(rgba, width)` and `cosine_similarity(a, b)`.
The former takes a pre-processed slice of `u8`s, with each chunk of four representing the 8-bit red, green, blue, and alpha values of a pixel; the latter takes two resulting signature vectors and computes their similarity. Per the source paper and our experiments in this research, images with a similarity greater than `0.6` can be considered likely matches.
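To make the buffer layout and the similarity cutoff concrete, here is a minimal, self-contained sketch. It does not call the library: `pack_rgba`, `rgba_height`, `cosine_similarity_sketch`, and `is_likely_match` are hypothetical helpers written for illustration, and the use of `i8` for signature elements is an assumption. Only the four-bytes-per-pixel layout and the `0.6` cutoff come from the text above.

```rust
/// Flattens pixels into the rgba layout described above:
/// four consecutive bytes (red, green, blue, alpha) per pixel.
/// Hypothetical helper, not part of the library.
fn pack_rgba(pixels: &[[u8; 4]]) -> Vec<u8> {
    pixels.iter().flat_map(|p| p.iter().copied()).collect()
}

/// Derives the image height implied by a buffer and its width in pixels.
/// Returns None if the buffer does not consist of whole rows.
fn rgba_height(rgba: &[u8], width: usize) -> Option<usize> {
    let row_bytes = width.checked_mul(4)?;
    if row_bytes == 0 || rgba.len() % row_bytes != 0 {
        return None;
    }
    Some(rgba.len() / row_bytes)
}

/// Plain cosine similarity between two equal-length signature vectors.
/// Assumption: signature elements fit in an i8. Returns 0.0 when the
/// lengths differ or either vector has zero magnitude.
fn cosine_similarity_sketch(a: &[i8], b: &[i8]) -> f64 {
    if a.len() != b.len() {
        return 0.0;
    }
    let (mut dot, mut norm_a, mut norm_b) = (0.0_f64, 0.0_f64, 0.0_f64);
    for (&x, &y) in a.iter().zip(b.iter()) {
        let (x, y) = (f64::from(x), f64::from(y));
        dot += x * y;
        norm_a += x * x;
        norm_b += y * y;
    }
    if norm_a == 0.0 || norm_b == 0.0 {
        return 0.0;
    }
    dot / (norm_a.sqrt() * norm_b.sqrt())
}

/// Cutoff taken from the text above: similarity greater than 0.6.
const LIKELY_MATCH_CUTOFF: f64 = 0.6;

/// Applies the suggested cutoff to two signatures.
fn is_likely_match(a: &[i8], b: &[i8]) -> bool {
    cosine_similarity_sketch(a, b) > LIKELY_MATCH_CUTOFF
}
```

With the real library, the two signatures would come from `get_buffer_signature` and the score from `cosine_similarity`; the sketch only mirrors the arithmetic and the threshold check.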
If the `img` feature is used, `get_image_signature(image)` is also provided; it uses the `image` library to handle unpacking the image into an rgba buffer. Both signature functions also expose `tuned` versions, which allow tweaking the crop percentage used during the signature computation as well as the size of the collection grid, which controls the length of the feature vector produced.
[-2, 2]. It will likely require experimentation to find a new suggested vector similarity cutoff.
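
As a rough illustration of how the collection grid size drives the length of the feature vector, the sketch below assumes a scheme in which every grid point is compared against its eight neighbours, so the vector holds eight values per point. This layout is an assumption made for illustration, and `signature_len_sketch` is a hypothetical helper; the library's actual vector length for a given grid size may differ.

```rust
/// Assumption: each grid point contributes one value per neighbour,
/// and every point has eight neighbours.
const NEIGHBOURS_PER_POINT: usize = 8;

/// Length of the feature vector for a square grid of `grid_size` points
/// per side, under the assumption stated above.
fn signature_len_sketch(grid_size: usize) -> usize {
    grid_size * grid_size * NEIGHBOURS_PER_POINT
}
```

Under this assumption the length grows quadratically with the grid size, which is one reason a cutoff tuned for one grid size may not carry over to another.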