Algorithm in Elasticsearch for similarity distances between two floating-point vectors

Before I start: I seriously have no idea what Elastic is or how it works. I am an ML engineer and was recently assigned an image-to-image similarity task. I developed a model that produces a vector for each image, and I compare those vectors using distances such as Manhattan, Euclidean, cosine, etc. That part was easy. The problem is that I have a huge dataset of around 20M images. I cannot store all the vectors in memory to begin with, and even if I could, comparing a query against every stored vector is impractical: that is O(N) comparisons per query, with N in the tens of millions.

So I want to know whether there are any algorithms for vector search, just as there are text search algorithms like Okapi BM25.

My vectors look like:
[0.2,0.1,0.04,......] etc. They can have any number of dimensions, depending on the use case. Is there an algorithm that can return the top-k search results?

Elasticsearch provides dense_vector type for storing vectors, and vector functions to find the most similar vectors.
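
As a rough sketch of what that could look like, the Python below only builds the two JSON request bodies: an index mapping with a dense_vector field, and a script_score query that ranks documents by cosine similarity. The field name "image_vector", the 4-dimensional size, and the helper function names are placeholders I made up for illustration. The exact script syntax has differed between Elasticsearch versions (some older releases expect doc['field'] instead of the quoted field name), so please check the docs for the version you run.

```python
import json


def dense_vector_mapping(field, dims):
    """Build an index mapping with a single dense_vector field.

    `field` and `dims` are illustrative placeholders, not required names.
    """
    return {
        "mappings": {
            "properties": {
                field: {"type": "dense_vector", "dims": dims}
            }
        }
    }


def cosine_top_k_query(field, query_vector, k):
    """Build a script_score query returning the k most similar documents.

    1.0 is added because Elasticsearch does not allow negative scores,
    while cosine similarity ranges from -1 to 1.
    """
    return {
        "size": k,
        "query": {
            "script_score": {
                "query": {"match_all": {}},
                "script": {
                    # Assumed syntax; varies by Elasticsearch version.
                    "source": f"cosineSimilarity(params.query_vector, '{field}') + 1.0",
                    "params": {"query_vector": query_vector},
                },
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(dense_vector_mapping("image_vector", 4), indent=2))
    print(json.dumps(cosine_top_k_query("image_vector", [0.2, 0.1, 0.04, 0.0], 10), indent=2))
```

The first body would be sent when creating the index and the second to the index's _search endpoint, using whatever HTTP client or Elasticsearch client you prefer.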

But be aware that the current search goes sequentially through all documents, and can be slow depending on the size of your collection and the number of vector dimensions.
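
To make the cost of a sequential scan concrete, here is a small pure-Python illustration of a brute-force top-k search. It is not Elasticsearch's implementation, only a sketch of the same idea: every stored vector is scored against the query, so the work per query grows linearly with both the number of documents and the number of dimensions. The function names and the toy data are made up for the example.

```python
import heapq
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query, vectors, k):
    """Score every stored vector against the query and keep the k best.

    `vectors` maps a document id to its vector. The generator visits each
    entry exactly once, which is the sequential scan; heapq.nlargest keeps
    only k candidates in memory at a time.
    """
    scored = ((cosine(query, vec), doc_id) for doc_id, vec in vectors.items())
    return heapq.nlargest(k, scored)


if __name__ == "__main__":
    toy_vectors = {
        "img_a": [1.0, 0.0],
        "img_b": [0.0, 1.0],
        "img_c": [1.0, 1.0],
    }
    for score, doc_id in top_k([1.0, 0.0], toy_vectors, 2):
        print(doc_id, round(score, 3))
```

With three toy vectors this is instant, but the same loop over millions of high-dimensional vectors has to run in full for every query, which is why the scan can be slow on large collections.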
