I want to implement vector search in my application, and I am trying to understand the differences between a KnnSearch and the script_score query. So far I only see downsides to the KnnSearch:
KnnSearch is limited to a maximum of 10,000 results (the k param); script_score is not.
KnnSearch returns different results on almost every call. The difference is not large, but with ~6,000 matches the count varies between 5,900 and 6,100.
KnnSearch should be faster since it is approximate, but in my tests I never noticed a significant difference. Both queries execute in at most 500 ms (currently with 51,000,000 documents that contain a vector).
The documentation says that script_score is a brute-force way to execute a similar search and that it should be significantly slower, but I don't really see a big difference.
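For reference, the two request shapes being compared could look roughly like the sketch below, written as Python dictionaries that would be sent as the body of a _search request. The field name my_vector, the toy three-dimensional query vector and the k / num_candidates values are assumptions made up for illustration, not taken from my real mapping.

```python
# Hypothetical field name and a toy query vector (assumptions for illustration).
VECTOR_FIELD = "my_vector"
query_vector = [0.1, 0.2, 0.3]

# Approximate search: the top-level `knn` option of the _search API (HNSW-based).
knn_body = {
    "knn": {
        "field": VECTOR_FIELD,
        "query_vector": query_vector,
        "k": 10,                # number of nearest neighbours to return
        "num_candidates": 100,  # candidates considered per shard
    }
}

# Exact search: script_score scores every document matched by the inner query.
script_score_body = {
    "query": {
        "script_score": {
            "query": {"match_all": {}},
            "script": {
                "source": f"cosineSimilarity(params.query_vector, '{VECTOR_FIELD}') + 1.0",
                "params": {"query_vector": query_vector},
            },
        }
    }
}
```

The inner query of script_score (match_all here) decides how many vectors are actually scored, so a restrictive filter there makes the exact search much cheaper than a scan over the whole index.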
Yes, the main difference is that one is brute force and the other is approximate, using HNSW. Brute force requires evaluating the similarity for every vector in the data set. It depends on your use case, but I would also say that while the two appear to take a similar time now, that may not be the case in production with parallel queries.
Because it is approximate, it is not always guaranteed to find the exact nearest neighbours, or to return them in the same order. Its accuracy will not be equal to that of brute force.
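To make the brute-force side concrete, here is a minimal sketch of an exact scan in plain Python. It is not how Elasticsearch implements script_score internally; the function names, the random data and the sizes are all made up for illustration. The point is only that every stored vector gets scored once per query, so the work grows linearly with the number of vectors.

```python
import heapq
import math
import random


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def exact_top_k(vectors, query, k):
    """Score every vector against the query and keep the k best (index, score) pairs."""
    scored = ((cosine(v, query), i) for i, v in enumerate(vectors))
    return [(i, s) for s, i in heapq.nlargest(k, scored)]


# Toy data set: 2,000 random 32-dimensional vectors (hypothetical sizes).
random.seed(0)
vectors = [[random.gauss(0, 1) for _ in range(32)] for _ in range(2000)]

# A query that is a slightly perturbed copy of vector 42.
query = [x + 0.01 * random.gauss(0, 1) for x in vectors[42]]

top = exact_top_k(vectors, query, k=5)
```

Because nothing is skipped, the result is exact and identical on every call, which is the behaviour you are seeing from script_score. An HNSW search instead visits only part of a graph, which is where both the speed-up and the varying results come from.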
I would also mention that as the number of vectors scales, you may consider storing the vectors quantised, or reducing their dimensionality, at the cost of some accuracy, and still using brute force. That can reduce the memory footprint enough for brute force to remain usable.
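As a rough idea of what quantisation means here, the sketch below does a simple symmetric int8 scalar quantisation in plain Python. This is a generic illustration, not the scheme Elasticsearch uses, and the function names and sample vector are made up. Each float is replaced by an integer in [-127, 127] plus one shared scale, so a value that took 4 bytes as a float32 fits in 1 byte.

```python
def quantise_int8(vector):
    """Map floats to integers in [-127, 127]; return the codes and the scale used."""
    max_abs = max(abs(x) for x in vector) or 1.0  # avoid a zero scale for all-zero vectors
    scale = max_abs / 127.0
    return [round(x / scale) for x in vector], scale


def dequantise(codes, scale):
    """Approximate reconstruction of the original floats."""
    return [c * scale for c in codes]


# Hypothetical example vector.
vector = [0.12, -0.98, 0.45, 0.0, 0.77]
codes, scale = quantise_int8(vector)
restored = dequantise(codes, scale)

# The rounding error per component is at most half a quantisation step.
max_error = max(abs(a - b) for a, b in zip(vector, restored))
```

The reconstruction error is the accuracy you give up: similarities computed on the quantised values are slightly off, which can reorder neighbours that are very close to each other.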
At what number of vectors do you expect a real difference in execution time? Or does the rate of queries matter? For now, when executing the query in Kibana, there is no difference.
And my apologies, but I still don't really understand why I should not always use script_score. As far as I can see it is much better.
It depends on the dims of your vectors, the number of queries you're serving per second and the memory allocated on your node. Executing a single query and comparing the two isn't a fair assessment.
For example, in this article you can see that the number of requests per second differs greatly between approx and exact. We have made improvements to the performance of KNN since then as well, so the difference may be even greater now.
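A back-of-envelope sketch of why dims, query rate and memory all matter for brute force is below. The 51,000,000 figure is from your post; the 768 dims, the float32 storage and the 20 queries per second are assumptions picked only for illustration, and the function name is made up. It also assumes the inner query matches every document, which is the worst case for script_score.

```python
def brute_force_cost(num_vectors, dims, bytes_per_value=4, queries_per_second=1):
    """Rough cost of an exact scan that scores every stored vector per query."""
    raw_bytes = num_vectors * dims * bytes_per_value
    multiply_adds_per_query = num_vectors * dims
    return {
        "raw_gib": raw_bytes / 2**30,
        "multiply_adds_per_query": multiply_adds_per_query,
        "multiply_adds_per_second": multiply_adds_per_query * queries_per_second,
    }


# 51,000,000 vectors from the post; dims and query rate are assumed values.
cost = brute_force_cost(num_vectors=51_000_000, dims=768, queries_per_second=20)
```

Every extra query per second multiplies the scan work, while an approximate search touches only a small part of the index per query. That is why a single query in Kibana looks the same for both, and a loaded production cluster may not.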