Slow aKNN search

Answering here, since I think this is more relevant to this thread than to the other one: https://discuss.elastic.co/t/profiling-knn-search/327065/5

Just to quote Ben's answer in the other thread:

KNN spends most of its time in the rewrite. So, that is indeed the hot spot. It's where the KNN search occurs.
We do segment searches serially. So, comparing your two open KNN search tickets this is what I think is happening.
You are on a single node with a single shard. That single shard has 49 segments, each seems to be an OK size (at least a GB or so).
But, this then means, on a single node, you are exploring 49 different HNSW graphs.
In the future, we want to make KNN work in parallel on the same shard but with different segments, but right now, that doesn't happen.
I think you should try force-merging your test node to fewer segments. It doesn't have to be 1. 1 would be best, but it could take a while to complete.

I finally managed to run a force merge on the test server (a copy of the production ES server, 11 million documents, 49 segments) and merged the index down to 1 segment. It took about 3 days to run (I ran it asynchronously), but the results were astounding!
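For anyone who wants to try the same thing, a force merge like the one described above can be requested roughly as follows. This is only a sketch using the Python standard library: the host `http://localhost:9200`, the index name `my-vectors`, and the helper `build_forcemerge_request` are placeholders of mine, and I am assuming the Elasticsearch version in use accepts `wait_for_completion=false` on `_forcemerge` (check the docs for your version).

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_forcemerge_request(base_url, index, max_num_segments=1,
                             wait_for_completion=None):
    """Build (but do not send) a POST request for the _forcemerge API.

    base_url and index are placeholders; wait_for_completion is only added
    to the query string when given, since its availability is an assumption.
    """
    params = {"max_num_segments": max_num_segments}
    if wait_for_completion is not None:
        params["wait_for_completion"] = "true" if wait_for_completion else "false"
    url = f"{base_url.rstrip('/')}/{index}/_forcemerge?{urlencode(params)}"
    return Request(url, method="POST")


if __name__ == "__main__":
    req = build_forcemerge_request(
        "http://localhost:9200", "my-vectors",
        max_num_segments=1, wait_for_completion=False,
    )
    # Only prints the request; sending it would be urllib.request.urlopen(req).
    print(req.get_method(), req.full_url)
```

As Ben notes, the target does not have to be 1; passing a larger `max_num_segments` should finish sooner at the cost of more graphs to search.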

On average, search got close to an order of magnitude faster: from about 30 seconds to 2-3 seconds.
So yes, it would be terrific if ES could parallelize segment search.

Do you think it makes sense to force merge the production index to 1 segment?
While the merge runs, we can fall back to our old full-text TF-IDF search index, which we still keep just in case, for three days over the weekend when the load is low, and then switch back once the dense vector index is merged.
But will the number of segments eventually grow again as new documents are indexed or old ones are deleted?
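One way to watch for this after the merge would be to poll `GET /<index>/_segments` and count the segments per shard. Below is a minimal sketch of the counting part only. The response layout it expects (`indices` → `shards` → list of shard copies with `routing.primary` and `num_search_segments`) is my understanding of that API, and the helper name and the sample data are made up for illustration.

```python
def count_search_segments(segments_response):
    """Return {index_name: {shard_id: segment_count}} for primary shard copies.

    segments_response is the parsed JSON body of GET /<index>/_segments
    (layout assumed as described in the text above).
    """
    counts = {}
    for index_name, index_data in segments_response.get("indices", {}).items():
        per_shard = {}
        for shard_id, copies in index_data.get("shards", {}).items():
            for copy in copies:
                if copy.get("routing", {}).get("primary"):
                    # Fall back to counting the listed segments if the
                    # summary field is missing.
                    per_shard[shard_id] = copy.get(
                        "num_search_segments", len(copy.get("segments", {}))
                    )
        counts[index_name] = per_shard
    return counts


if __name__ == "__main__":
    # Hypothetical, heavily trimmed response for a single-shard index.
    sample = {
        "indices": {
            "my-vectors": {
                "shards": {
                    "0": [
                        {
                            "routing": {"state": "STARTED", "primary": True},
                            "num_search_segments": 3,
                            "segments": {"_a": {}, "_b": {}, "_c": {}},
                        }
                    ]
                }
            }
        }
    }
    print(count_search_segments(sample))
```

If the count creeps back up over time, that would be the signal to schedule another merge.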