How can we know the time taken to merge search results returned by all shards?
I have enabled the search.slowlog.threshold.query.debug setting, but this gives shard-level timings only. Even setting profile: true in the query returns only shard-level timings.
My questions:
[1] Can we relate the query, rewrite and collector times to the merge time? If so, how are they related?
[2] How can we calculate the merge time, either exactly or as a close approximation?
Now, to your question: merge time isn't currently profiled. The merge essentially just inserts each shard's hits into a priority queue of size n, using the shard-local score as the sort value. I don't quite remember how ES implements it, but priority queues are usually implemented as heaps, so insertion and removal are probably O(log n).
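To make the priority-queue idea concrete, here is a minimal sketch in Python. It is not how ES implements the merge; the function name merge_top_hits and the (score, doc_id) tuple layout are made up for illustration. It only shows why each hit costs roughly O(log n) when the queue is a heap.

```python
import heapq
from typing import Iterable, List, Tuple

# Hypothetical hit layout for this sketch: (shard-local score, document id).
Hit = Tuple[float, str]


def merge_top_hits(shard_hits: Iterable[List[Hit]], n: int) -> List[Hit]:
    """Return the n highest-scoring hits across all shards, best first."""
    if n <= 0:
        return []
    heap: List[Hit] = []  # min-heap: heap[0] is the weakest hit kept so far
    for hits in shard_hits:
        for hit in hits:
            if len(heap) < n:
                heapq.heappush(heap, hit)      # O(log n) insertion
            elif hit > heap[0]:
                heapq.heapreplace(heap, hit)   # drop the weakest, add the new hit
    return sorted(heap, reverse=True)


# Illustrative data only: three shards, each returning its local top hits.
shards = [
    [(3.2, "a"), (1.1, "b")],
    [(2.7, "c"), (2.5, "d")],
    [(0.9, "e")],
]
top3 = merge_top_hits(shards, 3)
```

With s shards each returning up to n hits, this does at most s * n heap operations on a heap of size n, which is small next to scoring documents on each shard.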
I would not expect the merge time of hits to be a significant contributor to latency, especially relative to actually scoring documents and touching the disk. But it'd be nice if we could add it to the profiler eventually.
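Until the profiler reports it, one rough way to put a ceiling on the merge time is to subtract the slowest shard's profiled time from the response's took value. The sketch below assumes the profile response layout with took in milliseconds and time_in_nanos / rewrite_time fields in nanoseconds (older versions report times differently), and it assumes the query, rewrite and collector figures can simply be added per shard, which may overcount. The helper names are hypothetical. Whatever is left over also includes network time, queueing and the fetch phase, so treat it as a loose upper bound on the merge, not a measurement of it.

```python
from typing import Any, Dict


def shard_query_phase_nanos(shard: Dict[str, Any]) -> int:
    """Sum the top-level query, rewrite and collector times profiled for one shard."""
    total = 0
    for search in shard.get("searches", []):
        # Top-level query nodes only; child times are already included in them.
        total += sum(q["time_in_nanos"] for q in search.get("query", []))
        total += search.get("rewrite_time", 0)
        total += sum(c["time_in_nanos"] for c in search.get("collector", []))
    return total


def non_shard_overhead_ms(response: Dict[str, Any]) -> float:
    """Loose upper bound (ms) on time spent outside the slowest shard's query phase."""
    shards = response["profile"]["shards"]
    slowest_nanos = max((shard_query_phase_nanos(s) for s in shards), default=0)
    # Clamp at zero: profiling overhead can push shard time above took.
    return max(0.0, response["took"] - slowest_nanos / 1_000_000)


# Hand-written example response, not real output from a cluster.
example = {
    "took": 25,
    "profile": {
        "shards": [
            {"searches": [{
                "query": [{"time_in_nanos": 8_000_000}],
                "rewrite_time": 500_000,
                "collector": [{"time_in_nanos": 1_500_000}],
            }]},
            {"searches": [{
                "query": [{"time_in_nanos": 4_000_000}],
                "rewrite_time": 0,
                "collector": [{"time_in_nanos": 1_000_000}],
            }]},
        ]
    },
}
overhead = non_shard_overhead_ms(example)
```

This does not relate query, rewrite or collector time to merge time directly. Those are measured on the shards, while the merge happens afterwards on the coordinating node.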