Significance of the two phases ("query then fetch") when the default number of shards is 1

Is there any significance to executing a search in two phases ("query then fetch") when the default number of shards is 1 (starting from 7.x)? I am leaving replicas out of consideration.

That is probably the only scenario where executing the query in two phases may not bring much benefit. I believe this is quite a rare scenario, and one that generally performs quite well anyway, so I don't think the two phases add much overhead either. Optimizing this would therefore likely bring very little benefit but make the code more complex and difficult to maintain, which IMHO seems like a bad tradeoff.
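To make the two phases concrete, here is a minimal in-memory sketch in Python. It is only an illustration, not Elasticsearch's actual implementation: the `Shard` type and the function names `query_phase`, `fetch_phase` and `query_then_fetch` are hypothetical, and scoring is assumed to be precomputed. With a single shard the merge step on the coordinating node has nothing to combine, which is the case being discussed.

```python
import heapq
from typing import Dict, List, Tuple

# Hypothetical in-memory "shard": doc id -> (precomputed score, stored document).
Shard = Dict[str, Tuple[float, dict]]
Hit = Tuple[float, int, str]  # (score, shard id, doc id)


def query_phase(shard_id: int, shard: Shard, size: int) -> List[Hit]:
    """Phase 1: a shard returns only lightweight hits for its local top `size`."""
    hits = [(score, shard_id, doc_id) for doc_id, (score, _) in shard.items()]
    return heapq.nlargest(size, hits)


def fetch_phase(shards: List[Shard], top: List[Hit]) -> List[dict]:
    """Phase 2: load full documents only for the globally selected hits."""
    return [shards[shard_id][doc_id][1] for _, shard_id, doc_id in top]


def query_then_fetch(shards: List[Shard], size: int) -> List[dict]:
    """Coordinator: gather per-shard hits, merge them, then fetch the winners."""
    candidates: List[Hit] = []
    for shard_id, shard in enumerate(shards):
        candidates.extend(query_phase(shard_id, shard, size))
    top = heapq.nlargest(size, candidates)  # trivial merge when there is one shard
    return fetch_phase(shards, top)
```

Calling `query_then_fetch([shard], 10)` with a single shard runs exactly the same code path as with several shards; the only step that becomes redundant is the merge.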

Well, in 7.x the default number of shards is 1, and this decision was made because the majority of deployments were small or ran with defaults. On the same note, doesn't it make sense to disable the strategy of doing aggregation in multiple places (at least once on the data node and once on the coordinating node) in the default scenario?

I do not think you would gain much, as the same amount of work still needs to be done, so it would add complexity for virtually no gain. If you have a small cluster, it is also likely that the node serving the request also holds the shard, which means there is not even a network hop to avoid in most cases, especially if you also use a suitable preference setting.
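As a rough illustration of the preference setting mentioned above, the sketch below builds a search URL that uses the `preference` and `search_type` query parameters of the search API. The host `http://localhost:9200`, the index name `my-index` and the helper `build_search_url` are assumptions made for the example, and `_local` is only one possible preference value; whether it is suitable depends on the cluster.

```python
from urllib.parse import urlencode


def build_search_url(
    base_url: str,
    index: str,
    preference: str = "_local",
    search_type: str = "query_then_fetch",
) -> str:
    """Build a _search URL that prefers shard copies on the receiving node.

    `base_url` and `index` are placeholders; no request is sent here.
    """
    params = urlencode({"preference": preference, "search_type": search_type})
    return f"{base_url.rstrip('/')}/{index}/_search?{params}"


# Hypothetical host and index name, for illustration only.
url = build_search_url("http://localhost:9200", "my-index")
print(url)
```

The URL can then be passed to any HTTP client together with the query body.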

I do not think this rather theoretical discussion about a very rare case, which is usually very fast anyway, is very useful, so I will leave it there.
