Hello,
If possible, I'd like to learn more specifically how shard ranking is calculated for a search.
The current documentation for Adaptive Replica Selection says
By default, Elasticsearch uses adaptive replica selection to route search requests. This method selects an eligible node using shard allocation awareness and the following criteria:
- Response time of prior requests between the coordinating node and the eligible node
- How long the eligible node took to run previous searches
- Queue size of the eligible node’s search threadpool
Do the first and second items have some overlap? Or is the first item the communication latency of the inter-node request and the second item a metric provided by the data node for the duration of search? Or are they something else that could be more precisely expressed? (Also, as stated, the first item seems to include latency of indexing requests, too. Is that the case?)
Are the first two items scoped down at all? That is, does the coordinating node have those criteria on a shard-by-shard or index-by-index basis, or is it for all requests sent to the candidate nodes?
If a coordinating node has determined that there are 8 participating shards for a given search, as it iterates through the shards determining which shard copy of each should be searched, does the selection for the first 7 influence the choice for the 8th (by, say, incrementing the apparent queue size for the destination nodes)? Or maybe it doesn't have to because those searches have been dispatched already and the search queue size will already reflect the new state?
Thanks in advance for any insight you can provide.