Hi Folks,
I need some help understanding Query phase of distributed search in ES better, step 2 specifically.
Node 3
forwards the search request to a primary or replica copy of every shard in the index. Each shard executes the query locally and adds the results into a local sorted priority queue of sizefrom + size
.
I understand that each Node in the cluster has task specific thread pools to perform a task. Coordinating node accepts client request with one thread from search thread-pool. Similarly, when one of the data node receives a search request from coordinating node, one thread from it's search thread pool is utilized.
What does it mean each Shards executes the query locally mean? Does each shard have their own threads to perform the search? Currently, I imagine Shards as a block of storage space within it are Segments. The thread(T1) selected to handle the search request in the data node performs the search concurrently based on the max_concurrent_shard_requests
parameter. After multiple concurrent searches, T1 aggregates the result in a priority queue and sends it back to coordinating node.
I think I don't fully understand the connection between a Shard and the CPUs. If there is any, please help me understand. Also, please correct me if I misunderstood any other part in my explanation.
Thanks much