One query thread per Shard?

When a query hits a shard, is it true that it is handled by a single thread?
If so, is it worth assigning as many shards as there are CPU cores on a node, to increase concurrency and reduce overall query execution time?
Is this kind of optimisation a best practice?

Yes, the processing of a query is single threaded against each shard. Multiple queries can however be run concurrently against the same shard, so assuming you have more than one concurrent query, you can still use multiple cores. The ideal shard count per node therefore depends on the use case, so I would recommend benchmarking it to see what is right for your use case.
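To make that concrete, here is a rough sketch of the model in Python. It is a simplification, not how Elasticsearch actually schedules work: the function name, the assumption that every query touches every shard on the node, and the 8-thread figure are all made up for illustration.

```python
def busy_search_threads(concurrent_queries, shards_per_node, search_threads):
    """Upper bound on search threads in use on one node.

    Simplified model: each query touches every shard on the node, and
    each shard-level request occupies exactly one thread while it runs.
    """
    shard_requests = concurrent_queries * shards_per_node
    return min(shard_requests, search_threads)


# Hypothetical node with 8 search threads.
for queries, shards in [(1, 1), (1, 4), (8, 1), (8, 4)]:
    busy = busy_search_threads(queries, shards, 8)
    print(f"{queries} queries x {shards} shards -> {busy} threads busy")
```

With a single query and a single shard only one thread is busy, while either more shards or more concurrent queries can fill the remaining threads. This is why the answer depends on how concurrent your workload already is.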

Ok. Thanks.

As you said, whether or not it is a good idea depends on the use case.
The metrics use case is all about aggregations and dashboards showing multiple facets of the system. Dashboards are likely to send many queries at the same time, hence achieving some concurrency on their own.

On the other hand, the logs use case is less concurrent, as it often consists of a single query per user at a time. In this case concurrency is more a matter of how many concurrent users we have. Increasing the number of shards so that each node is allocated more than one may bring some benefits.

For metrics and logging use cases you generally end up using time-based indices, which means that the shard count is almost always considerably larger than the number of cores on a server. As each shard comes with some overhead in terms of heap usage and file handles, having too many shards can also cause problems and be inefficient.
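A back-of-the-envelope sketch of how quickly the shard count grows with daily indices. The retention period, shard settings and number of index families below are hypothetical values chosen for illustration, not a recommendation.

```python
def total_shard_copies(days_retained, primaries_per_index, replicas, index_families=1):
    """Number of shard copies (primaries plus replicas) held by the cluster
    when each index family creates one index per day."""
    indices = days_retained * index_families
    return indices * primaries_per_index * (1 + replicas)


# Hypothetical: 30 days of retention, 2 index families (e.g. metrics and logs),
# 3 primary shards and 1 replica per daily index.
print(total_shard_copies(30, 3, 1, index_families=2))  # 360
```

Even with modest per-index settings, the total is far above the core count of a typical server, so the per-shard overhead adds up.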

Indeed. We currently have one index per day, each configured with 3 shards and 1 replica.
The metrics indexes have about 240 million documents for a total size (including replicas) of 45 GB.
The logs indexes have about 90 million documents for a total size (including replicas) of 25 GB.
The ES nodes have 32 GB of RAM, of which 15 GB is dedicated to the ES heap.
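For reference, a quick calculation of what those numbers mean per shard. This assumes the sizes quoted above are per daily index and that data is spread evenly across shards; the helper name is just for illustration.

```python
def size_per_shard_gb(total_gb_with_replicas, primaries, replicas):
    """Average size of one shard copy, assuming an even spread of data."""
    shard_copies = primaries * (1 + replicas)
    return total_gb_with_replicas / shard_copies


# 3 primary shards and 1 replica per daily index, as described above.
metrics = size_per_shard_gb(45, 3, 1)
logs = size_per_shard_gb(25, 3, 1)
print(f"metrics: {metrics:.2f} GB per shard, logs: {logs:.2f} GB per shard")
# metrics: 7.50 GB per shard, logs: 4.17 GB per shard
```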

Things are usually OK for metrics, since queries often address a relatively small timeframe, from one hour to a couple of hours. As we said earlier, each query is likely to be executed by a single thread on each node. However, there are many requests in parallel, hence achieving decent concurrency and overall response time.

However, log analytics is sometimes a bit heavier, mainly because of users building crazy queries. If the time period is less than a day, we could benefit from having more than one shard per node.

Or is it because my shards are becoming too large?

Shard size does affect query speed, which is why it is generally recommended to benchmark in order to find the ideal shard size as described in this video.
