Number of concurrent search requests on ElasticSearch cluster

My objective is to calculate two things:
Q1: How many search requests can an Elasticsearch cluster work on at once?
Q2: How many search requests can an Elasticsearch cluster hold in its queue before it starts rejecting them (producing 429 - Too Many Requests)?

Say the Elasticsearch cluster (v6.8) has

  • 8 data nodes of r5.xlarge instance, i.e. 4 vCPUs each
  • 3 master nodes of m5.xlarge.search, i.e. 4 vCPUs each
  • 12 indices
  • each index has 16 primary and 16 replica shards

Please help with an approach to answering the above questions given this cluster configuration. (Assume one search request hits all indices and shards.)


My attempt at calculating it:

In the Elasticsearch documentation it is mentioned:

For count/search/suggest operations. Thread pool type is fixed with a size of int(( # of allocated processors * 3) / 2) + 1, and queue_size of 1000.

So, as I understand it, each data node has:

  • a search thread pool of int((4 * 3) / 2) + 1 = 7 threads, since it has 4 processors.
  • a queue size of 1000 for that pool.
  • This means the number of search threads available in the entire cluster = 8 (# data nodes) * 7 (# threads per node) = 56
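
The thread pool arithmetic above can be sketched in a few lines of Python. This only reproduces the formula quoted from the documentation; the function and constant names are made up for illustration, and it assumes each data node reports all 4 vCPUs as allocated processors.

```python
def search_pool_size(allocated_processors: int) -> int:
    """Search thread pool size per the quoted formula:
    int((# of allocated processors * 3) / 2) + 1."""
    return int((allocated_processors * 3) / 2) + 1


# Cluster figures taken from the question (assumed, not measured).
DATA_NODES = 8
VCPUS_PER_DATA_NODE = 4

threads_per_node = search_pool_size(VCPUS_PER_DATA_NODE)
cluster_search_threads = DATA_NODES * threads_per_node

print("search threads per node:", threads_per_node)
print("search threads in cluster:", cluster_search_threads)
```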

Assuming one search request hits all indices in the worst case:

  • Each request will produce (number of indices * number of primary shards per index) shard-level requests in the search queue = 12 * 16 = 192
  1. So the number of requests that can be worked on at once = 56/192? (which looks wrong)
  2. Number of searches it can hold in the queue before rejecting them =
    (num nodes * queue size per node)/(num of sub-requests each request produces) = (8 * 1000)/192 ~ 41 requests (which seems quite low).
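
As a rough back-of-the-envelope sketch of the two estimates above: the Python below assumes shard-level requests spread evenly across the 8 data nodes, that each search touches one copy of every shard, and that nothing else is using the search pool. All names are illustrative, and real behaviour will depend on shard placement and query cost.

```python
import math

# Figures from the question (assumptions, not measurements).
DATA_NODES = 8
SEARCH_THREADS_PER_NODE = 7
QUEUE_SIZE_PER_NODE = 1000
INDICES = 12
PRIMARY_SHARDS_PER_INDEX = 16

# One search over all indices fans out to one request per shard.
shard_requests_per_search = INDICES * PRIMARY_SHARDS_PER_INDEX

cluster_threads = DATA_NODES * SEARCH_THREADS_PER_NODE
cluster_queue_slots = DATA_NODES * QUEUE_SIZE_PER_NODE

# Q1: a single search already needs more threads than the cluster has,
# so the ratio is below 1 and the search runs in several "waves".
searches_running_at_once = cluster_threads / shard_requests_per_search
waves_per_search = math.ceil(shard_requests_per_search / cluster_threads)

# Q2: whole searches that fit into the queues before rejections start.
searches_queueable = cluster_queue_slots // shard_requests_per_search

print("shard requests per search:", shard_requests_per_search)
print("searches running at once: %.2f" % searches_running_at_once)
print("waves needed per search:", waves_per_search)
print("searches that fit in the queues:", searches_queueable)
```

A ratio below 1 for Q1 is not an error in the arithmetic: under these assumptions it simply means one full fan-out search cannot have all of its shard requests executing at the same moment.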

Elasticsearch 6.8 is very old and has been EOL for years. I would recommend upgrading to a newer version.

How large are these indices?

What is the average shard size?

Does each query target a single index? If not, how many indices does a query target on average?

Sounds about right. Each query will result in a lot of shards needing to be searched, which will fill up the queues. If you want to optimise for the number of parallel queries that can be handled, you generally want to query as few shards as possible and potentially increase the number of CPU cores to handle more work in parallel. Increasing CPU naturally assumes you have enough disk I/O performance for this not to be a bottleneck.

Thanks Christian for your reply.

How large are these indices?

  • Each index has about 80 million records, store.size ~200 GB, pri.store.size ~100 GB

What is the average shard size?

  • Average primary shard size would be 100/16 ~ 6.25 GB

Does each query target a single index? If not, how many indices does a query target on average?

  • I usually run my query on an alias which has 12 indices under it.
    I am not sure how Elasticsearch handles it internally. Can you guide me on how to find out how many indices a query targets on average?

This is quite small. I would recommend trying to set the number of primary shards to 6 to see how that affects latencies. If still acceptable you may try reducing it even further, maybe down to 3. The optimum shard size will depend on the data and type of queries you are running, so you will need to test.

This means all shards will be queried, which will fill up the queues faster. If possible I would recommend trying to minimize the indices targeted.
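
One way to check how many shards a given query actually fans out to is to look at the `_shards` section of the search response. The sketch below only parses an already-decoded response; `shard_fanout` is a hypothetical helper, and `sample_response` is a hand-written illustration shaped like a search response, not real output from this cluster.

```python
def shard_fanout(search_response: dict) -> dict:
    """Summarise the `_shards` section of a decoded search response."""
    shards = search_response.get("_shards", {})
    return {
        "total": shards.get("total", 0),
        "successful": shards.get("successful", 0),
        "failed": shards.get("failed", 0),
    }


# Illustrative response for a query against an alias covering
# 12 indices with 16 primary shards each (values are assumed).
sample_response = {
    "took": 12,
    "timed_out": False,
    "_shards": {"total": 192, "successful": 192, "skipped": 0, "failed": 0},
    "hits": {"total": 0, "hits": []},
}

fanout = shard_fanout(sample_response)
print("shards targeted by this query:", fanout["total"])
```

Comparing `total` for a query against the alias with the same query against a single index should show how much the fan-out shrinks when fewer indices are targeted.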

I would also recommend ensuring that you are not suffering from disk I/O bottlenecks as this often is the limiting factor in a cluster.
