Number of concurrent search requests on ElasticSearch cluster

Mohit_Munjal · July 25, 2023, 5:43am

My objective is to calculate 2 things
Q1: how many search requests can a elasticsearch cluster can work on at once
Q2: how many search requests can a elasticsearch cluster hold in it's queue before starting rejecting it (producting 429 - Too many requests)

Say elasticsearch cluster(v6.8) has

8 data nodes of r5.xlarge instance i.e. 4 vCPU's.
3 master nodes of m5.xlarge.search i.e. 4 vCPU's
12 indices
each index has 16 primary and 16 replica shards

Please help with the approach on how one can go about answering the above questions given this cluster configuration? (Assume 1 search request hits all indices and shards)

My attempt at calculating it:

In the Elasticsearch documentation it is mentioned:

For count/search/suggest operations. Thread pool type is fixed with a size of int(( # of allocated processors * 3) / 2) + 1, and queue_size of 1000.

So, according to me each node has a

thread pool of 7 as number of processors is ((4*3)/2) + 1.
The queue size for the same is 1000.
This means number of threads available in the entire cluster = 8 (# data nodes) * 7 (# thread pool each node)= 56

Assuming 1 search request hits all indices in worst case

Each request will produce number of indices * number of shards number of requests in the search queue = 12 * 16 = 192

So number of requests that can be worked upon at once = 56/192 ? (which looks wrong)
Number of searches it can hold in queue before rejecting them =
(num nodes * queue size per node)/(num of sub requests each request produces) = (8 * 1000)/192 ~ 41 requests (which seems quite less).

Christian_Dahlqvist · July 25, 2023, 5:53am

This is very old and has been EOL for years. I would recommend upgrading to a newer version.

How large are these indices?

What is the average shard size?

Does each query target a single index? If not, how many indices does a query target on average?

Sounds about right. Each query will result in a lot of shards needing to be searched, which will fill up the queues. If you want to optimise for the number of parallel queries that can be handled, you generally want to query as few shards as possible and potentially increase the number of CPU cores to handle more work in parallel. Increasing CPU naturally assumes you have enough disk I/O performance for this not to be a bottleneck.

Mohit_Munjal · July 25, 2023, 6:10am

Thanks Christian for your reply.

How large are these indices?

Each index has about 80 million records, store.size ~200 gb, pri.store.size~100 gb

What is the average shard size?

Average Primary Shard Size would be 100/16 ~ 6.25 gb

Does each query target a single index? If not, how many indices does a query target on average?

I usually run my query on an alias which has 12 indexes under it.
I am not sure how Elasticsearch handles it internally. Can you guide me how I can find how many indices does a query target on average?

Christian_Dahlqvist · July 25, 2023, 6:22am

This is quite small. I would recommend trying setting the number of primary shards to 6 to see how that affects latencies. If still acceptable you may try reducing it even futher, maybe down to 3. The optimum shard size will depend on the data and type of queries you are running, so you will need to test.

This means all shards will be queried, which will fill up the queues faster. If possible I would recommend trying to minimize the indices targeted.

I would also recommend ensuring that you are not suffering from disk I/O bottlenecks as this often is the limiting factor in a cluster.

system · August 22, 2023, 6:22am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.

Topic		Replies	Views
Index Thread Pools Elasticsearch	2	377	June 9, 2023
Concurrent search request to elasticsearch Elasticsearch	7	23026	July 6, 2017
Search Requests - How threads work Elasticsearch	2	1382	July 6, 2017
Parallelism Per Request Elasticsearch	2	676	August 30, 2018
How to increase the thread pool size Elasticsearch	7	58	August 23, 2024

Number of concurrent search requests on ElasticSearch cluster

Related topics