Elastic Cluster Rejected Execution Exception

Hi,

I have the following cluster configuration:

Master: 3 nodes (4 CPU, 20 GB RAM)
Coordinating: 3 nodes (16 CPU, 98 GB RAM)
Data: 5 nodes (28 CPU, 48 GB RAM)

The thread pool configuration is as follows:

thread_pool:
  search:
    size: 64
    queue_size: 500
    min_queue_size: 10
    max_queue_size: 1000
    auto_queue_frame_size: 2000
    target_response_time: 1s
http.enabled: true

We have some complex queries and there are regular updates on around 25% of data on the cluster.

I have observed that once the load reaches approximately 19k requests per minute, the cluster starts rejecting requests. I expect the traffic to increase further.

Please tell me which configuration changes would help avoid rejected requests on this cluster.

Thanks.

Have you identified what is limiting performance? Is it disk I/O on the data nodes? Long or frequent GC on data or coordinating nodes? Is CPU saturated on some nodes? Is networking proving to be a bottleneck?
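One way to start narrowing this down is to see which nodes are actually rejecting searches. Below is a minimal sketch, assuming you have saved the text output of `GET _cat/thread_pool/search?v&h=node_name,name,active,queue,rejected`. The helper names are hypothetical, and the node names and numbers in the sample are made up for illustration, not taken from this cluster.

```python
def parse_cat_thread_pool(text):
    """Parse the whitespace-separated table printed by _cat/thread_pool with ?v."""
    lines = [ln for ln in text.strip().splitlines() if ln.strip()]
    header = lines[0].split()
    rows = []
    for ln in lines[1:]:
        row = dict(zip(header, ln.split()))
        # Convert the numeric columns so they can be compared.
        for key in ("active", "queue", "rejected"):
            row[key] = int(row[key])
        rows.append(row)
    return rows


def nodes_with_rejections(rows):
    """Return the names of nodes whose search pool has rejected at least one task."""
    return sorted(r["node_name"] for r in rows if r["rejected"] > 0)


# Made-up sample output (assumes node names contain no spaces).
SAMPLE = """
node_name name   active queue rejected
data-1    search     64   500     1200
data-2    search     12     0        0
coord-1   search      0     0        0
"""

rows = parse_cat_thread_pool(SAMPLE)
print(nodes_with_rejections(rows))
```

If the rejections are concentrated on a few data nodes, that points towards uneven shard placement or a hot node rather than a cluster-wide capacity limit.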

How much data do you have in the cluster? How many indices and shards? What type of complex queries are you running? How do you perform these regular updates?

Hi, I am attaching the disk I/O stats and the GC logs below:

[Screenshots: Coordinating node, Data node]

From what I found, networking is not a bottleneck.

I have around 120 GB of data in 6 indices. One of them is the main index, with around 110 GB of data, and it is queried frequently. All indices have 5 shards. Partial updates are done throughout the day on the largest index.
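As a rough sanity check on these numbers: a search against a 5-shard index fans out into one shard-level task per shard, so the request rate seen by clients is multiplied before it reaches the search thread pools on the data nodes. The sketch below is back-of-the-envelope arithmetic only. It assumes every request hits a single 5-shard index, that shard tasks spread evenly over the 5 data nodes, and it ignores replicas, caching, and update load. The function name is hypothetical.

```python
def shard_tasks_per_node_per_sec(requests_per_min, shards_per_index, data_nodes):
    """Estimate shard-level search tasks per second landing on each data node."""
    requests_per_sec = requests_per_min / 60.0
    shard_tasks_per_sec = requests_per_sec * shards_per_index
    return shard_tasks_per_sec / data_nodes


# 19k requests per minute, 5 shards per index, 5 data nodes (figures from this thread).
rate = shard_tasks_per_node_per_sec(19_000, 5, 5)
print(round(rate, 1))  # 316.7

# Little's law: with 64 search threads per node (the configured size),
# the average shard task must finish within roughly this many seconds to keep up.
max_avg_task_seconds = 64 / rate
print(round(max_avg_task_seconds, 2))  # 0.2
```

Under these assumptions, shard-level tasks that average much more than about 0.2 s would start filling the queue at this traffic level, which is consistent with rejections appearing around 19k requests per minute.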

CPU reaches up to 85% when load is high.

Sample complex query:
Sample Query