ElasticSearch Read/Write performance issue on Peak Load

Context:

  1. We are running an ES cluster (ES version 7.9.1) with 3 master nodes and 8 data nodes (AWS r5.2xlarge)
  2. We are using the default ES thread pool configuration
  3. Our data is stored in 2 indices (used by the given microservice), each with 5 shards
  4. Other microservices use the same ES cluster (with their own, separate indices)

Issue:
At peak load, many read and write requests to ES time out (even though the client-side socket timeout is quite high, at 12 s)

Cause of issue: At peak load, many read/write requests get queued up waiting for a worker thread from the corresponding thread pool, hence the latency.
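
One way to confirm the queueing hypothesis is to watch the search and write thread pools for queued and rejected tasks while the peak load is on. A minimal sketch using the _cat/thread_pool API, assuming the cluster is reachable at a placeholder endpoint:

```python
import requests

ES_URL = "http://localhost:9200"  # placeholder, point this at your cluster

# _cat/thread_pool reports per-node active, queued and rejected task counts;
# a growing "queue" and non-zero "rejected" on search/write confirm saturation.
resp = requests.get(
    f"{ES_URL}/_cat/thread_pool/search,write",
    params={"h": "node_name,name,active,queue,rejected", "format": "json"},
)
resp.raise_for_status()

for row in resp.json():
    print(f"{row['node_name']:>20} {row['name']:<7} "
          f"active={row['active']} queue={row['queue']} rejected={row['rejected']}")
```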

Solutions:

  1. Increase the number of shards to at least the number of data nodes (see the split sketch after this list)
  2. Set up a separate cluster for this service so that resources are not shared with other services
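
On solution 1, note that the primary shard count of an existing index cannot be changed in place; it takes either a reindex into a new index created with more primaries, or the _split API, which requires the source index to be write-blocked first and a target shard count that is a multiple of the source's. A minimal sketch of the split route, with hypothetical index names:

```python
import requests

ES_URL = "http://localhost:9200"                  # placeholder
SOURCE, TARGET = "my-index", "my-index-10shards"  # hypothetical index names

# 1. Block writes on the source index (a precondition of the split API).
requests.put(f"{ES_URL}/{SOURCE}/_settings",
             json={"index.blocks.write": True}).raise_for_status()

# 2. Split 5 primaries into 10 (the target count must be a multiple of 5).
requests.post(f"{ES_URL}/{SOURCE}/_split/{TARGET}",
              json={"settings": {"index.number_of_shards": 10}}).raise_for_status()

# 3. Once the target index is green, lift the write block on it and switch
#    the application (or an alias) over to the new index.
requests.put(f"{ES_URL}/{TARGET}/_settings",
             json={"index.blocks.write": None}).raise_for_status()
```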

Questions:

  1. Can we increase the number of search and write threads by reducing the size of some other thread pools, e.g. 'sql-write' (which we don't use), and by reducing the core size of many dynamic pools that we rarely use to 0? (See the configuration sketch after this list.)

  2. Any other recommendations?
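
On question 1: in 7.x the search and write pools are fixed-size pools derived from the node's allocated processors, and their size/queue_size can only be overridden statically in elasticsearch.yml (node restart required). Shrinking pools you don't use frees threads, but not necessarily the CPU that search and write are actually contending for, so it is worth dumping the current per-node pool settings before changing anything. A minimal sketch, same placeholder endpoint:

```python
import requests

ES_URL = "http://localhost:9200"  # placeholder

# Nodes info with the thread_pool metric shows each pool's type, size and
# queue_size as currently configured on every node.
resp = requests.get(f"{ES_URL}/_nodes/thread_pool")
resp.raise_for_status()

for node_id, node in resp.json()["nodes"].items():
    pools = node["thread_pool"]
    print(node.get("name", node_id),
          "search:", pools.get("search"),
          "write:", pools.get("write"))

# Overrides would then go into elasticsearch.yml on each data node, e.g.
# thread_pool.write.queue_size / thread_pool.search.queue_size (static settings).
```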

First, try to identify what the bottleneck is. Indexing is quite I/O intensive, so I would first look at storage performance and iowait. What type of storage are you using, and what is the load on it?
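
For the storage side of this, the node stats API exposes per-node disk I/O counters (on Linux) that can be sampled during peak load, alongside host-level tools such as iostat -x for utilization and iowait. A minimal sketch, same placeholder endpoint:

```python
import requests

ES_URL = "http://localhost:9200"  # placeholder

# /_nodes/stats/fs returns filesystem stats per node, including an io_stats
# section (Linux only) with cumulative read/write counters for the data paths.
resp = requests.get(f"{ES_URL}/_nodes/stats/fs")
resp.raise_for_status()

for node_id, node in resp.json()["nodes"].items():
    io_totals = node.get("fs", {}).get("io_stats", {}).get("total", {})
    print(node.get("name", node_id), io_totals)
```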

How many indices and shards are you actively indexing into (in the cluster as a whole)?
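
A quick way to answer this is to list the shard layout of the whole cluster, e.g. how many shards exist in total, how many primaries each index has, and how they are spread over the data nodes. A minimal sketch with the _cat/shards API, same placeholder endpoint:

```python
from collections import Counter

import requests

ES_URL = "http://localhost:9200"  # placeholder

resp = requests.get(f"{ES_URL}/_cat/shards",
                    params={"h": "index,shard,prirep,node", "format": "json"})
resp.raise_for_status()
shards = resp.json()

print("total shards:", len(shards))
print("primaries per index:",
      Counter(s["index"] for s in shards if s["prirep"] == "p"))
print("shards per data node:", Counter(s["node"] for s in shards))
```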

What is the load (queries and writes per second) at which you start seeing this performance degradation?
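
The cluster-wide search and indexing rates can be estimated by sampling the cumulative counters in the index stats API twice and dividing the deltas by the interval. A minimal sketch, same placeholder endpoint:

```python
import time

import requests

ES_URL = "http://localhost:9200"  # placeholder
INTERVAL = 60                     # seconds between the two samples

def totals():
    # _all/total aggregates the cumulative query and index counters across
    # every index in the cluster.
    stats = requests.get(f"{ES_URL}/_stats/search,indexing").json()
    total = stats["_all"]["total"]
    return total["search"]["query_total"], total["indexing"]["index_total"]

q1, w1 = totals()
time.sleep(INTERVAL)
q2, w2 = totals()

print(f"search rate ~ {(q2 - q1) / INTERVAL:.1f} qps")
print(f"index rate  ~ {(w2 - w1) / INTERVAL:.1f} wps")
```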

The storage type is Elastic Block Storage (EBS), and the EBS volume type is General Purpose SSD.
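
One thing to keep in mind, assuming these are gp2 volumes (the post only says General Purpose SSD): gp2 baseline IOPS scale with volume size at 3 IOPS per GiB (minimum 100, maximum 16,000), with burst up to 3,000 IOPS for smaller volumes, so a sustained peak that exceeds the baseline will throttle once burst credits are exhausted. A small illustrative calculation with a hypothetical volume size:

```python
def gp2_baseline_iops(size_gib: int) -> int:
    # gp2 baseline: 3 IOPS per GiB, floored at 100 and capped at 16,000.
    return min(max(3 * size_gib, 100), 16_000)

# Hypothetical 500 GiB data volume: 1,500 IOPS baseline, burstable to ~3,000.
print(gp2_baseline_iops(500))  # -> 1500
```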

EBS metrics for the past 2 weeks (screenshots attached): read latency, write latency, read throughput, write throughput, disk queue depth, read IOPS, write IOPS.
