High read I/O and Load Average after upgrading to Elasticsearch 5.3.2

Amitm · July 20, 2017, 4:40pm

Hello,
We upgraded our Elasticsearch cluster from 1.7.4 to 5.3.2 after a long migration process.
A few hours after the upgrade, we started to have performance issues in our app.
We saw that 1 or 2 nodes (out of 12 data nodes) suffer from high read I/O and a high load average (caused by waiting on disk reads). Restarting the node (we have replicas) temporarily solves the issue on that specific node, but the problem just moves to another node.
We investigated merges and checked whether a hot spot might be the cause (both came up online as possibly related to our issue), but we didn't find anything that sheds light on this.

We are running 12 c3.4xlarge data instances on AWS, using instance store for disk (~80 GB used out of 320 GB on each node), each with 30 GB RAM (15 GB for the Elasticsearch heap) and 16 cores.
In addition, we are running 3 m4.large master nodes.
We have 1 replica for each shard and 45 indices; the total size including replicas is 800 GB.

IO Top -

Load Average (12 hours ago) -

danielmitterdorfer · July 28, 2017, 8:20am

Hi,

I'd start exploring the nodes stats and look for significant differences in e.g. merge times on affected and unaffected nodes. You can also issue several calls to the nodes hot threads API to capture thread dumps. This should reveal what is actually going on on the affected nodes.
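
A rough sketch of that comparison in Python, using only the standard library. It assumes the cluster is reachable over plain HTTP without authentication; the base URL, the 2x-median threshold and the helper names are illustrative choices, not anything prescribed by Elasticsearch. Note that the merge counters are cumulative since node start, so a recently restarted node will look artificially quiet.

```python
import json
import statistics
import urllib.request


def fetch(base_url, path):
    """Return the response body of GET base_url + path as text."""
    with urllib.request.urlopen(base_url + path, timeout=30) as resp:
        return resp.read().decode("utf-8")


def merge_time_by_node(nodes_stats):
    """Map node name -> cumulative merge time (ms) from a nodes stats response."""
    return {
        node["name"]: node["indices"]["merges"]["total_time_in_millis"]
        for node in nodes_stats["nodes"].values()
    }


def outliers(per_node, factor=2.0):
    """Names of nodes whose value exceeds `factor` times the cluster median.

    The factor of 2 is an arbitrary starting point, not a recommended value.
    """
    median = statistics.median(per_node.values())
    return sorted(name for name, value in per_node.items() if value > factor * median)


def report(base_url):
    """Print merge-time outliers, then a hot threads dump for all nodes."""
    stats = json.loads(fetch(base_url, "/_nodes/stats/indices"))
    per_node = merge_time_by_node(stats)
    for name in outliers(per_node):
        print(name, per_node[name], "ms spent merging since node start")
    # The hot threads API returns plain text, not JSON.
    print(fetch(base_url, "/_nodes/hot_threads?threads=5"))
```

Calling `report("http://localhost:9200")` (hypothetical address) a few times while a node is under load, and comparing the hot threads output between an affected and an unaffected node, should show whether merging, searching or something else is driving the reads.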

Daniel

system · August 25, 2017, 8:22am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
