1 Node gets stuck with high load and 0% disk idle

sanktanglia · October 6, 2019, 4:47pm

I'm running a 3-node cluster of ES 6.5.4 in AWS on c5.xlarge nodes, each with a 700 GB drive and 3000 provisioned IOPS. Our cluster holds almost 1 billion documents and handles a lot of search and indexing traffic most of the time. Sometimes it gets into a very weird state, and I haven't been able to figure out what is going on. When this state occurs, we see a lot of search latency both in our application and in the ES cluster. Load is high on a single node, but there is no obvious sign of what that node is doing. Indexing and search rates aren't high on that node, and neither is CPU. I've looked through the tasks and hot threads output and nothing obvious stands out, though there are often more tasks on the bad node than on the others (I'm attaching the task list). Hot threads shows no high CPU. The most telling sign is in the AWS EC2 control panel, where the bad node is at 0% disk idle while the disks of the other 2 nodes are at 80-90% idle. https://pastebin.com/k6X00NQm https://imgur.com/a/xTBfKhE
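
For anyone comparing task lists across nodes the way this post describes, here is a minimal Python sketch that summarizes an already-parsed `GET _tasks` response per node: how many tasks each node has, which actions they are, and the longest running time. It assumes the default node-grouped response shape (`nodes` → node id → `tasks`, each task carrying `action` and `running_time_in_nanos`). The function name `summarize_tasks` and the sample node names are made up for illustration and are not taken from the attached dumps.

```python
from collections import Counter


def summarize_tasks(tasks_response):
    """Summarize a parsed `GET _tasks` response per node.

    Assumes the default node-grouped shape:
    {"nodes": {node_id: {"name": ..., "tasks": {task_id: {...}}}}}
    Returns {node_name: {"task_count", "actions", "longest_running_s"}}.
    """
    summary = {}
    for node_id, node in tasks_response.get("nodes", {}).items():
        tasks = node.get("tasks", {})
        # Count tasks per action name, e.g. "indices:data/read/search".
        actions = Counter(t.get("action", "unknown") for t in tasks.values())
        # Longest-running task on this node, in nanoseconds (0 if no tasks).
        longest_ns = max(
            (t.get("running_time_in_nanos", 0) for t in tasks.values()),
            default=0,
        )
        # Fall back to the node id when the node has no "name" field.
        summary[node.get("name", node_id)] = {
            "task_count": len(tasks),
            "actions": actions,
            "longest_running_s": longest_ns / 1e9,
        }
    return summary


if __name__ == "__main__":
    # Hypothetical sample data, not the real cluster output.
    sample = {
        "nodes": {
            "node-a-id": {
                "name": "node-a",
                "tasks": {
                    "node-a-id:1": {
                        "action": "indices:data/read/search",
                        "running_time_in_nanos": 2_500_000_000,
                    },
                    "node-a-id:2": {
                        "action": "indices:data/write/bulk",
                        "running_time_in_nanos": 500_000_000,
                    },
                },
            },
            "node-b-id": {"name": "node-b", "tasks": {}},
        }
    }
    result = summarize_tasks(sample)
    # Print the busiest nodes first.
    for name, info in sorted(result.items(), key=lambda kv: -kv[1]["task_count"]):
        print(name, info["task_count"], dict(info["actions"]), info["longest_running_s"])
```

Running this against snapshots taken during and outside an incident would show whether the extra tasks on the bad node are concentrated in one action type or are just long-running. It says nothing about the disk itself.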

sanktanglia · October 7, 2019, 4:34pm

Sigh, the issue is happening again, so I'm including dumps of the relevant output:
tasks: https://pastebin.com/4CbXw7Cp
hot threads: https://pastebin.com/ywdG7CwJ
_cluster/pending_tasks: none

sanktanglia · October 7, 2019, 5:35pm

Oh, and the bad node is h9zeBGO.

system · November 4, 2019, 5:35pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
