100% CPU after upgrade (7.1.1 to 7.3)

icheishvili · August 6, 2019, 11:30pm

After upgrading to 7.3 over the weekend, I now have a node that constantly sits at full CPU utilization. _nodes/hot_threads is empty. The cluster has 25 indices, 250 total shards, and is made up of 3 machines, each with 2 cores and 8 GB of memory.

Replacing the high-CPU node with a new machine did not fix the situation; high CPU usage came back after rebalancing. Are there any known steps to fix this, or is this something new that was introduced in 7.3?

DavidTurner · August 7, 2019, 6:00am

This is surprising, particularly since hot threads is empty. Could you share the full output of the following? Please use something like https://gist.github.com since it will be quite large.

GET _nodes/hot_threads?threads=99999&ignore_idle_threads=false
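For anyone who would rather script the request above than paste it into a console, here is a minimal Python sketch. It assumes the node is reachable over plain HTTP at a base URL such as http://localhost:9200 with no authentication; that address and the function names are illustrative, not something from this thread.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def hot_threads_url(base_url, threads=99999, ignore_idle_threads=False):
    """Build the hot threads URL with the two parameters requested above."""
    query = urlencode({
        "threads": threads,
        # Send the boolean as a lowercase "true"/"false" string.
        "ignore_idle_threads": str(ignore_idle_threads).lower(),
    })
    return f"{base_url.rstrip('/')}/_nodes/hot_threads?{query}"


def save_hot_threads(base_url, path):
    """Fetch the plain-text hot threads report and write it to a file.

    Assumes an unsecured cluster; add credentials/TLS handling if yours needs it.
    """
    with urlopen(hot_threads_url(base_url), timeout=60) as resp:
        text = resp.read().decode("utf-8")
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(text)
    return len(text)
```

Calling save_hot_threads("http://localhost:9200", "hot_threads.txt") would then leave a file that can be uploaded as a gist.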

Another possibility is that it's busy doing GC, which won't show up in the hot threads. Can you share the last thousand lines or so of the GC log too?
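As a rough complement to reading the GC log (not a replacement for it), the share of time each node spends in GC can be estimated from two samples of GET _nodes/stats/jvm, which reports a cumulative collection_time_in_millis per collector. The sketch below assumes the two responses have already been parsed from JSON into dicts and that you know how many milliseconds passed between them; the function name is illustrative.

```python
def gc_time_fraction(before, after, interval_ms):
    """Estimate the fraction of wall-clock time spent in GC per node.

    `before` and `after` are parsed responses of GET _nodes/stats/jvm taken
    `interval_ms` milliseconds apart. Nodes missing from `before` are skipped.
    """
    def total_gc_ms(node):
        collectors = node["jvm"]["gc"]["collectors"]
        return sum(c["collection_time_in_millis"] for c in collectors.values())

    fractions = {}
    for node_id, node_after in after["nodes"].items():
        node_before = before["nodes"].get(node_id)
        if node_before is None:
            continue
        delta = total_gc_ms(node_after) - total_gc_ms(node_before)
        fractions[node_after.get("name", node_id)] = delta / interval_ms
    return fractions
```

A node whose value is far above its peers would be a hint that GC accounts for CPU time that hot threads cannot show.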

icheishvili · August 7, 2019, 11:51am

Here is the hot threads output you asked for: https://gist.github.com/icheishvili/3e7cd9382ae34c616df9e601f4771751

And here is the last 1000 lines of gc.log: https://gist.github.com/icheishvili/a8075376002ced072bfec2e8e3febebe

From what I can tell, GC behavior on all 3 nodes is quite similar. I noticed Young Allocation Failures while reading the log, so I compared the nodes to confirm. I'm happy to post more GC logs to show this.
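One simple way to make that node-to-node comparison repeatable is to count the log lines that mention "Allocation Failure" in each node's gc.log. The sketch below does only a substring match and assumes each log has been copied to a local path; the node names and paths passed in are hypothetical.

```python
def count_allocation_failures(lines):
    """Count lines that mention an allocation-failure-triggered collection."""
    return sum(1 for line in lines if "Allocation Failure" in line)


def compare_nodes(log_paths):
    """Map each node name to its allocation failure count.

    `log_paths` is a dict of node name -> path to that node's gc.log copy.
    """
    counts = {}
    for node, path in log_paths.items():
        with open(path, encoding="utf-8", errors="replace") as fh:
            counts[node] = count_allocation_failures(fh)
    return counts
```

Counts taken over log windows of similar length are comparable across nodes; a raw count says nothing about how long each pause lasted.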

icheishvili · August 8, 2019, 3:09pm

The misbehaving node has gotten worse and worse (up to a load average of 20), and this has made our entire deployment unstable, so we are being forced to revert to 7.1.1. I would advise anyone reading this to carefully test 7.3.0 in their environment/traffic pattern or avoid it entirely.

system · September 5, 2019, 3:09pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
