100% CPU system time used on HDD data node

wangxr1985 · March 17, 2021, 2:43pm

The ES version is 7.11.1.
We use 16C64G VMs (16 cores, 64 GB RAM), each with 4 physical HDDs in a striped LVM volume, as warm data nodes.

The issue is that the CPU system time of a random VM often suddenly rises to 100%, and the VM then hangs until it leaves the cluster.

I used top and pidstat to confirm that the process is Elasticsearch, and "perf top" shows output like this:

  71.51%  [kernel]                      [k] __pv_queued_spin_lock_slowpath
   1.75%  [kernel]                      [k] _raw_spin_lock_irqsave
   1.42%  [kernel]                      [k] compact_checklock_irqsave.isra.24

or like this:

   7.89%  [kernel]                      [k] isolate_freepages_block
   3.96%  [kernel]                      [k] __pv_queued_spin_lock_slowpath
   3.63%  [kernel]                      [k] copy_user_enhanced_fast_string
   1.75%  [kernel]                      [k] __list_del_entry

Is this a bug, or something else?
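
For comparing samples like the two above across nodes, a minimal Python sketch that parses "perf top" lines and totals the share per symbol might help. The names parse_perf_top, compaction_share, and COMPACTION_SYMBOLS are hypothetical. The set only groups the two symbols from the output above that live in the kernel's memory compaction code (mm/compaction.c); that grouping is an assumption about what is worth tracking, not a diagnosis.

```python
import re

# Hypothetical grouping: symbols from the samples above that belong to
# the kernel's memory compaction code (mm/compaction.c).
COMPACTION_SYMBOLS = {
    "compact_checklock_irqsave.isra.24",
    "isolate_freepages_block",
}

# Matches lines such as: "71.51%  [kernel]  [k] __pv_queued_spin_lock_slowpath"
LINE_RE = re.compile(r"^\s*(\d+(?:\.\d+)?)%\s+(\S+)\s+\[(.)\]\s+(\S+)\s*$")


def parse_perf_top(text):
    """Return a list of (percent, dso, symbol) tuples from perf top output."""
    rows = []
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            rows.append((float(match.group(1)), match.group(2), match.group(4)))
    return rows


def compaction_share(rows):
    """Sum the percentages of the symbols listed in COMPACTION_SYMBOLS."""
    return sum(pct for pct, _, symbol in rows if symbol in COMPACTION_SYMBOLS)


# First sample from the post above, pasted as text.
sample = """
71.51%  [kernel]  [k] __pv_queued_spin_lock_slowpath
 1.75%  [kernel]  [k] _raw_spin_lock_irqsave
 1.42%  [kernel]  [k] compact_checklock_irqsave.isra.24
"""

rows = parse_perf_top(sample)
for pct, dso, symbol in rows:
    print(f"{pct:6.2f}%  {dso}  {symbol}")
print("compaction-related share:", round(compaction_share(rows), 2))
```

Saving the output of each incident to a file and running it through the same function would make it easier to see whether the spin lock and compaction symbols show up together every time.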

warkolm · March 22, 2021, 1:54am

What do your hot threads or slow logs or Elasticsearch logs show at this time?
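
For reference, hot threads can be captured with the nodes hot threads API (GET /_nodes/hot_threads) while the node is still responding. Below is a minimal Python sketch, assuming an unsecured cluster reachable at http://localhost:9200; the helper names hot_threads_url and fetch_hot_threads and the node name warm-node-1 are hypothetical, and a secured cluster would also need credentials and TLS settings.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def hot_threads_url(base_url, node_id=None, threads=3, interval="500ms"):
    """Build the URL for the nodes hot threads API.

    node_id limits the request to one node; None asks every node.
    threads and interval are query parameters of the API.
    """
    if node_id is None:
        path = "/_nodes/hot_threads"
    else:
        path = f"/_nodes/{node_id}/hot_threads"
    query = urlencode({"threads": threads, "interval": interval})
    return f"{base_url.rstrip('/')}{path}?{query}"


def fetch_hot_threads(base_url, **kwargs):
    """Fetch the plain-text hot threads report (assumes no authentication)."""
    with urlopen(hot_threads_url(base_url, **kwargs), timeout=30) as resp:
        return resp.read().decode("utf-8")


# Only prints the URLs; call fetch_hot_threads(...) against a live cluster.
print(hot_threads_url("http://localhost:9200"))
print(hot_threads_url("http://localhost:9200", node_id="warm-node-1", threads=5))
```

Calling fetch_hot_threads("http://localhost:9200", node_id="warm-node-1") during an incident and saving the text next to the "perf top" output would give both views of the same moment.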

wangxr1985 · March 22, 2021, 2:44am

I had previously changed the hostname of each node, reinstalled ES (going from version 5.6.3 to version 7.11.1), and then added the nodes to another cluster.
After rebooting the system 2 days ago, everything is OK now. I forgot to capture the hot threads info.

system · April 19, 2021, 2:45am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
