Periodic CPU Spikes on all nodes

I’ve been getting periodic CPU spikes on my servers when there is no significant load. Restarting the servers doesn’t help. I have 4 nodes, with 3 set as master-capable and 1 as a data node.

When it spikes, all 4 servers run at 100% CPU.

I’ve collected node stats for these here: https://gist.github.com/smlbiobot/f1ddd5761d945168964c413071210ee6

Any help would be greatly appreciated!

I know we have some thread data in the gist. I've had a quick look, and I can see that "peak_count" is above the thread count. Could you share the output of:

_cluster/health
_cat/thread_pool?v
_nodes/hot_threads
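If it helps, the three requests above could be collected in one go with something like the sketch below. It is only a rough illustration: the base URL `http://localhost:9200`, the lack of authentication, and the helper names (`diagnostic_urls`, `fetch_text`, `collect_diagnostics`) are all assumptions, so adjust them for your cluster.

```python
import urllib.request

# Assumption: the cluster is reachable over plain HTTP without authentication.
BASE_URL = "http://localhost:9200"

# The three endpoints requested above.
DIAGNOSTIC_PATHS = [
    "_cluster/health",
    "_cat/thread_pool?v",
    "_nodes/hot_threads",
]


def diagnostic_urls(base_url=BASE_URL):
    """Build the full URL for each diagnostic endpoint."""
    base = base_url.rstrip("/")
    return [f"{base}/{path}" for path in DIAGNOSTIC_PATHS]


def fetch_text(url, timeout=10):
    """Fetch one endpoint and return the response body as text."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def collect_diagnostics(base_url=BASE_URL):
    """Return a dict mapping each diagnostic URL to its response body."""
    return {url: fetch_text(url) for url in diagnostic_urls(base_url)}
```

Calling `collect_diagnostics()` while a spike is in progress is the most useful, since `_nodes/hot_threads` only reflects what the nodes are doing at that moment.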

I will fetch those for you when I can, but I would like to add that I just saw this on the dashboard that I made for myself when a spike occurred:

[dashboard screenshot]

In the past I assumed that it had to do with an increased number of indexing / fetching operations, but it appears that it has to do with “refreshing”. Can you explain to me what that means?
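One rough way to check whether refresh activity really lines up with the spikes would be to take two node stats samples (for example from `_nodes/stats/indices/refresh`) and diff the refresh counters per node. The sketch below assumes the `indices.refresh.total` and `indices.refresh.total_time_in_millis` fields of the node stats response; the function name `refresh_delta` is made up for illustration.

```python
def refresh_delta(before, after):
    """Compare two parsed node stats responses.

    Returns, per node ID, how many refreshes ran and how many
    milliseconds were spent refreshing between the two samples.
    Nodes missing from the first sample are skipped.
    """
    deltas = {}
    for node_id, node in after["nodes"].items():
        if node_id not in before["nodes"]:
            continue
        new = node["indices"]["refresh"]
        old = before["nodes"][node_id]["indices"]["refresh"]
        deltas[node_id] = {
            "refreshes": new["total"] - old["total"],
            "refresh_ms": new["total_time_in_millis"] - old["total_time_in_millis"],
        }
    return deltas
```

If `refresh_ms` jumps between a sample taken before a spike and one taken during it, that would support the idea that refreshes are involved; if it stays flat, the cause is probably elsewhere.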

This is an issue because the cluster serves a live site, and when a spike occurs the entire website becomes unresponsive and then times out.
