One node has a very high load average. Why?

It is a data node in the cluster: 16 CPUs at 2.30 GHz, 48 GB memory, 8 GB swap. CPU use and memory use are both low, but top shows a very high 'load average' of around 40, and sy (system CPU time) is high too. Why?

When I restart this node, it is fine again by the second day: normal 'load average', normal sy, normal CPU use, normal memory. It makes me want to cry.

Also, sometimes the data node effectively dies: many processes are stuck in the uninterruptible sleep state, and I can't run any commands such as kill, pkill, or ps. A forced reboot is the only thing I can do.
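On Linux, the load average counts tasks in uninterruptible sleep (state D, usually waiting on I/O) as well as runnable ones, which is one way a load of 40 can coexist with low CPU use. Here is a minimal sketch for counting tasks per state by reading /proc, assuming a Linux host that is still responsive enough to run it. The function names are made up for illustration, and threads are counted individually because Elasticsearch runs many threads inside one process.

```python
import os

def parse_state(stat_line):
    """Return the one-letter task state from a /proc/<pid>/stat line."""
    # The command name sits in parentheses and may itself contain spaces
    # or ')', so split after the LAST closing parenthesis.
    after_comm = stat_line.rsplit(")", 1)[1]
    return after_comm.split()[0]

def count_task_states(proc_root="/proc"):
    """Count threads per state by scanning <proc_root>/<pid>/task/<tid>/stat."""
    counts = {}
    for pid in os.listdir(proc_root):
        if not pid.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        task_dir = os.path.join(proc_root, pid, "task")
        try:
            tids = os.listdir(task_dir)
        except OSError:
            continue  # process exited while we were scanning
        for tid in tids:
            try:
                with open(os.path.join(task_dir, tid, "stat")) as f:
                    state = parse_state(f.read())
            except (OSError, IndexError):
                continue  # thread exited or the line was unreadable
            counts[state] = counts.get(state, 0) + 1
    return counts

def report():
    """Print the load average next to the runnable and D-state task counts."""
    with open("/proc/loadavg") as f:
        print("loadavg:", f.read().strip())
    counts = count_task_states()
    print("runnable (R):", counts.get("R", 0))
    print("uninterruptible sleep (D):", counts.get("D", 0))
```

Calling `report()` on the affected node while the load is high shows whether the D-state count is what drives the load average. If it is, the next thing to look at is what those tasks are waiting on, most likely storage.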

I have experienced a similar thing before on a node that had been running for a long time.
It seemed to be running with higher CPU for no reason.
Instead of rebooting the node, I simply restarted the Elasticsearch service.
Then the CPU utilization dropped to the level "I believe" to be normal.
I came away with 2 possibilities:

  1. Minor bug in ES. Restarting the service clears whatever state put the node in high CPU.
  2. During the short window when the service was restarting, some indices' primary shards got reassigned; therefore, the cluster became more balanced.

I'm leaning toward 1 because I subsequently restarted several more nodes that had been running for more than 100 days, and their CPU utilization dropped as well. I can't remember which version it was.

But I have not seen such behavior with version 7.2 yet.

Thanks, bro, but I can't change my version :sob:

You should always run Elasticsearch without swap.

What type of storage do you have? Local SSDs? What does iostat look like?
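If iostat isn't installed on the VM, a rough stand-in for its %iowait and %system figures can be computed from two samples of the aggregate `cpu` line in /proc/stat. This is only a sketch under that assumption, with illustrative helper names, and it does not replace the per-device numbers from iostat.

```python
import time

# Order of the first eight counters on the "cpu" line of /proc/stat.
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")

def parse_cpu_line(line):
    """Turn the aggregate 'cpu ...' line into a dict of jiffy counters."""
    parts = line.split()
    values = [int(v) for v in parts[1:1 + len(FIELDS)]]  # parts[0] is "cpu"
    return dict(zip(FIELDS, values))

def cpu_shares(before, after):
    """Percentage of elapsed CPU time per category between two samples."""
    delta = {k: after.get(k, 0) - before.get(k, 0) for k in FIELDS}
    total = sum(delta.values())
    if total <= 0:
        return {k: 0.0 for k in FIELDS}
    return {k: 100.0 * v / total for k, v in delta.items()}

def read_cpu_line(path="/proc/stat"):
    """Read and parse the first line of /proc/stat (Linux only)."""
    with open(path) as f:
        return parse_cpu_line(f.readline())

def sample(interval=5.0):
    """Take two samples `interval` seconds apart and return the shares."""
    before = read_cpu_line()
    time.sleep(interval)
    return cpu_shares(before, read_cpu_line())
```

A high `iowait` share from `sample()` would point at the storage, while a high `system` share with little `iowait` would point more toward kernel-side work such as swapping or memory reclaim.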

It is a customer-provided virtual machine, but I'm sure the storage is not local SSDs. I have already configured bootstrap.memory_lock: true. Could other programs reading from ES cause this effect?
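Setting bootstrap.memory_lock: true does not by itself prove the lock took effect (it can fail, for example, when the memlock limit for the Elasticsearch user is too low). The nodes info API reports the result per node as `process.mlockall`. Here is a minimal sketch that queries it, assuming the cluster answers on `http://localhost:9200` without authentication; that URL and the helper names are placeholders.

```python
import json
from urllib.request import urlopen

def nodes_without_mlockall(nodes_response):
    """Return the IDs of nodes whose process.mlockall is not true."""
    return sorted(
        node_id
        for node_id, info in nodes_response.get("nodes", {}).items()
        if not info.get("process", {}).get("mlockall", False)
    )

def fetch_mlockall(base_url="http://localhost:9200"):
    """Call GET _nodes?filter_path=**.mlockall and decode the JSON body."""
    # base_url is an assumption; adjust host, port, and auth for the cluster.
    with urlopen(base_url + "/_nodes?filter_path=**.mlockall") as resp:
        return json.load(resp)
```

Any node ID returned by `nodes_without_mlockall(fetch_mlockall())` is running without its memory locked, so the 8 GB of swap could still be used by that node.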

It wasn't that frequent for me back then. How long does it take before you start to see this behavior? For me it was around 100+ days.
If you truly believe you are experiencing the same issue I did, you could schedule a restart of the service, say every 6 months (if a manual restart is not possible).

I'll try it, thanks!

