Version 6.8.0, one data node in the cluster: 16 CPUs @ 2.30GHz, 48G memory, 8G swap. CPU and memory usage are low, but when I run top I see a very high load average (around 40), and sy (system CPU time) is high too. Why?
When I restart the node the next day it's just fine: normal load average, normal sy, normal CPU and memory usage. It's driving me crazy.
Also, sometimes the data node hangs completely, with many processes stuck in uninterruptible sleep (D state), and I can't use commands like kill, pkill, or ps; a forced reboot is the only thing I can do.
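I read that the Linux load average counts processes in uninterruptible sleep (D state, usually stuck waiting on I/O) as well as runnable ones, which might explain a load of 40 while the CPUs are mostly idle. This is a quick sketch I use to count D-state processes while the box is still responsive (ps field layout may vary slightly between systems):

```shell
# List processes in uninterruptible sleep: the STAT column in ps begins
# with "D" for such processes. Print each one and a final count.
ps -eo pid,stat,comm | awk '$2 ~ /^D/ {print; n++} END {print n+0, "process(es) in D state"}'
```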
I have experienced a similar thing before with a node that had been running for a long time.
It seems to be running with higher cpu for no reason.
Instead of rebooting the node, I simply restart the Elasticsearch service.
Then the cpu utilization drops down to the level "I believe" to be normal.
I came away with 2 possibilities:
- Minor bug in ES. Restarting the service clears whatever state put the node in high CPU.
- During the short window while the service was restarting, some indices' primary shards got reassigned, so the cluster became more balanced.
I'm leaning toward the first because I subsequently restarted several more nodes that had been running for more than 100 days and their CPU utilization dropped as well. I can't remember which version it was.
But I have not seen such behavior with version 7.2 yet.
Thanks, but I can't change my version.
You should always run Elasticsearch without swap.
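To that point, the Elasticsearch docs recommend disabling swap entirely, or at least locking the heap in memory. A sketch of the usual steps (the fstab line and paths are illustrative, and these commands need root):

```shell
# Disable swap immediately:
sudo swapoff -a

# Make it permanent by commenting out the swap entry in /etc/fstab, e.g.:
#   # UUID=xxxx-xxxx  none  swap  sw  0  0

# If swap must stay, minimize the kernel's willingness to use it:
sudo sysctl -w vm.swappiness=1

# Alternatively, keep bootstrap.memory_lock: true in elasticsearch.yml and make
# sure the service may lock memory (e.g. LimitMEMLOCK=infinity under systemd).
```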
What type of storage do you have? Local SSDs? What does iostat look like?
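For reference, this is roughly what I'd run; the awk filter is my own addition and assumes sysstat's `-x` layout where `%util` is the last column, so adjust if your version differs:

```shell
# Extended device stats, 1-second interval, 3 samples; the first sample is the
# average since boot, so focus on the later ones. Print any device whose
# utilization exceeds 90%, skipping the "Device" header line.
iostat -x 1 3 | awk '$NF+0 > 90 && $1 != "Device" {print $1, $NF}'
```

High `await` and `%util` alongside a high load average but low CPU usage would point at slow storage rather than Elasticsearch itself.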
It's a customer-provided virtual machine, but I'm sure it's not local SSDs. I've already configured bootstrap.memory_lock: true. Could other programs reading from ES cause this effect?
It wasn't that frequent for me back then. How long does it take before you start to see this behavior? For me it was 100+ days.
What you could do, if you truly believe you are experiencing the same issue as I did, is to schedule a restart of the service, say every 6 months (if a manual restart is not possible).
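If you go that route, a cron entry is the simplest way to automate it. A sketch assuming a systemd service named elasticsearch; the schedule and service name are illustrative:

```shell
# /etc/cron.d/restart-elasticsearch
# At 03:00 on the 1st of January and July (roughly every 6 months),
# restart the Elasticsearch service as root.
0 3 1 1,7 * root systemctl restart elasticsearch
```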
I'll try it, thanks!