High heap usage

Hi,

I'm new to the ELK stack and really need help! I have a cluster of 6 nodes (3 master, 1 client and 2 data nodes). The heap configuration for each node type is as follows:
master: heap 2g (half of physical RAM as recommended)
client: heap 8g
data: heap 4g

The number of documents has been increasing recently, and heap usage on all nodes is increasing as well. One of the three master nodes failed over when its heap reached 100%. I have a few questions I'd like to clear up:

  1. Do I need to restart the service on the failed master node? Another node has taken over the role, but what happens to the failed node? Can it recover automatically? I ask because this happens quite regularly, roughly every 2-3 days.
  2. How can I reduce heap usage? I disabled swapping and reindexed into fewer primary shards, but that did not help. I ingest about 90GB of data a day.

Thanks in advance

Which version of Elasticsearch are you using?

Hi,

I'm using ES version 6.5.1.

It's possible you are affected by https://github.com/elastic/elasticsearch/pull/36308 which was fixed in 6.5.3. Can you upgrade?
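
If it helps to check each node before and after upgrading, here is a minimal sketch in Python. The names `FIXED_IN`, `parse_version` and `has_fix` are made up for illustration and are not part of any Elasticsearch client. It only tells you whether a version is at or past 6.5.3, the release mentioned above; it does not know which older versions actually contain the bug.

```python
import re

# Release that contains the fix, per the pull request linked above.
FIXED_IN = (6, 5, 3)


def parse_version(number):
    """Turn a version string such as '6.5.1' into the tuple (6, 5, 1).

    Any suffix after the third numeric component is ignored.
    """
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", number)
    if match is None:
        raise ValueError(f"unrecognised version string: {number!r}")
    return tuple(int(part) for part in match.groups())


def has_fix(number):
    """Return True if the version is 6.5.3 or newer (hypothetical helper)."""
    return parse_version(number) >= FIXED_IN
```

The version string itself is reported in the `version.number` field of the response to `GET /` on a node, or in the `version` column of `_cat/nodes`.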

Thank you for the reply! I did read the post but don't fully understand it. Could you please explain? And yes, I can upgrade if needed!

Thanks again.

The details are not hugely important, but the key missing line in 6.5.1 is this one. As it says in the issue, this caused "effectively a terrible memory leak".

The version you're on is affected. Moreover, it can particularly affect the master node. I would encourage you to upgrade to the latest version and monitor the situation to see if the problem persists.
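
For the monitoring part, one possible approach is to poll the `_cat/nodes` API and flag nodes whose heap is running high. The following is a rough sketch, not a definitive tool: the function names, the `http://localhost:9200` address and the 85% threshold are all assumptions chosen for illustration, and it assumes the cluster is reachable without authentication.

```python
import json
import urllib.request


def nodes_over_threshold(rows, threshold=85):
    """Return (name, role, heap_percent) for nodes at or above the threshold.

    `rows` is the parsed JSON from _cat/nodes, where values arrive as strings.
    The default threshold of 85 is an arbitrary choice for this sketch.
    Results are sorted with the highest heap usage first.
    """
    flagged = []
    for row in rows:
        heap = int(row["heap.percent"])
        if heap >= threshold:
            flagged.append((row["name"], row["node.role"], heap))
    return sorted(flagged, key=lambda item: item[2], reverse=True)


def fetch_cat_nodes(base_url="http://localhost:9200"):
    """Fetch name, role and heap usage for every node as a list of dicts.

    The base URL is an assumption; point it at any node in your cluster.
    """
    url = base_url + "/_cat/nodes?h=name,node.role,heap.percent&format=json"
    with urllib.request.urlopen(url, timeout=10) as response:
        return json.load(response)
```

Calling `nodes_over_threshold(fetch_cat_nodes())` every few minutes, before and after the upgrade, should show whether heap on the master nodes still climbs steadily towards 100%.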

