Heap size in Stack Monitoring does not match jvm.options setting

Hello,

We have three identical clusters for Prod/QA/Test, each with 3 nodes (CentOS / 8 cores / 64GB RAM / 2TB SSD per node).

The jvm.options heap size (-Xmx) is not set in Prod and QA; it is set to -Xmx32g in the Test cluster. However, Kibana shows different values on all three clusters: Stack Monitoring shows 38.8gb in Prod and 95.8gb in QA, and the Test cluster shows 26gb even though we explicitly set it to 32gb.

How can I find the accurate value? Nodes in Prod have been recycling, and we also see frequent "Data Too Large" exceptions. The Cluster Overview shows several index recoveries, and I suspect this has to do with a low JVM heap size. I appreciate your help and guidance.

Thanks & Regards,
Raj

Hi @ratnakarn, welcome to the community.

I usually don't reply with a "read the docs" response, but in this case I think it is really important, mainly because wrong or poor heap settings can dramatically degrade the performance and stability of Elasticsearch. I suggest you read this page.

Then come back and let's talk.

Hello Stephen,

Thanks for the quick response. I have read that page several times, and as it states, I set the heap size to 32gb (the 50%-of-RAM recommendation) in the Test cluster. But Kibana shows only 26gb.

In contrast, no explicit value was specified in the Prod and QA clusters' jvm.options. Based on what I read in the link, I expected the heap size to cap at 3gb (1gb x 3 nodes). However, Kibana shows 38.8gb in Prod but 95.8gb in QA.
How does ES figure out what heap size to use? Why is there such a huge difference between Prod and QA? Why did the Prod heap size not increase to ~96gb (it gets significant traffic)? Since the displayed sizes don't match the settings in all three clusters, is there a different place we can look?

I suggested that the Ops team explicitly set the min/max heap sizes to 32gb, but they asked why Kibana displays 26gb for the Test cluster even with that setting.

I googled and searched this forum, but haven't found an equivalent question.

Thank you so much for your time.

Regards,
Raj

Hi @ratnakarn

We can only supply Best Practices and recommendations.

You should always set the min and max JVM heap size to the same value. Not setting the max size is not a supported configuration and is probably why you are seeing weird results in Kibana.

Yes, the defaults are:

-Xms1g
-Xmx1g

From above:

 In contrast, no explicit value was specified in the Prod and QA clusters' jvm.options. 

I am a little confused: are min and max explicitly set or not? Either way, they need to be explicitly set.

If there were no changes to jvm.options, the nodes would have a 1GB heap. Since you are seeing different values, most likely someone changed or removed the settings; that is the only way a node's heap size would not be 1GB.

You can run this to see the actual settings for each node:
GET /_nodes
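
If the full _nodes output is too verbose, you can narrow it down. This is just a sketch using the nodes info API's jvm metric together with the standard filter_path response-filtering and human parameters; adjust field names to your version:

GET /_nodes/jvm?human&filter_path=nodes.*.name,nodes.*.jvm.mem.heap_max,nodes.*.jvm.input_arguments

The input_arguments list shows the actual -Xms/-Xmx flags each node's JVM was started with, which is the quickest way to spot a node whose running settings don't match what you think is in jvm.options.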

Next topic: the 32GB heap setting.

  • Yes, in general, set Xmx and Xms to no more than 50% of your physical RAM, but that does not mean you should set 32GB.

BUT this is really important!

  • Set Xmx and Xms to no more than the threshold that the JVM uses for compressed object pointers (compressed oops); the exact threshold varies but is near 32GB. You can verify that you are under the threshold by looking for a line in the logs like the first example shown after this list.

  • Ideally, set Xmx and Xms to no more than the threshold for zero-based compressed oops; the exact threshold varies, but 26GB is safe on most systems and can be as large as 30GB on some. You can verify that you are under this threshold by starting Elasticsearch with the JVM options -XX:+UnlockDiagnosticVMOptions -XX:+PrintCompressedOopsMode and looking for a line like the second example shown after this list.
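
For reference, the log lines those two checks produce look roughly like the following (the sizes and address are illustrative values from the docs, not output from your clusters):

heap size [1.9gb], compressed ordinary object pointers [true]

heap address: 0x000000011be00000, size: 27648 MB, zero based Compressed Oops

If you instead see "Compressed Oops with base" (or compressed oops disabled entirely), the heap is above the respective threshold.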

So when you set your heap to 32GB, you go beyond zero-based compressed oops and actually make Elasticsearch less efficient. This is very important.

My suggestion, per the recommendation, is to set the heap to 26GB to be safe if you do not know exactly where zero-based compressed oops stop being supported on your systems:

-Xms26g
-Xmx26g
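
One practical note, assuming you are on Elasticsearch 7.7 or later: rather than editing jvm.options directly, you can put overrides like this in a file under config/jvm.options.d/ (the file name below is just an illustrative choice), which survives package upgrades more cleanly:

config/jvm.options.d/heap.options:
-Xms26g
-Xmx26g

On older versions, set both lines in jvm.options itself.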

Hi Stephen,

Thank you so much for clarifying. It's clear now. I will try the suggestion in the Test cluster to identify the zero-based compressed oops threshold, and we will explicitly set the min/max heap options.

Greatly appreciated.

Regards,
Raj
