Elasticsearch data node JVM running out of memory

Hello,

We have a cluster with 70 data nodes, 3 master nodes, and 2 query nodes. Each data node has a 30 GB JVM heap out of 60 GB total RAM and holds around 500 shards on average. We are seeing steady heap growth on the data nodes until they eventually become unresponsive. This has a cascading effect: heap on the master and query nodes also grows, and they become unresponsive too. Looking at the node stats, I do not see the field data or filter caches taking much space, but the old generation is filling up fast. Any pointers on how to identify what is causing the high heap usage, and any workarounds?

Elasticsearch version: 1.7
GC: CMS
JVM Args: /usr/bin/java -Xms30g -Xmx30g -Djava.awt.headless=true -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+HeapDumpOnOutOfMemoryError -XX:+DisableExplicitGC -Dfile.encoding=UTF-8 -Delasticsearch -Des.pidfile=/var/run/elasticsearch/ip-11-1-0-214_elasticsearch.pid -Des.path.home=/usr/share/elasticsearch -cp :/usr/share/elasticsearch/lib/elasticsearch-1.7.6.jar:/usr/share/elasticsearch/lib/*:/usr/share/elasticsearch/lib/sigar/* -Des.default.path.home=/usr/share/elasticsearch -Des.default.path.logs=/esfiles/log -Des.default.path.data=/esfiles/data -Des.default.path.conf=/etc/elasticsearch org.elasticsearch.bootstrap.Elasticsearch

Pastebin link for node stats: https://pastebin.com/PXZ8jPpV

That is a very old version of Elasticsearch. I would recommend you upgrade.

Those node stats look fine as far as I can see. Were these taken after a restart?

I have not used version 1.7 in years but believe the indices stats API was available back then and could provide useful insights. What is the full output of this API?
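If the full output is too large to read by eye, something like the rough sketch below could rank indices by the heap-related figures they report. It assumes the response has the `indices.<name>.total.<section>` layout with `segments.memory_in_bytes`, `fielddata.memory_size_in_bytes` and `filter_cache.memory_size_in_bytes`; I am going from memory of the 1.x format, so please check those key names against your actual output. The `localhost:9200` address is a placeholder.

```python
import json
from urllib.request import urlopen

# Memory-related sections of the indices stats response. These key names are
# assumptions based on the 1.x response layout; verify them against real output.
MEMORY_FIELDS = {
    "segments": "memory_in_bytes",
    "fielddata": "memory_size_in_bytes",
    "filter_cache": "memory_size_in_bytes",
}


def memory_by_index(stats):
    """Return (index, total_bytes, breakdown) tuples, largest consumer first."""
    rows = []
    for name, body in stats.get("indices", {}).items():
        total = body.get("total", {})
        # Missing sections or fields count as 0 rather than raising.
        breakdown = {
            section: total.get(section, {}).get(field, 0)
            for section, field in MEMORY_FIELDS.items()
        }
        rows.append((name, sum(breakdown.values()), breakdown))
    return sorted(rows, key=lambda row: row[1], reverse=True)


def fetch_json(url):
    """Fetch a URL and parse the body as JSON."""
    with urlopen(url) as response:
        return json.load(response)


if __name__ == "__main__":
    # Placeholder address; point this at any node in the cluster.
    stats = fetch_json("http://localhost:9200/_stats")
    for name, total, breakdown in memory_by_index(stats)[:20]:
        print(name, total, breakdown)
```

With roughly 500 shards per node, a long tail of small indices that each hold a little segment memory can add up, so the sum over all rows may be as informative as the top entries.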

What is the indexing and query load of this cluster? Are you monitoring heap usage so you can show a graph of how this changes over time?
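If you do not have heap monitoring in place yet, a minimal sketch like the one below could poll the node stats API and log heap and old gen usage per node, which you can then graph. It assumes each node entry exposes `jvm.mem.heap_used_percent` and `jvm.mem.pools.old.used_in_bytes` / `max_in_bytes`, which is how I remember the node stats being laid out; the address and the 60-second interval are placeholders.

```python
import json
import time
from urllib.request import urlopen


def heap_summary(stats):
    """Return (node_name, heap_used_percent, old_gen_percent) per node, sorted by name."""
    rows = []
    for node in stats.get("nodes", {}).values():
        mem = node.get("jvm", {}).get("mem", {})
        old = mem.get("pools", {}).get("old", {})
        old_max = old.get("max_in_bytes", 0)
        # Old gen fill level in percent; None when the pool size is not reported.
        old_pct = (
            round(100.0 * old.get("used_in_bytes", 0) / old_max, 1)
            if old_max
            else None
        )
        rows.append((node.get("name", "?"), mem.get("heap_used_percent"), old_pct))
    return sorted(rows, key=lambda row: row[0])


def fetch_json(url):
    """Fetch a URL and parse the body as JSON."""
    with urlopen(url) as response:
        return json.load(response)


if __name__ == "__main__":
    # Placeholder address and interval; adjust for your cluster.
    url = "http://localhost:9200/_nodes/stats/jvm"
    while True:
        now = time.strftime("%Y-%m-%dT%H:%M:%S")
        for name, heap_pct, old_pct in heap_summary(fetch_json(url)):
            # One CSV line per node per sample: timestamp,node,heap%,oldgen%
            print(f"{now},{name},{heap_pct},{old_pct}")
        time.sleep(60)
```

Redirecting the output to a file gives a simple time series per node. With `CMSInitiatingOccupancyFraction=75`, an old gen percentage that stays above 75 after collections would suggest the heap is holding live data rather than garbage.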
