Elasticsearch heap issues

mailanu · September 19, 2016, 3:31am

Hi,

We have a 15-node ES cluster with 10 data + 2 client + 3 dedicated master nodes.
Data nodes are allocated 20GB RAM and 10GB heap.
This cluster captures logs flowing through Logstash.

The setup creates 10 shards per index. We create indices once per hour, grouped by log category. We end up with roughly 1,000 indices, which translates to 10,000 shards. We close indices older than 7 days and delete indices older than 10 days.

With the current setup, after a few days we see heap utilization reach 90% on all data nodes. At that point the nodes do little besides GC, which breaks the entire cluster. This usually forces us to restart our data nodes.

What are the possible reasons for our heap usage going so high? Which metrics should we monitor, and what could we do to avoid this scenario?

Any pointers are greatly appreciated.
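
On the question of which metrics to watch: per-node heap usage is reported by the node stats API (`GET /_nodes/stats/jvm`) as `jvm.mem.heap_used_percent`. Below is a minimal sketch that flags nodes above a threshold. The base URL and the 75% threshold are assumptions for illustration, not values from this thread, and the function names are hypothetical.

```python
import json
from urllib.request import urlopen


def fetch_jvm_stats(base_url="http://localhost:9200"):
    """Fetch per-node JVM stats. base_url is an assumed address."""
    with urlopen(base_url + "/_nodes/stats/jvm") as resp:
        return json.load(resp)


def nodes_over_heap_threshold(stats, threshold_percent=75):
    """Return (node name, heap_used_percent) pairs at or above the threshold.

    The default threshold of 75 is an assumption; pick one that suits your cluster.
    """
    hot = []
    for node in stats.get("nodes", {}).values():
        used = node["jvm"]["mem"]["heap_used_percent"]
        if used >= threshold_percent:
            hot.append((node.get("name", "unknown"), used))
    # Highest heap usage first
    return sorted(hot, key=lambda pair: pair[1], reverse=True)
```

Calling `nodes_over_heap_threshold(fetch_jvm_stats())` on a schedule and recording the result alongside the shard count per node should show whether heap growth tracks the number of open shards.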

Christian_Dahlqvist · September 19, 2016, 7:13am

Based on your description, it sounds like you are generating far too many shards. Having a very large number of small shards can waste a lot of resources, e.g. heap and file handles, which has been discussed numerous times here in the forums. A reasonably common shard size for logging use cases is 5-50GB, so if your shards are currently a lot smaller than that, I would recommend looking into reducing the number of shards per index, switching to daily indices, and/or merging indices.
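
To put rough numbers on that advice, here is a small sketch of the shard arithmetic. The 10 shards per index and the 10-day retention come from the original post; the figure of 4 log categories and the alternative of 2 shards per index are assumptions chosen only for illustration. Replicas are ignored, and closed indices are counted along with open ones.

```python
def shard_count(categories, indices_per_day, retention_days, shards_per_index):
    """Primary shards retained (open or closed) under a given indexing scheme."""
    return categories * indices_per_day * retention_days * shards_per_index


# Assumed: 4 log categories. From the post: 10-day retention, 10 shards per index.
hourly = shard_count(categories=4, indices_per_day=24, retention_days=10, shards_per_index=10)
daily = shard_count(categories=4, indices_per_day=1, retention_days=10, shards_per_index=10)
# Assumed: 2 shards per index as a hypothetical smaller setting.
daily_fewer = shard_count(categories=4, indices_per_day=1, retention_days=10, shards_per_index=2)

print(hourly, daily, daily_fewer)  # 9600 400 80
```

Under these assumptions, hourly indices land close to the roughly 10,000 shards reported, while daily indices alone cut the count by a factor of 24.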

mailanu · September 22, 2016, 8:20pm

Hi Christian,

Thanks for your response. We are going to try reducing the number of shards.

On the same point, currently when one of our data nodes (out of 10) goes down, around 8,000 shards go into the UNASSIGNED state. Under this condition, no data gets indexed and the indexing requests time out. It appears the cluster is busy reallocating shards rather than indexing new documents. Is there any configuration that is recommended to balance the cluster load between redistributing shards and indexing new documents?

Thanks,
Anusha.

Ravi_Shanker_Reddy · September 23, 2016, 4:07am

If you are bringing the node down forcefully, the "routing allocation" settings are available to you.
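
A minimal sketch of what that could look like, assuming the settings meant here are `cluster.routing.allocation.enable` (sent to `PUT /_cluster/settings`) and, as a related option, `index.unassigned.node_left.delayed_timeout` (sent to `PUT /_all/_settings`). The helper names are hypothetical, and the code only builds the request bodies; how you send them is up to you.

```python
import json


def allocation_enable_body(mode):
    """Body for PUT /_cluster/settings that sets cluster.routing.allocation.enable."""
    allowed = {"all", "primaries", "new_primaries", "none"}
    if mode not in allowed:
        raise ValueError("mode must be one of " + ", ".join(sorted(allowed)))
    return {"transient": {"cluster.routing.allocation.enable": mode}}


def delayed_allocation_body(timeout="5m"):
    """Body for PUT /_all/_settings that delays reallocation after a node leaves.

    The default of "5m" is an assumption, not a recommendation from this thread.
    """
    return {"settings": {"index.unassigned.node_left.delayed_timeout": timeout}}


# Print the body to send before a planned restart
print(json.dumps(allocation_enable_body("none")))
```

The usual pattern for a planned restart is to restrict allocation before stopping the node and set it back to `"all"` once the node has rejoined. Note that with `"none"` newly created indices will not get their shards assigned either, so with hourly index creation `"primaries"` may be the safer choice. Whether this resolves the indexing timeouts described above is not established in this thread.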
