Elastic Heap Size issue

Hi,

I have Elasticsearch running on 2 nodes: node 1 has 32 cores and 64 GB RAM, node 2 has 16 cores and 32 GB RAM. The heap allocated to node 1 (the 64 GB RAM node) is 24 GB.
The heap allocated to node 2 (the 32 GB RAM node) is 15 GB.

There is no segregation between data nodes and master nodes as of now.

Current data size is 500 GB (across both nodes).

Elasticsearch version: 2.3.0

Problem:

  1. Heap memory unexpectedly starts increasing on one node and never goes down until I restart that particular node.
  2. The increase in heap on that node also results in an increase of heap on the other node, but that one goes back to normal when node 1 is restarted. (A per-node breakdown check is sketched below.)
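
To narrow down which consumer is holding the heap on the affected node, the per-node stats can be broken down with something like the following (a sketch only; localhost:9200 stands in for whichever node is queried, and exact field names should be checked against the 2.3 response):

    curl -s 'localhost:9200/_nodes/stats/jvm,indices,breaker?pretty'
    # jvm.mem.heap_used_percent -> overall heap pressure per node
    # indices.fielddata         -> fielddata held on the heap
    # indices.query_cache       -> query cache held on the heap
    # indices.segments          -> Lucene segment memory (terms, norms, doc values)
    # breakers                  -> how close each circuit breaker is to its tripping limit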

Actions done so far:

  1. Tried deleting data, reducing it to half, i.e. from 1 TB to 500 GB
  2. Cleared the field cache
  3. Tried changing the below parameters (see the sketch after this list)
    indices.cache.filter.size: 15%
    index.merge.scheduler.max_thread_count: 1
    index.translog.flush_threshold_size: 1gb
    index.refresh_interval: 30s
    indices.fielddata.cache.size: 20%
    indices.breaker.request.limit: 40%
    indices.breaker.total.limit: 70%
    action.auto_create_index: true
    indices.breaker.fielddata.limit: 45%
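
For reference, a sketch of how such overrides are usually split between node-level config and dynamic settings (localhost:9200 and the index name are placeholders; the exact scope of each setting should be double-checked against the 2.3 docs):

    # node-level setting: goes in elasticsearch.yml on each node and needs a restart
    indices.fielddata.cache.size: 20%

    # circuit-breaker limits are dynamic cluster settings, changeable without a restart
    curl -XPUT 'localhost:9200/_cluster/settings' -d '{
      "transient" : { "indices.breaker.fielddata.limit" : "45%" }
    }'

    # index.* settings are per index, and many of them are dynamic as well
    curl -XPUT 'localhost:9200/my_index/_settings' -d '{
      "index.refresh_interval" : "30s"
    }'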

Below are the runtime params used when starting Elasticsearch on node 1:
-Xms24g -Xmx24g -Djava.awt.headless=true -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+HeapDumpOnOutOfMemoryError -XX:+DisableExplicitGC -Dfile.encoding=UTF-8 -Djna.nosys=true
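
If it would help to see what the collector is doing while the heap climbs, GC logging and an explicit heap-dump path could be appended to those same params, for example (the log and dump paths here are placeholders):

    -Xloggc:/var/log/elasticsearch/gc.log -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintTenuringDistribution -XX:HeapDumpPath=/var/lib/elasticsearch/heapdump.hprof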

Below is the output of /_cluster/stats?pretty=1&clear=true&indices=true when the heap was almost full on node 1 (the 64 GB RAM node):
{
  "timestamp" : 1527512323521,
  "cluster_name" : "temp1",
  "status" : "green",
  "indices" : {
    "count" : 34,
    "shards" : {
      "total" : 148,
      "primaries" : 75,
      "replication" : 0.9733333333333334,
      "index" : {
        "shards" : {
          "min" : 2,
          "max" : 20,
          "avg" : 4.352941176470588
        },
        "primaries" : {
          "min" : 1,
          "max" : 10,
          "avg" : 2.2058823529411766
        },
        "replication" : {
          "min" : 0.0,
          "max" : 1.0,
          "avg" : 0.9705882352941176
        }
      }
    },
    "docs" : {
      "count" : 1214534951,
      "deleted" : 6776708
    },
    "store" : {
      "size_in_bytes" : 564568010389,
      "throttle_time_in_millis" : 0
    },
    "fielddata" : {
      "memory_size_in_bytes" : 0,
      "evictions" : 0
    },
    "query_cache" : {
      "memory_size_in_bytes" : 0,
      "total_count" : 5958392,
      "hit_count" : 132952,
      "miss_count" : 5825440,
      "cache_size" : 0,
      "cache_count" : 14379,
      "evictions" : 14379
    },
    "completion" : {
      "size_in_bytes" : 0
    },
    "segments" : {
      "count" : 3840,
      "memory_in_bytes" : 2497665031,
      "terms_memory_in_bytes" : 2233316015,
      "stored_fields_memory_in_bytes" : 222013960,
      "term_vectors_memory_in_bytes" : 0,
      "norms_memory_in_bytes" : 696000,
      "doc_values_memory_in_bytes" : 41639056,
      "index_writer_memory_in_bytes" : 23480688,
      "index_writer_max_memory_in_bytes" : 4599208143,
      "version_map_memory_in_bytes" : 3179664,
      "fixed_bit_set_memory_in_bytes" : 0
    },
    "percolate" : {
      "total" : 0,
      "time_in_millis" : 0,
      "current" : 0,
      "memory_size_in_bytes" : -1,
      "memory_size" : "-1b",
      "queries" : 0
    }
  },
  "nodes" : {
    "count" : {
      "total" : 2,
      "master_only" : 0,
      "data_only" : 0,
      "master_data" : 2,
      "client" : 0
    },
    "versions" : [ "2.3.0" ],
    "os" : {
      "available_processors" : 48,
      "allocated_processors" : 48,
      "mem" : {
        "total_in_bytes" : 0
      },
      "names" : [ {
        "name" : "Linux",
        "count" : 2
      } ]
    },
    "process" : {
      "cpu" : {
        "percent" : 5
      },
      "open_file_descriptors" : {
        "min" : 3141,
        "max" : 3434,
        "avg" : 3287
      }
    },
    "jvm" : {
      "max_uptime_in_millis" : 20946108,
      "versions" : [ {
        "version" : "1.8.0_91",
        "vm_name" : "Java HotSpot(TM) 64-Bit Server VM",
        "vm_version" : "25.91-b14",
        "vm_vendor" : "Oracle Corporation",
        "count" : 2
      } ],
      "mem" : {
        "heap_used_in_bytes" : 30288790016,
        "heap_max_in_bytes" : 41561948160
      },
      "threads" : 565
    },
    "fs" : {
      "total_in_bytes" : 2620321349632,
      "free_in_bytes" : 1478245064704,
      "available_in_bytes" : 1400792936448,
      "spins" : "true"
    },
    "plugins" : [ {
      "name" : "cloud-aws",
      "version" : "2.3.0",
      "description" : "The Amazon Web Service (AWS) Cloud plugin allows to use AWS API for the unicast discovery mechanism and add S3 repositories.",
      "jvm" : true,
      "classname" : "org.elasticsearch.plugin.cloud.aws.CloudAwsPlugin",
      "isolated" : true,
      "site" : false
    }, {
      "name" : "delete-by-query",
      "version" : "2.3.0",
      "description" : "The Delete By Query plugin allows to delete documents in Elasticsearch with a single query.",
      "jvm" : true,
      "classname" : "org.elasticsearch.plugin.deletebyquery.DeleteByQueryPlugin",
      "isolated" : true,
      "site" : false
    }, {
      "name" : "kopf",
      "version" : "2.0.1",
      "description" : "kopf - simple web administration tool for Elasticsearch",
      "url" : "/_plugin/kopf/",
      "jvm" : false,
      "site" : true
    } ]
  }
}

Can you upgrade?

@warkolm Upgrade is an option, but will it solve this issue?
Isn't there any other resolution for this?

Elasticsearch by default assumes all data nodes are equal so the smaller node is likely to be under more pressure.
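
A quick way to compare heap pressure on the two nodes is the cat nodes API, for example (assuming curl access to either node on localhost:9200):

    curl -s 'localhost:9200/_cat/nodes?v&h=name,heap.percent,heap.max,ram.percent'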

@Christian_Dahlqvist Thanks for replying. If that is the case, why does my bigger node's heap get full from time to time while the smaller node's heap stays normal?

It's not like it is always the same one of them going down; it's random.

Do you have monitoring installed so you can show how heap usage varies over time?

How full does it get (it is expected to get to about 75% full before GC kicks in)? Does it crash the node?
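
Without monitoring installed, heap usage over time can still be sampled with a simple loop, for example (localhost:9200, the interval and the output file are placeholders):

    # append a timestamped heap snapshot for every node once a minute
    while true; do
      date
      curl -s 'localhost:9200/_cat/nodes?v&h=name,heap.percent,heap.current,heap.max'
      sleep 60
    done >> heap_samples.log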

It reaches above 75% and never goes back to normal until I restart Elasticsearch on that node. And yes, if I wait long enough to see whether it goes back to normal on its own, after some time the heap gets completely full and the node crashes. I can see GC running continuously in the logs once heap usage reaches 75%, which keeps that node in a halted state.
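
One way to see what is actually holding the heap when it gets stuck like that is a class histogram from the JDK tools, for example (the PID is a placeholder for the Elasticsearch process id):

    # top heap consumers by class; 'jmap -histo:live' would force a full GC first,
    # so the plain form is gentler on a node that is already struggling
    jmap -histo <elasticsearch-pid> | head -n 40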

No monitoring installed as of now.

I noticed that you have been overriding some of the default values. How did you arrive at these values? What happens if you stick with the defaults?

Installing monitoring will give you a better idea of what's happening, as you can leverage the Monitoring functionality in X-Pack.

I changed those values while experimenting on my own to debug this issue. The issue was occurring with the default values as well.

But X-Pack is paid.

Parts of it are, but Monitoring is totally free - https://www.elastic.co/subscriptions

Anyone?

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.