Elastic Heap Size issue


(Varun Kapoor) #1

Hi,

I have Elasticsearch running on 2 nodes: node 1 has 32 cores and 64 GB RAM, node 2 has 16 cores and 32 GB RAM. The heap allocated to node 1 is 24 GB.
The heap allocated to node 2 is 15 GB.

There are no dedicated data or master nodes as of now; both nodes act as master and data nodes.

Current data size is about 500 GB (across both nodes).

Elasticsearch version: 2.3.0

Problem:

  1. Heap usage unexpectedly starts increasing on one node and never goes down until I restart that particular node.
  2. The increase in heap usage on that node also increases heap usage on the other node, but that goes back to normal when node 1 is restarted.

Actions done so far:

  1. Tried deleting data, reducing it by half, i.e. from 1 TB to 500 GB
  2. Cleared field cache
  3. Tried changing the parameters below:
    indices.cache.filter.size: 15%
    index.merge.scheduler.max_thread_count: 1
    index.translog.flush_threshold_size: 1gb
    index.refresh_interval: 30s
    indices.fielddata.cache.size: 20%
    indices.breaker.request.limit: 40%
    indices.breaker.total.limit: 70%
    action.auto_create_index: true
    indices.breaker.fielddata.limit: 45%

Below are the JVM options used when starting Elasticsearch on node 1:
-Xms24g -Xmx24g -Djava.awt.headless=true -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+HeapDumpOnOutOfMemoryError -XX:+DisableExplicitGC -Dfile.encoding=UTF-8 -Djna.nosys=true

Below is the output of /_cluster/stats?pretty=1&clear=true&indices=true when heap was almost full on node 1 (64 GB RAM)
{
  "timestamp" : 1527512323521,
  "cluster_name" : "temp1",
  "status" : "green",
  "indices" : {
    "count" : 34,
    "shards" : {
      "total" : 148,
      "primaries" : 75,
      "replication" : 0.9733333333333334,
      "index" : {
        "shards" : {
          "min" : 2,
          "max" : 20,
          "avg" : 4.352941176470588
        },
        "primaries" : {
          "min" : 1,
          "max" : 10,
          "avg" : 2.2058823529411766
        },
        "replication" : {
          "min" : 0.0,
          "max" : 1.0,
          "avg" : 0.9705882352941176
        }
      }
    },
    "docs" : {
      "count" : 1214534951,
      "deleted" : 6776708
    },
    "store" : {
      "size_in_bytes" : 564568010389,
      "throttle_time_in_millis" : 0
    },
    "fielddata" : {
      "memory_size_in_bytes" : 0,
      "evictions" : 0
    },
    "query_cache" : {
      "memory_size_in_bytes" : 0,
      "total_count" : 5958392,
      "hit_count" : 132952,
      "miss_count" : 5825440,
      "cache_size" : 0,
      "cache_count" : 14379,
      "evictions" : 14379
    },
    "completion" : {
      "size_in_bytes" : 0
    },
    "segments" : {
      "count" : 3840,
      "memory_in_bytes" : 2497665031,
      "terms_memory_in_bytes" : 2233316015,
      "stored_fields_memory_in_bytes" : 222013960,
      "term_vectors_memory_in_bytes" : 0,
      "norms_memory_in_bytes" : 696000,
      "doc_values_memory_in_bytes" : 41639056,
      "index_writer_memory_in_bytes" : 23480688,
      "index_writer_max_memory_in_bytes" : 4599208143,
      "version_map_memory_in_bytes" : 3179664,
      "fixed_bit_set_memory_in_bytes" : 0
    },
    "percolate" : {
      "total" : 0,
      "time_in_millis" : 0,
      "current" : 0,
      "memory_size_in_bytes" : -1,
      "memory_size" : "-1b",
      "queries" : 0
    }
  },
  "nodes" : {
    "count" : {
      "total" : 2,
      "master_only" : 0,
      "data_only" : 0,
      "master_data" : 2,
      "client" : 0
    },
    "versions" : [ "2.3.0" ],
    "os" : {
      "available_processors" : 48,
      "allocated_processors" : 48,
      "mem" : {
        "total_in_bytes" : 0
      },
      "names" : [ {
        "name" : "Linux",
        "count" : 2
      } ]
    },
    "process" : {
      "cpu" : {
        "percent" : 5
      },
      "open_file_descriptors" : {
        "min" : 3141,
        "max" : 3434,
        "avg" : 3287
      }
    },
    "jvm" : {
      "max_uptime_in_millis" : 20946108,
      "versions" : [ {
        "version" : "1.8.0_91",
        "vm_name" : "Java HotSpot(TM) 64-Bit Server VM",
        "vm_version" : "25.91-b14",
        "vm_vendor" : "Oracle Corporation",
        "count" : 2
      } ],
      "mem" : {
        "heap_used_in_bytes" : 30288790016,
        "heap_max_in_bytes" : 41561948160
      },
      "threads" : 565
    },
    "fs" : {
      "total_in_bytes" : 2620321349632,
      "free_in_bytes" : 1478245064704,
      "available_in_bytes" : 1400792936448,
      "spins" : "true"
    },
    "plugins" : [ {
      "name" : "cloud-aws",
      "version" : "2.3.0",
      "description" : "The Amazon Web Service (AWS) Cloud plugin allows to use AWS API for the unicast discovery mechanism and add S3 repositories.",
      "jvm" : true,
      "classname" : "org.elasticsearch.plugin.cloud.aws.CloudAwsPlugin",
      "isolated" : true,
      "site" : false
    }, {
      "name" : "delete-by-query",
      "version" : "2.3.0",
      "description" : "The Delete By Query plugin allows to delete documents in Elasticsearch with a single query.",
      "jvm" : true,
      "classname" : "org.elasticsearch.plugin.deletebyquery.DeleteByQueryPlugin",
      "isolated" : true,
      "site" : false
    }, {
      "name" : "kopf",
      "version" : "2.0.1",
      "description" : "kopf - simple web administration tool for Elasticsearch",
      "url" : "/_plugin/kopf/",
      "jvm" : false,
      "site" : true
    } ]
  }
}
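
To put the heap figures in the output above in proportion, here is a rough Python sketch. The numbers are copied from the stats; note that they are cluster-wide totals for both nodes, not per-node values. Treating segments, index writer, version map, fielddata and query cache as the "accounted" heap consumers is an assumption made for illustration, and the helper names are hypothetical.

```python
# Values copied from the _cluster/stats output above (totals for both nodes).
stats = {
    "heap_used_in_bytes": 30288790016,
    "heap_max_in_bytes": 41561948160,
    "segments_memory_in_bytes": 2497665031,
    "index_writer_memory_in_bytes": 23480688,
    "version_map_memory_in_bytes": 3179664,
    "fielddata_memory_in_bytes": 0,
    "query_cache_memory_in_bytes": 0,
}

GIB = 1024 ** 3

# Assumption: these are the heap consumers the stats output reports explicitly.
ACCOUNTED_KEYS = (
    "segments_memory_in_bytes",
    "index_writer_memory_in_bytes",
    "version_map_memory_in_bytes",
    "fielddata_memory_in_bytes",
    "query_cache_memory_in_bytes",
)


def heap_used_percent(s):
    """Used heap as a percentage of the maximum heap."""
    return 100.0 * s["heap_used_in_bytes"] / s["heap_max_in_bytes"]


def accounted_bytes(s):
    """Sum of the heap consumers that the stats output reports explicitly."""
    return sum(s[key] for key in ACCOUNTED_KEYS)


used = stats["heap_used_in_bytes"]
accounted = accounted_bytes(stats)

print(f"heap used:        {used / GIB:.1f} GiB ({heap_used_percent(stats):.1f}% of max)")
print(f"accounted for:    {accounted / GIB:.1f} GiB")
print(f"not accounted:    {(used - accounted) / GIB:.1f} GiB")
```

Under these assumptions, the explicitly reported consumers cover only a small part of the used heap, so the rest is either uncollected garbage or something these counters do not show.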


(Mark Walkom) #2

Can you upgrade?


(Varun Kapoor) #3

@warkolm Upgrading is an option, but will it solve this issue?
Is there no other resolution for this?


(Christian Dahlqvist) #4

Elasticsearch by default assumes all data nodes are equal so the smaller node is likely to be under more pressure.
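
To put rough numbers on that: as far as I know, the default allocator balances shard counts across data nodes and does not take heap size into account. A back-of-the-envelope sketch, assuming the 148 shards from the stats in post #1 are split evenly between the two nodes and that shards are of similar size (neither has been confirmed in this thread):

```python
# Heap sizes from post #1; shard total from the _cluster/stats output.
heap_gb = {"node1": 24, "node2": 15}
total_shards = 148

# Assumption: shards are spread evenly by count across the two nodes.
shards_per_node = total_shards // len(heap_gb)


def heap_per_shard_mb(heap_gb_value, shard_count):
    """Heap available per shard in MB, if heap were shared equally."""
    return heap_gb_value * 1024 / shard_count


for node, heap in heap_gb.items():
    per_shard = heap_per_shard_mb(heap, shards_per_node)
    print(f"{node}: {shards_per_node} shards, ~{per_shard:.0f} MB heap per shard")
```

With an even split, the node with the 15 GB heap has noticeably less heap per shard than the node with the 24 GB heap.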


(Varun Kapoor) #5

@Christian_Dahlqvist Thanks for replying. If that is the case, why does the bigger node's heap fill up from time to time while the smaller node's heap stays normal?

It's not always the same node that goes down; it's random.


(Christian Dahlqvist) #6

Do you have monitoring installed so you can show how heap usage varies over time?

How full does it get (it is expected to get to about 75% full before GC kicks in)? Does it crash the node?
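
If no monitoring tool is available, a small script can give a rough picture of heap usage over time. The following is a minimal sketch that polls the nodes stats API (`/_nodes/stats/jvm`) and prints `heap_used_percent` per node. The address `http://localhost:9200`, the function names and the polling interval are assumptions for illustration only.

```python
import json
import time
import urllib.request


def heap_percent_by_node(nodes_stats):
    """Map node name -> heap_used_percent from a _nodes/stats/jvm response body."""
    return {
        node["name"]: node["jvm"]["mem"]["heap_used_percent"]
        for node in nodes_stats["nodes"].values()
    }


def fetch_nodes_stats(base_url):
    """Fetch and decode the JVM section of the nodes stats API."""
    with urllib.request.urlopen(base_url + "/_nodes/stats/jvm", timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))


def poll(base_url, interval_seconds=60, samples=5):
    """Print one line per node per sample, with a timestamp."""
    for _ in range(samples):
        stats = fetch_nodes_stats(base_url)
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        for name, pct in sorted(heap_percent_by_node(stats).items()):
            print(f"{stamp} {name} heap_used_percent={pct}")
        time.sleep(interval_seconds)


# Example call (hypothetical address; adjust to your cluster):
# poll("http://localhost:9200", interval_seconds=60, samples=1440)
```

Redirecting the output to a file for a day or so should show whether heap usage follows the usual sawtooth pattern or climbs steadily without dropping after GC.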


(Varun Kapoor) #7

It goes above 75% and never comes back down until I restart Elasticsearch on that node. And yes, if I wait long enough to see whether it recovers on its own, the heap eventually fills up completely and the node crashes. Once heap usage reaches 75%, I can see GC running continuously in the logs, which leaves the node in a stalled state.

No monitoring installed as of now.


(Christian Dahlqvist) #8

I noticed that you have been overriding some of the default values. How did you arrive at these values? What happens if you stick with the defaults?


(Mark Walkom) #9

It'll give you a better idea of what's happening, as you can leverage the Monitoring functionality in X-Pack.


(Varun Kapoor) #10

I changed those values while trying to debug this issue myself. The issue was already occurring with the default values.


(Varun Kapoor) #11

But X-Pack is paid.


(Mark Walkom) #12

Parts of it are, but Monitoring is totally free - https://www.elastic.co/subscriptions


(Varun Kapoor) #13

Anyone?

