JVM Memory Always at 50% and Above

Hi,

We have an ES setup with 3 master nodes and 6 data nodes. Each data node has 30 GB RAM and 8 cores, with 16 GB allocated to the heap and the rest left for the filesystem cache (i.e., Lucene index caching).

We use ES mostly as a time-series database. We are not using Druid because our use case involves updating the dimensional data, i.e., updating documents whenever the product data changes.

Our index is denormalized as per ES best practices: the same document contains both the product info and the sale (or other) facts. We create one index per month, each with 6 shards and 2 replicas. Each monthly index is about 30 GB in size and holds roughly 10 million documents, and we keep 3 years' worth of data this way. At the service layer we have optimizations such as: if someone queries 3 months' worth of data, the query we send to ES targets only those 3 monthly indices.
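For reference, each monthly index is created with that shard/replica layout, roughly like this (the index name below is just a placeholder, not our real naming scheme):

PUT /sales-2018-02
{
  "settings": {
    "number_of_shards": 6,
    "number_of_replicas": 2
  }
}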

Swap is turned off on all nodes, the open files limit is set to 131070 on all nodes, and we run a force merge on all newly created indices on a weekly basis to reduce the number of segments to 6.
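The weekly force merge is just the standard _forcemerge API with a segment cap, along these lines (index name is a placeholder):

POST /sales-2018-01/_forcemerge?max_num_segments=6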

Now, coming to the main problem: as the attached image shows, JVM memory usage on all ES nodes is always above 40%, and the screenshot was taken when there was no load and no queries running against ES. I'm not sure how to debug this, so any pointers on how I can drill down to the root cause would be appreciated.
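For anyone who wants to check without the monitoring UI, the same heap numbers can also be pulled directly from the _cat API, e.g.:

GET /_cat/nodes?v&h=name,heap.percent,heap.current,heap.max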

One other important point: we run a lot of scripted queries on ES, and this is what those scripted queries look like: https://www.dropbox.com/s/qlfem4qs04zchpg/scripted_queries_es.json?dl=0

I can't make these inline scripts parameter-driven because their definition isn't constant; it changes depending on the filters passed in from the front end by the user. Some requests may have 1 filter, some 2, and some "N" filters.
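The real queries are in the Dropbox file above; purely as a hypothetical illustration (made-up field names, 5.6+/6.x script syntax), an inline script filter mixed with regular filters looks roughly like this:

GET /sales-2018-01/_search
{
  "query": {
    "bool": {
      "filter": [
        { "term": { "product_category": "electronics" } },
        {
          "script": {
            "script": {
              "lang": "painless",
              "source": "doc['unit_price'].value * doc['quantity'].value > 1000"
            }
          }
        }
      ]
    }
  }
}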

Will these kinds of scripted queries have any negative impact on system performance?

Does this show any fields or parent/child relationships taking up lots of memory?

GET /_nodes/stats/indices/fielddata?level=indices&fields=*&human&pretty

Does this show a large percentage of the 16GB heap in the old gen?

GET /_nodes/stats/jvm

@loren For GET /_nodes/stats/indices/fielddata?level=indices&fields=*&human&pretty, fielddata is only about 132MB per node.

But GET /_nodes/stats/jvm does show that a large percentage of the heap is in the old gen. I have pasted the output from one node for reference:

"jvm": {
  "timestamp": 1519611476408,
  "uptime_in_millis": 13696410800,
  "mem": {
    "heap_used_in_bytes": 10296816952,
    "heap_used_percent": 60,
    "heap_committed_in_bytes": 17110138880,
    "heap_max_in_bytes": 17110138880,
    "non_heap_used_in_bytes": 176787312,
    "non_heap_committed_in_bytes": 194248704,
    "pools": {
      "young": {
        "used_in_bytes": 455208768,
        "max_in_bytes": 558432256,
        "peak_used_in_bytes": 558432256,
        "peak_max_in_bytes": 558432256
      },
      "survivor": {
        "used_in_bytes": 4998528,
        "max_in_bytes": 69730304,
        "peak_used_in_bytes": 69730304,
        "peak_max_in_bytes": 69730304
      },
      "old": {
        "used_in_bytes": 9836609656,
        "max_in_bytes": 16481976320,
        "peak_used_in_bytes": 14988439840,
        "peak_max_in_bytes": 16481976320
      }
    }
  }
}

Since 10G of your 16G heap is allocated to objects that are long lived (i.e. aren't going to get garbage collected), it makes sense that you don't see the JVM heap percentage drop below 60% even when the node is idle.

The next question is what is in the old gen. If you can't identify any obvious areas in your application (e.g., a plugin that creates an on-heap cache), then you could dump the JVM heap and inspect it manually (jmap/jconsole/jvisualvm) to get a better sense of the allocations.
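For example, a heap dump can be captured with jmap and then opened in jvisualvm (the placeholder is the Elasticsearch process ID):

jmap -dump:live,format=b,file=/tmp/es-heap.hprof <elasticsearch-pid>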

I assume you are using the default jvm.options file with the only change being ms/mx=16GB?
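For reference, assuming the stock file, that change would just be the two heap-size lines in jvm.options:

-Xms16g
-Xmx16g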

I totally missed that you have 155 shards per data node. That's roughly 1,000 shards across the cluster.

Reduce that by one or two orders of magnitude and I think your JVM heap usage will be more sane.
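A quick way to track the per-node shard count while you bring it down is the _cat allocation API, e.g.:

GET /_cat/allocation?v&h=node,shards,disk.indices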

Oh, I was planning to add more indices :slight_smile: By the way, is there any relationship between the number of nodes and the number of shards? I configured 6 shards and 2 replicas for each index because I have 6 data nodes, thinking that all machines would then participate in query execution.

Also, on heap size: I had gone through the blog post quoted below earlier, which says that for each GB of heap allocated we can have around 20 shards, i.e., with a 16GB heap we can have around 320 shards per node.

"TIP: The number of shards you can hold on a node will be proportional to the amount of heap you have available, but there is no fixed limit enforced by Elasticsearch. A good rule-of-thumb is to ensure you keep the number of shards per node below 20 to 25 per GB heap it has configured. A node with a 30GB heap should therefore have a maximum of 600-750 shards, but the further below this limit you can keep it the better. This will generally help the cluster stay in good health."

I have cut the number of shards by 50% by dropping all the unused indices, but the heap still sits at 40% even when nothing is running. Is that normal?

I think so. As that article says, "the further below this limit you can keep it the better". I would shoot for fewer shards per node and see what effect it has on your heap usage. Use 1 shard for each index unless the index won't fit on a single node.
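As a sketch, you could set that default via an index template so new monthly indices pick it up automatically (the template name and pattern are placeholders; on older 5.x versions the key is "template" instead of "index_patterns"):

PUT /_template/monthly-sales
{
  "index_patterns": ["sales-*"],
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 2
  }
}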
