Hello, can someone explain to me what the main causes of heap usage are? Is it the number of queries being run, the number of indexes/shards that are open, or is it data ingestion?
I ask because I have a 3-node cluster and it seems that no matter how much heap memory I give them, the nodes always creep north of 75% heap usage, which will obviously trigger garbage collection, and eventually a node drops out of the cluster.
I've observed heap problems at 75% caused by too much data: when we got close to 10 TB of data on the nodes, they became overloaded. I froze older indices to get down to about 9 TB of "thawed" data and it's much better. These were cold nodes, so no ingest and very little searching.
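Freezing, for reference, is done per index through the freeze API (available from 6.6 onward). Roughly like this, with a made-up index name:
POST /logstash-2019.01.01/_freeze
POST /logstash-2019.01.01/_unfreeze (to thaw it again)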
A happy node has a heap graph with big saw teeth. As heap is stressed, the teeth get smaller and closer together as GC becomes "unproductive", it doesn't find much garbage to toss out.
Just the existence of open (non-frozen, non-closed) indices on a node takes heap. How much data do you have on your nodes? How many indices and segments?
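The _cat APIs give a quick overview, something like the following (the column list is just one way to trim the output):
GET /_cat/allocation?v
GET /_cat/indices?v&h=index,pri,docs.count,store.size
GET /_cat/segments?v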
Hello, thanks for replying. I currently have 641 indexes; I don't think I could fit the segment stats for each index in this post.
However, here are some other interesting stats:
GET /_cat/allocation?v
shards disk.indices disk.used disk.avail disk.total disk.percent
1382 7.4tb 5.3tb 7.6tb 13tb 41
1382 7.8tb 5.5tb 7.4tb 13tb 43
1382 7.5tb 5.4tb 7.5tb 13tb 41
GET /_cat/nodes?v&h=name,heapPercent
name heapPercent
v-uhsm-elasticsearch-n1 92
v-uhsm-elasticsearch-n2 84
v-uhsm-elasticsearch-n3 92
In _nodes/stats there is a segments section for each node; what are the values for "count"? What is the heap size?
If the segment count is many times larger than the shard count, forcemerge may help.
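Something along these lines should pull just those numbers (filter_path is only used here to trim the response):
GET /_nodes/stats/indices/segments
GET /_nodes/stats/jvm?filter_path=nodes.*.jvm.mem.heap_used_percent,nodes.*.jvm.mem.heap_max_in_bytes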
Thanks again:
Node 1:
segments": {
"count": 21634,
"memory_in_bytes": 14150554620,
"terms_memory_in_bytes": 10023564419,
"stored_fields_memory_in_bytes": 2987702240,
"term_vectors_memory_in_bytes": 0,
"norms_memory_in_bytes": 52787456,
"points_memory_in_bytes": 1050099473,
"doc_values_memory_in_bytes": 36401032,
"index_writer_memory_in_bytes": 153624488,
"version_map_memory_in_bytes": 0,
"fixed_bit_set_memory_in_bytes": 0,
"max_unsafe_auto_id_timestamp": 1566864003211,
"file_sizes": {}
Node 2:
"segments": {
"count": 21641,
"memory_in_bytes": 13972633186,
"terms_memory_in_bytes": 9901402778,
"stored_fields_memory_in_bytes": 2936416784,
"term_vectors_memory_in_bytes": 0,
"norms_memory_in_bytes": 51125440,
"points_memory_in_bytes": 1047283156,
"doc_values_memory_in_bytes": 36405028,
"index_writer_memory_in_bytes": 147712124,
"version_map_memory_in_bytes": 0,
"fixed_bit_set_memory_in_bytes": 0,
"max_unsafe_auto_id_timestamp": 1566864003211,
"file_sizes": {}
},
Node 3:
"segments": {
"count": 21507,
"memory_in_bytes": 14546042160,
"terms_memory_in_bytes": 10303648462,
"stored_fields_memory_in_bytes": 3095808520,
"term_vectors_memory_in_bytes": 0,
"norms_memory_in_bytes": 53268096,
"points_memory_in_bytes": 1056658686,
"doc_values_memory_in_bytes": 36658396,
"index_writer_memory_in_bytes": 144191344,
"version_map_memory_in_bytes": 0,
"fixed_bit_set_memory_in_bytes": 0,
"max_unsafe_auto_id_timestamp": 1566864003211,
"file_sizes": {}
},
Have a look at this webinar, which discusses heap usage and how it relates to stored data volume.
In summary I would recommend the following:
- Optimize your mappings
- Forcemerge indices that are no longer written to down to a single segment (see the example after this list). This is very I/O intensive but can reduce heap usage substantially.
- Consider whether frozen indices could be used for your use case.
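As a rough example of the second point, for an index that is no longer being written to (the index name is a placeholder), a force merge down to one segment looks like:
POST /logstash-2019.07.01/_forcemerge?max_num_segments=1
You can compare GET /_cat/segments/logstash-2019.07.01?v before and after to see the effect.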
Thank you, I have watched the webinar, very informative.
Frozen indices - can't use these yet unfortunately, as we're not on 6.6.x.
Force merge - this looks promising. The documentation says it should only be done on read-only indexes, which is fine for us. What I'd like to do is automatically set indexes over a certain age as read-only and run a force merge on them - is there any way to do this? I believe Curator can do force merges, but can it set indexes as read-only?
You should not set the index to read-only before the forcemerge; just make sure it is not written to. Curator and ILM can help handle this.
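To sketch what that could look like with ILM once you are on a version that has it (the policy name and the 30-day threshold below are made up; Curator's forcemerge action with an age filter achieves much the same on older versions):
PUT _ilm/policy/merge-old-indices
{
  "policy": {
    "phases": {
      "warm": {
        "min_age": "30d",
        "actions": {
          "forcemerge": {
            "max_num_segments": 1
          }
        }
      }
    }
  }
}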
Thank you, guys.