Troubles with JVM Heap

Hello there!

After considerable time spent optimizing and testing different configurations, I still have no idea why my ES nodes use so much JVM heap with relatively few shards in the cluster. I’m definitely missing something, but what?
Elastic Stack version is 6.7.2.
The cluster consists of 5 ES nodes (each running in its own Docker container); each node has 64 GB RAM and 32 CPU cores. There is clearly enough storage on the HDDs.
JDK version:

openjdk version "11.0.2" 2019-01-15
OpenJDK Runtime Environment 18.9 (build 11.0.2+9)
OpenJDK 64-Bit Server VM 18.9 (build 11.0.2+9, mixed mode)

Every ES node has this jvm.options:

-Xms32768m
-Xmx32768m
-XX:NewRatio=2
10-:-XX:-UseConcMarkSweepGC
10-:-XX:-UseCMSInitiatingOccupancyOnly
10-:-XX:+UseG1GC
10-:-XX:InitiatingHeapOccupancyPercent=75
-Des.networkaddress.cache.ttl=60
-Des.networkaddress.cache.negative.ttl=10
-XX:+AlwaysPreTouch
-Xss1m
-Djava.awt.headless=true
-Dfile.encoding=UTF-8
-Djna.nosys=true
-XX:-OmitStackTraceInFastThrow
-Dio.netty.noUnsafe=true
-Dio.netty.noKeySetOptimization=true
-Dio.netty.recycler.maxCapacityPerThread=0
-Dlog4j.shutdownHookEnabled=false
-Dlog4j2.disable.jmx=true
-Djava.io.tmpdir=${ES_TMPDIR}
-XX:+HeapDumpOnOutOfMemoryError
-XX:HeapDumpPath=data
-XX:ErrorFile=logs/hs_err_pid%p.log
8:-XX:+PrintGCDetails
8:-XX:+PrintGCDateStamps
8:-XX:+PrintTenuringDistribution
8:-XX:+PrintGCApplicationStoppedTime
8:-Xloggc:logs/gc.log
8:-XX:+UseGCLogFileRotation
8:-XX:NumberOfGCLogFiles=32
8:-XX:GCLogFileSize=64m
9-:-Xlog:gc*,gc+age=trace,safepoint:file=logs/gc.log:utctime,pid,tags:filecount=32,filesize=64m
# due to internationalization enhancements in JDK 9 Elasticsearch need to set the provider to COMPAT otherwise
# time/date parsing will break in an incompatible way for some date patterns and locales
9-:-Djava.locale.providers=COMPAT
10-:-XX:UseAVX=2
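
To rule out the container silently ignoring any of these options, the nodes info API can confirm what the JVM was actually started with (the filter_path parameter just trims the response):

GET _nodes/jvm?filter_path=nodes.*.jvm.input_arguments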

There are 315 time-rotated indices (±2, depending on Curator actions), where:

  • Daily indices “A” are usually 730 GB (530 GB on weekends) – 7 shards, 1 replica
  • Weekly indices “B” are usually 220 GB – 2 shards, 1 replica
  • Monthly indices “C” are usually 160 GB – 6 shards, 1 replica
  • Monthly indices “D” are usually 800 MB – 1 shard, 1 replica
  • Weekly indices “E” are usually 400 GB – 3 shards, 1 replica
  • Weekly indices “F” are usually 400 GB – 5 shards, 1 replica
  • Monthly indices “G” are usually 10 GB – 1 shard, 1 replica

There are only about 15 yearly indices (a few hundred MB, 1 shard, 1 replica), but considering the possible volatility of log volume in indices “D” and “G”, I can’t risk migrating them to yearly indices. Indices “C” were twice as big in the past, which is why I left them at 6 shards... I will definitely lower the shard count to 3 for future indices (and shrink the past ones); a sketch of the shrink calls is below.
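
For what it’s worth, here is roughly what I have in mind for the shrink step (6.x shrink API; the index name is just a placeholder, and shrink requires a write block plus a full copy of every shard on one node first):

# relocate a full copy of every shard to one node and block writes
PUT /index_c-2021.06/_settings
{
  "settings": {
    "index.routing.allocation.require._name": "elasticsearch-node0",
    "index.blocks.write": true
  }
}

# shrink 6 shards down to 3 (the target count must be a factor of the source count)
POST /index_c-2021.06/_shrink/index_c-2021.06-shrunk
{
  "settings": {
    "index.number_of_shards": 3,
    "index.number_of_replicas": 1,
    "index.routing.allocation.require._name": null
  }
}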

Yes, I know most of these indices could be tuned by ±1 shard based on the disk space they actually use. Still, correct me if I’m wrong: with a 31 GB JVM heap, a node should be able to handle up to roughly 600 shards, ideally sized 10 GB–50 GB each. Based on the index figures above and a 3-month retention period, I should have about 1600–1700 active shards in the cluster, which is about 320 shards per node.
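
Rough math behind that estimate (assuming ~90 days / ~13 weeks / 3 months of retention, counting primaries + replicas):

A (daily):    7 shards × 2 copies × ~90 days  ≈ 1260
B (weekly):   2 × 2 × ~13 weeks               ≈   52
C (monthly):  6 × 2 × 3 months                =   36
D (monthly):  1 × 2 × 3                       =    6
E (weekly):   3 × 2 × ~13                     ≈   78
F (weekly):   5 × 2 × ~13                     ≈  130
G (monthly):  1 × 2 × 3                       =    6
yearly:      ~15 × 2                          =   30
                                        total ≈ 1600, i.e. ~320 per node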

At 200 shards per node my cluster was “dying” - Kibana timeouts (the ES API in general), a very low rate of logs emitted by Logstash, etc. All of my ES nodes were also logging a lot of these messages (every few seconds):

[2021-07-09T03:52:45,630][WARN ][o.e.m.j.JvmGcMonitorService] [elasticsearch-node0] [gc][25848] overhead, spent [2s] collecting in the last [2.5s]

Then I closed about 20 indices and cluster performance got significantly better (no Kibana timeouts, a normal log rate from Logstash, etc.). I took a screenshot of my ES Nodes Monitoring section:

Also, on the images below, you can see the Overview and Advanced graphs for one of the nodes - the graphs look quite similar on all 5 nodes (notice the red arrows - that's the period when I closed the 20 indices):

"Striped" output of _node/stats API for one of the nodes:

{
  "timestamp" : 1625837474508,
  "name" : "elasticsearch-node0",
  "transport_address" : "10.0.0.195:9300",
  "host" : "10.0.0.195",
  "ip" : "10.0.0.195:9300",
  "roles" : [
    "master",
    "data",
    "ingest"
  ],
  "attributes" : {
    "ml.machine_memory" : "67151585280",
    "xpack.installed" : "true",
    "ml.max_open_jobs" : "20",
    "ml.enabled" : "true"
  },
  "indices" : {
    "docs" : {
      "count" : 2857527884,
      "deleted" : 6
    },
    "store" : {
      "size_in_bytes" : 5572660647652
    },
    "indexing" : {
      "index_total" : 63089414,
      "index_time_in_millis" : 58053777,
      "index_current" : 0,
      "index_failed" : 0,
      "delete_total" : 5314,
      "delete_time_in_millis" : 263,
      "delete_current" : 0,
      "noop_update_total" : 0,
      "is_throttled" : false,
      "throttle_time_in_millis" : 0
    },
    "search" : {
      "open_contexts" : 0,
      "query_total" : 99215,
      "query_time_in_millis" : 3852985,
      "query_current" : 0,
      "fetch_total" : 10869,
      "fetch_time_in_millis" : 24792,
      "fetch_current" : 0,
      "scroll_total" : 28636,
      "scroll_time_in_millis" : 1696355964,
      "scroll_current" : 4,
      "suggest_total" : 0,
      "suggest_time_in_millis" : 0,
      "suggest_current" : 0
    },
    "merges" : {
      "current" : 3,
      "current_docs" : 11051914,
      "current_size_in_bytes" : 11327555108,
      "total" : 15773,
      "total_time_in_millis" : 97119159,
      "total_docs" : 239746379,
      "total_size_in_bytes" : 395064523800,
      "total_stopped_time_in_millis" : 0,
      "total_throttled_time_in_millis" : 76073250,
      "total_auto_throttle_in_bytes" : 3850782460
    },
    "refresh" : {
      "total" : 66535,
      "total_time_in_millis" : 6307176,
      "listeners" : 0
    },
    "flush" : {
      "total" : 493,
      "periodic" : 244,
      "total_time_in_millis" : 530956
    },
    "warmer" : {
      "current" : 0,
      "total" : 59137,
      "total_time_in_millis" : 4953
    },
    "query_cache" : {
      "memory_size_in_bytes" : 144392136,
      "total_count" : 5969326,
      "hit_count" : 135527,
      "miss_count" : 5833799,
      "cache_size" : 1064,
      "cache_count" : 2622,
      "evictions" : 1558
    },
    "fielddata" : {
      "memory_size_in_bytes" : 2094144,
      "evictions" : 0
    },
    "completion" : {
      "size_in_bytes" : 0
    },
    "segments" : {
      "count" : 4389,
      "memory_in_bytes" : 8326247746,
      "terms_memory_in_bytes" : 6990788892,
      "stored_fields_memory_in_bytes" : 1114132112,
      "term_vectors_memory_in_bytes" : 0,
      "norms_memory_in_bytes" : 84480,
      "points_memory_in_bytes" : 186489802,
      "doc_values_memory_in_bytes" : 34752460,
      "index_writer_memory_in_bytes" : 326691289,
      "version_map_memory_in_bytes" : 1922,
      "fixed_bit_set_memory_in_bytes" : 7384,
      "max_unsafe_auto_id_timestamp" : 1625831845465,
      "file_sizes" : { }
    },
    "translog" : {
      "operations" : 3204038,
      "size_in_bytes" : 5353055060,
      "uncommitted_operations" : 1958679,
      "uncommitted_size_in_bytes" : 3268641912,
      "earliest_last_modified_age" : 0
    },
    "request_cache" : {
      "memory_size_in_bytes" : 582347,
      "evictions" : 0,
      "hit_count" : 406,
      "miss_count" : 423
    },
    "recovery" : {
      "current_as_source" : 0,
      "current_as_target" : 0,
      "throttle_time_in_millis" : 1894860
    }
  },
  "os" : {
    "timestamp" : 1625837477914,
    "cpu" : {
      "percent" : 12,
      "load_average" : {
        "1m" : 5.07,
        "5m" : 4.96,
        "15m" : 5.01
      }
    },
    "mem" : {
      "total_in_bytes" : 67151585280,
      "free_in_bytes" : 470204416,
      "used_in_bytes" : 66681380864,
      "free_percent" : 1,
      "used_percent" : 99
    },
    "swap" : {
      "total_in_bytes" : 0,
      "free_in_bytes" : 0,
      "used_in_bytes" : 0
    },
    "cgroup" : {
      "cpuacct" : {
        "control_group" : "/",
        "usage_nanos" : 197301219127459
      },
      "cpu" : {
        "control_group" : "/",
        "cfs_period_micros" : 100000,
        "cfs_quota_micros" : -1,
        "stat" : {
          "number_of_elapsed_periods" : 0,
          "number_of_times_throttled" : 0,
          "time_throttled_nanos" : 0
        }
      },
      "memory" : {
        "control_group" : "/",
        "limit_in_bytes" : "9223372036854771712",
        "usage_in_bytes" : "64025772032"
      }
    }
  },
  "process" : {
    "timestamp" : 1625837477914,
    "open_file_descriptors" : 1556,
    "max_file_descriptors" : 65536,
    "cpu" : {
      "percent" : 12,
      "total_in_millis" : 197299520
    },
    "mem" : {
      "total_virtual_in_bytes" : 5608245342208
    }
  },
  "jvm" : {
    "timestamp" : 1625837477915,
    "uptime_in_millis" : 61324914,
    "mem" : {
      "heap_used_in_bytes" : 31318506960,
      "heap_used_percent" : 91,
      "heap_committed_in_bytes" : 34359738368,
      "heap_max_in_bytes" : 34359738368,
      "non_heap_used_in_bytes" : 184341472,
      "non_heap_committed_in_bytes" : 225312768,
      "pools" : {
        "young" : {
          "used_in_bytes" : 10619977728,
          "max_in_bytes" : 0,
          "peak_used_in_bytes" : 11442061312,
          "peak_max_in_bytes" : 0
        },
        "old" : {
          "used_in_bytes" : 20195212752,
          "max_in_bytes" : 34359738368,
          "peak_used_in_bytes" : 34359738368,
          "peak_max_in_bytes" : 34359738368
        },
        "survivor" : {
          "used_in_bytes" : 503316480,
          "max_in_bytes" : 0,
          "peak_used_in_bytes" : 1442840576,
          "peak_max_in_bytes" : 0
        }
      }
    },
    "threads" : {
      "count" : 321,
      "peak_count" : 396
    },
    "gc" : {
      "collectors" : {
        "young" : {
          "collection_count" : 5986,
          "collection_time_in_millis" : 338246
        },
        "old" : {
          "collection_count" : 1528,
          "collection_time_in_millis" : 2796699
        }
      }
    },
    "buffer_pools" : {
      "mapped" : {
        "count" : 12572,
        "used_in_bytes" : 5553175857927,
        "total_capacity_in_bytes" : 5553175857927
      },
      "direct" : {
        "count" : 362,
        "used_in_bytes" : 1838783459,
        "total_capacity_in_bytes" : 1838783458
      }
    },
    "classes" : {
      "current_loaded_count" : 19650,
      "total_loaded_count" : 20695,
      "total_unloaded_count" : 1045
    }
  },
  "fs" : {
    "io_stats" : {
      "devices" : [
        {
          "device_name" : "dm-1",
          "operations" : 60515516,
          "read_operations" : 2721390,
          "write_operations" : 57794126,
          "read_kilobytes" : 324560152,
          "write_kilobytes" : 1584071716
        }
      ],
      "total" : {
        "operations" : 60515516,
        "read_operations" : 2721390,
        "write_operations" : 57794126,
        "read_kilobytes" : 324560152,
        "write_kilobytes" : 1584071716
      }
    }
  },
  "transport" : {
    "server_open" : 52,
    "rx_count" : 22198209,
    "rx_size_in_bytes" : 389066700224,
    "tx_count" : 22198256,
    "tx_size_in_bytes" : 294157797987
  },
  "http" : {
    "current_open" : 106,
    "total_opened" : 27478
  },
  "breakers" : {
    "request" : {
      "limit_size_in_bytes" : 20615843020,
      "limit_size" : "19.1gb",
      "estimated_size_in_bytes" : 0,
      "estimated_size" : "0b",
      "overhead" : 1.0,
      "tripped" : 0
    },
    "fielddata" : {
      "limit_size_in_bytes" : 20615843020,
      "limit_size" : "19.1gb",
      "estimated_size_in_bytes" : 2094144,
      "estimated_size" : "1.9mb",
      "overhead" : 1.03,
      "tripped" : 0
    },
    "in_flight_requests" : {
      "limit_size_in_bytes" : 34359738368,
      "limit_size" : "32gb",
      "estimated_size_in_bytes" : 3436,
      "estimated_size" : "3.3kb",
      "overhead" : 1.0,
      "tripped" : 0
    },
    "accounting" : {
      "limit_size_in_bytes" : 34359738368,
      "limit_size" : "32gb",
      "estimated_size_in_bytes" : 8326232996,
      "estimated_size" : "7.7gb",
      "overhead" : 1.0,
      "tripped" : 0
    },
    "parent" : {
      "limit_size_in_bytes" : 24051816857,
      "limit_size" : "22.3gb",
      "estimated_size_in_bytes" : 8328330576,
      "estimated_size" : "7.7gb",
      "overhead" : 1.0,
      "tripped" : 0
    }
  }
}

So, what am I missing? Why do I get poor performance at 200 shards per node, even with 32 GB of heap and “normally” sized shards on my indices?

Thanks in advance!

Your heap needs to be less than 32 GB in order to benefit from compressed object pointers. I believe the limit is around 30 GB or so, but it may vary. See this blog post for further details.
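
You can verify the cutoff for your exact JVM outside Elasticsearch, for example (the heap sizes here are just examples; the first typically reports false, the second true):

java -Xmx32768m -XX:+PrintFlagsFinal -version | grep UseCompressedOops
java -Xmx31744m -XX:+PrintFlagsFinal -version | grep UseCompressedOops

Elasticsearch also logs it at startup, in a line like "heap size [...], compressed ordinary object pointers [true]".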

For older indices that are no longer written to, you can also reduce heap usage by force-merging the shards down to a single segment and then making them read-only.
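
Something like this, run per index once it is no longer being written to (the index name is a placeholder; merge first, then add the write block):

POST /index_a-2021.06.01/_forcemerge?max_num_segments=1

PUT /index_a-2021.06.01/_settings
{
  "index.blocks.write": true
}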

You are also running a very old version, so you would benefit from upgrading to a more recent one.