Garbage Collection Not Working

Hi,

We have a 4-node cluster with "heap_size": "29500m", and we're consistently seeing heap usage stay above 80% for hours, reaching up to 99%, with no garbage collection taking place. Does anyone know why GC would not be running? Our current workaround is to restart the Elasticsearch service on each node, after which heap usage goes back to normal.

Side Note: We haven't been able to take a heap dump when usage is this high because we need this cluster to be available.

Hi,

please always include version information and ideally also your relevant configuration (i.e. which garbage collector are you using?). Assuming that you have garbage collection logs configured (they're on by default since Elasticsearch 6.2.0), I suggest you inspect them. In the standard configuration (CMS with -XX:CMSInitiatingOccupancyFraction=75), a collection of the old generation is triggered once it is 75% full.
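
If you want to see how close each node is to that threshold while the problem is occurring, the cat nodes API reports per-node heap usage (a minimal example; I'm assuming Elasticsearch is reachable on localhost:9200, so adjust the host as needed):

curl -s "http://localhost:9200/_cat/nodes?v&h=name,heap.percent,heap.max,ram.percent"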

Daniel

Version is 6.6.1, and we haven't modified any settings relevant to GC. Our settings (minus some names) are below.

I tried to look for GC logs but can't find any; I only see the regular Elasticsearch logs in the logging location.

path.data: "/var/lib/elasticsearch"
path.logs: "/var/log/elasticsearch"
node.attr.rack: []
network.host:
- _local_
- _site_
bootstrap.memory_lock: true
action.destructive_requires_name: true
node.master: true
node.data: true
node.ingest: true
indices.query.bool.max_clause_count: 2048
discovery.zen.ping_timeout: 30s
discovery.zen.no_master_block: write
discovery.zen.minimum_master_nodes: 1
discovery.zen.ping.unicast.hosts.resolve_timeout: 5s
gateway.expected_nodes: 0
gateway.expected_master_nodes: 0
gateway.expected_data_nodes: 0
gateway.recover_after_time: 5m
index.store.preload:
- "*"
index.store.type: mmapfs
thread_pool.analyze.queue_size: '200'
http.max_content_length: 500mb

and jvm.options

-Xms29500m
-Xmx29500m
-Djava.io.tmpdir=/var/data/elasticsearch/tmp
-Dlog4j2.disable.jmx=true

Am I correct in thinking that "-Dlog4j2.disable.jmx=true" is the reason I don't see GC logs?

Hi,

GC logs are configured in config/jvm.options and the jvm.options that you've mentioned in your post above don't include any GC-related settings. Out of the box, Elasticsearch uses the CMS garbage collector. Can you please paste the output of the nodes info API?

curl -s "http://YOUR_ES_HOST:9200/_nodes/jvm?pretty"

No. That property only disables Log4j2's JMX integration; it has nothing to do with GC logging.

You need to ensure that the following lines are present in config/jvm.options (source):

## JDK 8 GC logging

8:-XX:+PrintGCDetails
8:-XX:+PrintGCDateStamps
8:-XX:+PrintTenuringDistribution
8:-XX:+PrintGCApplicationStoppedTime
8:-Xloggc:/var/log/elasticsearch/gc.log
8:-XX:+UseGCLogFileRotation
8:-XX:NumberOfGCLogFiles=32
8:-XX:GCLogFileSize=64m

# JDK 9+ GC logging
9-:-Xlog:gc*,gc+age=trace,safepoint:file=/var/log/elasticsearch/gc.log:utctime,pid,tags:filecount=32,filesize=64m

After you've restarted Elasticsearch, you should see a file gc.log in /var/log/elasticsearch.
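
To double-check that logging and rotation are working after the restart, you can look for the log files directly (paths as configured above; adjust them if yours differ):

ls -lh /var/log/elasticsearch/gc.log*
tail -f /var/log/elasticsearch/gc.log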

Daniel

Hi,

in the output that you've shared privately with me, we can see:

        "gc_collectors" : [
          "PS Scavenge",
          "PS MarkSweep"
        ]

This means that the parallel garbage collector (Parallel Scavenge / Parallel Old) is in use, not the CMS collector that Elasticsearch normally configures. Since your jvm.options doesn't specify a collector, the JVM falls back to its own default, which on JDK 8 is the parallel collector. The provided JVM options are (I've omitted any system properties):

        "input_arguments" : [
          "-Xms29500m",
          "-Xmx29500m"
        ]

This is very minimal (even the garbage collector is unspecified), so it appears to me that you have a very non-standard jvm.options file. I suggest you compare your jvm.options file with Elasticsearch's default jvm.options file for version 6.6 and add any missing lines. Note that this file is only a template, so make sure you replace any placeholders (e.g. ${heap.dump.path}) with proper paths on your system (see the comments in the file for guidance).
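
For orientation, the GC-related defaults in that file look roughly like the following (an excerpt from memory rather than the complete template, so please verify against the actual 6.6 file; the heap dump path below is only an example substitution for ${heap.dump.path}):

## GC configuration
-XX:+UseConcMarkSweepGC
-XX:CMSInitiatingOccupancyFraction=75
-XX:+UseCMSInitiatingOccupancyOnly

## generate a heap dump when an allocation from the Java heap fails
-XX:+HeapDumpOnOutOfMemoryError
-XX:HeapDumpPath=/var/lib/elasticsearch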

After you've done this, you should have:

  • The correct system properties set (you're currently missing several, e.g. -Dio.netty.noUnsafe=true)
  • Garbage collection logs enabled, and the garbage collector explicitly configured rather than implicitly chosen by the JVM (the latter could bite you when upgrading the JVM); you can verify the applied options after the restart, as sketched below.
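
To verify that the new options were actually picked up, you can query each node's JVM input arguments after the restart (again assuming localhost:9200; filter_path just trims the response to the relevant part):

curl -s "http://localhost:9200/_nodes/jvm?filter_path=nodes.*.jvm.input_arguments&pretty"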

If in doubt you can always share any files or output privately.

Daniel
