Elasticsearch 7.1.x + Java 11: Possible GC misconfiguration

Hey guys,

We're facing some problems with instances going down from time to time with "exiting
java.lang.OutOfMemoryError: Java heap space".

Currently:

  • Elasticsearch 7.1.1
  • Java 11
  • G1GC and default jvm.options configs
  • 2.5b docs.
  • 1 index with 40 shards (20 p + 20 r)
  • 2.2TB in primary data.
  • 12 instances with 32gb RAM each, 16gb allocated to the Elasticsearch heap (50%)
  • cluster in AWS
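
For a rough sense of scale, the figures in the list can be turned into per-shard and per-node numbers. This is only a back-of-the-envelope sketch: it assumes data and shards are spread evenly across the cluster and reads "2.2TB" as 2200 GB, neither of which is stated above.

```python
# Figures taken from the list above.
nodes = 12
heap_per_node_gb = 16
primary_shards = 20
replica_shards = 20
primary_data_gb = 2200          # assumption: "2.2TB" read as 2200 GB
total_docs = 2_500_000_000      # "2.5b docs"

# Derived values, assuming an even spread (not confirmed in the post).
total_heap_gb = nodes * heap_per_node_gb
gb_per_primary_shard = primary_data_gb / primary_shards
docs_per_primary_shard = total_docs / primary_shards
shards_per_node = (primary_shards + replica_shards) / nodes

print(f"total heap in cluster:  {total_heap_gb} GB")
print(f"data per primary shard: {gb_per_primary_shard:.0f} GB")
print(f"docs per primary shard: {docs_per_primary_shard:,.0f}")
print(f"shards per node:        {shards_per_node:.2f}")
```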

In the logs, we can also see these messages from the GC:

[gc][2072844] overhead, spent [6s] collecting in the last [6.1s]

[old][2072844][25] duration [6s], collections [1]/[6.1s], total [6s]/[2.2m], memory [15.3gb]->[15gb]/[16gb], all_pools {[young] [8mb]->[0b]/[0b]}{[old] [15.3gb]->[15gb]/[16gb]}{[survivor] [0b]->[0b]/[0b]}
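
To make the line above easier to read: the `memory [before]->[after]/[max]` part says how much heap the collection actually freed. A minimal sketch that pulls those three values out and computes the difference follows. The function names are made up for illustration, and it assumes the sizes use binary multiples.

```python
import re

# Size suffixes as they appear in the log line (assumed to be binary multiples).
_UNITS = {"b": 1, "kb": 1024, "mb": 1024**2, "gb": 1024**3, "tb": 1024**4}


def to_bytes(text):
    """Convert a size such as '15.3gb' or '0b' to a number of bytes."""
    match = re.fullmatch(r"([\d.]+)([kmgt]?b)", text.strip().lower())
    if match is None:
        raise ValueError(f"unrecognized size: {text!r}")
    return float(match.group(1)) * _UNITS[match.group(2)]


def summarize_gc_line(line):
    """Read the first 'memory [before]->[after]/[max]' group of a GC log line."""
    match = re.search(r"memory \[([^\]]+)\]->\[([^\]]+)\]/\[([^\]]+)\]", line)
    if match is None:
        raise ValueError("no 'memory [before]->[after]/[max]' section found")
    before, after, maximum = (to_bytes(group) for group in match.groups())
    return {
        "reclaimed_gb": (before - after) / _UNITS["gb"],
        "used_after_pct": 100.0 * after / maximum,
    }


line = ("[old][2072844][25] duration [6s], collections [1]/[6.1s], "
        "total [6s]/[2.2m], memory [15.3gb]->[15gb]/[16gb]")
summary = summarize_gc_line(line)
print(f"reclaimed {summary['reclaimed_gb']:.2f} GB, "
      f"heap {summary['used_after_pct']:.2f}% full after the collection")
```

For this line, a 6s old-generation collection freed roughly 0.3gb and left the heap at about 15gb of 16gb.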

When an instance hits "OutOfMemoryError: Java heap space" and the service stops, I see that the GC count and GC time had increased for that instance.
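
One way to watch GC count and time per node before a crash is the node stats API (`GET /_nodes/stats/jvm`). The sketch below only summarizes an already-parsed response. The helper name and the sample values are made up for illustration, and it assumes the old-generation collector is reported under the key `old`, as in the log line above.

```python
def summarize_jvm_stats(stats):
    """Return per-node heap usage and old-generation GC counters.

    `stats` is the parsed JSON body of GET /_nodes/stats/jvm.
    """
    rows = []
    for node_id, node in stats["nodes"].items():
        jvm = node["jvm"]
        old = jvm["gc"]["collectors"]["old"]
        rows.append({
            "node": node.get("name", node_id),
            "heap_used_percent": jvm["mem"]["heap_used_percent"],
            "old_gc_count": old["collection_count"],
            "old_gc_seconds": old["collection_time_in_millis"] / 1000.0,
        })
    # Nodes closest to a full heap come first.
    return sorted(rows, key=lambda row: row["heap_used_percent"], reverse=True)


# Hypothetical, trimmed-down response with made-up values (not real cluster data).
sample = {
    "nodes": {
        "id-a": {"name": "node-a", "jvm": {
            "mem": {"heap_used_percent": 61},
            "gc": {"collectors": {"old": {
                "collection_count": 3, "collection_time_in_millis": 900}}}}},
        "id-b": {"name": "node-b", "jvm": {
            "mem": {"heap_used_percent": 94},
            "gc": {"collectors": {"old": {
                "collection_count": 25, "collection_time_in_millis": 132000}}}}},
    }
}

rows = summarize_jvm_stats(sample)
for row in rows:
    print(row)
```

Against a live cluster, the input could come from something like `json.load(urllib.request.urlopen("http://localhost:9200/_nodes/stats/jvm"))`, where the host and port are assumptions. Sampling it periodically would show which node's old-generation counters climb first.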

Also, we run some tasks every day that generate stats from docs and can cover up to the last 30 days of data. When these tasks run, the same problem sometimes occurs, but not always.

Do you guys have any clue where I can go from here to debug this?

Hi @Bruno_Andrade,

the first stop, I think, would be to load the heap dump produced by this into a tool and examine which objects you have. I can recommend Eclipse Memory Analyzer, but there are plenty of other tools available too (JProfiler, YourKit, etc.).
