ECK Sudden rise in data disk usage

We deployed Elasticsearch on Kubernetes (ECK) in GKE, with 2 GB of memory and a 1 GB persistent disk.

We got an out-of-storage exception. After that we increased the disk to 2 GB, and by the next day it had already reached 2 GB, even though we hadn't run any big queries. We then increased the persistent disk size to 10 GB, and since then the data disk usage has not grown any further.

On further analysis we found that all indices together take only about 20 MB, so we are unable to work out what is actually occupying the disk.

We used the Elasticsearch nodes stats API to get disk and node statistics.

I am unable to find the exact reason why the disk usage exceeded the available space or what data is on the disk. Please also suggest ways to prevent this in the future.

Judging from your screenshot I suspect your instance ran out of memory a few times and a JVM heap dump was taken. This is because Elasticsearch runs by default with the options -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=data, which means it will take a heap dump when it runs out of memory and store it in the data directory. With only 1 GB of disk space and 2 GB of memory, a single one of these heap dumps is enough to fill up your data directory.
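To confirm this, you can check the data directory for heap dump files (typically named java_pid<pid>.hprof). Assuming the official Elasticsearch image, whose data path is /usr/share/elasticsearch/data, and using the quickstart pod name from the example below as a placeholder:

kubectl exec quickstart-es-default-0 -- ls -lh /usr/share/elasticsearch/data

A large .hprof file there would explain the sudden jump in disk usage.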

How to avoid this problem? There are multiple approaches:

  • Make sure there is enough space on disk for the heap dumps, or define an alternative path for them, e.g. on an emptyDir volume, and configure that path via -XX:HeapDumpPath (see the sketch after this list)
  • Turn off heap dump creation on out-of-memory errors, e.g. with something like:
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: quickstart
spec:
  version: 7.9.2
  nodeSets:
  - name: default
    count: 1
    podTemplate:
      spec:
        containers:
        - name: elasticsearch
          env:
          - name: ES_JAVA_OPTS
            value: -XX:-HeapDumpOnOutOfMemoryError
  • Make sure your Elasticsearch instance does not run out of memory to begin with by giving it more memory, as described in the ECK docs (also shown in the sketch below)
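
For reference, here is a minimal sketch that combines the first and third options: heap dumps are redirected to an emptyDir scratch volume and the container gets more memory. The volume name, mount path, and the 4Gi/2g sizes are illustrative assumptions, not recommendations, so adjust them to your workload:

apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: quickstart
spec:
  version: 7.9.2
  nodeSets:
  - name: default
    count: 1
    podTemplate:
      spec:
        containers:
        - name: elasticsearch
          env:
          # write heap dumps to the scratch volume instead of the data directory,
          # and size the heap to roughly half of the container memory
          - name: ES_JAVA_OPTS
            value: -Xms2g -Xmx2g -XX:HeapDumpPath=/usr/share/elasticsearch/heap-dumps
          resources:
            requests:
              memory: 4Gi
            limits:
              memory: 4Gi
          volumeMounts:
          - name: heap-dumps
            mountPath: /usr/share/elasticsearch/heap-dumps
        volumes:
        - name: heap-dumps
          emptyDir: {}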

@pebrc thanks for the reply.
What is the purpose of this heap dump and is it good to disable it?

It is a debugging tool that allows you to do a post-mortem analysis of a JVM process (like Elasticsearch) to answer questions like "what took up all the memory that led to the OOM event" or similar.

Is it good to disable it? It depends. If you disable this setting and your Elasticsearch instance runs out of memory, you will not be able to get a heap dump after the fact. You can of course reinstate the setting and capture a heap dump the next time around if the out-of-memory situation happens repeatedly.

In general, out-of-memory situations should be less likely on recent versions of Elasticsearch, thanks to additional safeguards such as circuit breakers that prevent operations from using more than the available amount of memory.

Hope that helps.