Elasticsearch on Kubernetes: master nodes restart when a snapshot is triggered

Elasticsearch on Kubernetes restarts when a snapshot is triggered.

I am getting the error OutOfMemoryError: Java heap space. My Elasticsearch cluster has
2 master nodes, 1 data node, and 1 voting-only node.

The cluster has 850 indices and 50 open indices.

When I try to delete the snapshot that was triggered, the 2 Elasticsearch master nodes keep restarting with the same error, and the snapshot is not deleted either.

An Elasticsearch snapshot is triggered every 30 minutes.

Error:
{"type": "server", "timestamp": "2021-04-21T04:32:37,271Z", "level": "ERROR", "component": "o.e.b.ElasticsearchUncaughtExceptionHandler", "cluster.name": "elasticsearch", "node.name": "elasticsearch-es-master-1", "message": "fatal error in thread [elasticsearch[elasticsearch-es-master-1][snapshot][T#1]], exiting",
"stacktrace": ["java.lang.OutOfMemoryError: Java heap space",

Hey @rajkumar_25, thanks for your question.

How many resources do your Elasticsearch containers have now? It might be that creating a snapshot pushes the process's memory needs beyond what is available. Did you try increasing the memory requests and seeing if the restarts still happen? You can check out our docs to see how to do that.

Thanks,
David

@dkow Thanks, our resources are configured as follows:

master: 2 nodes (2 GB per node), 1 GB JVM
voting-only master: 1 node (2 GB), 1 GB JVM
data node: 1 node (2 GB), 1 GB JVM

Do we still need to increase them?

Also, our indices hold only a small amount of data, in the KB range.

It's difficult to say without knowing exactly what you have and where. You did mention that the number of indices is substantial. At the same time, the JVM heap size is fairly small. I'd say it's worth trying to increase the memory and checking whether you still see the issue.

Thanks,
David

@dkow Thanks

As you suggested, we have increased the memory and it works fine.

Our current configuration:

master: 2 nodes (4 GB per node), 2 GB JVM
voting-only master: 1 node (2 GB), 1 GB JVM
data node: 1 node (2 GB), 1 GB JVM

We have 917 shards, each smaller than 10 MB. Why does this require 2 GB of JVM heap memory?

Each index has one primary and one replica.

You have too many shards for a single small data node. The recommendation is to have fewer than 20 shards per GB of heap memory, and your shards are also too small.

Take a look at this blog post that explains how the number and size of shards can impact your cluster.
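
To make the gap concrete, here is a rough sketch of that rule of thumb applied to the numbers from this thread (917 shards, 1 GB of heap on the data node). The function names are made up for illustration, and it assumes every shard counts against the single data node's heap, which the thread does not confirm. Treat the result as an order-of-magnitude check, not a sizing formula.

```python
# Rule of thumb quoted above: fewer than 20 shards per GB of JVM heap.
SHARDS_PER_GB_HEAP = 20


def max_recommended_shards(heap_gb: float, shards_per_gb: int = SHARDS_PER_GB_HEAP) -> int:
    """Upper bound on shard count for a node with the given heap (hypothetical helper)."""
    return int(heap_gb * shards_per_gb)


def heap_needed_gb(shard_count: int, shards_per_gb: int = SHARDS_PER_GB_HEAP) -> float:
    """Heap the rule of thumb would call for to hold shard_count shards (hypothetical helper)."""
    return shard_count / shards_per_gb


# Numbers taken from this thread.
shards = 917
data_node_heap_gb = 1

limit = max_recommended_shards(data_node_heap_gb)
needed = heap_needed_gb(shards)

print(f"Recommended maximum for {data_node_heap_gb} GB heap: {limit} shards")
print(f"Heap the guideline implies for {shards} shards: {needed:.2f} GB")
print(f"Current shard count is about {shards / limit:.0f}x the guideline")
```

Even if the estimate is off by a wide margin, it suggests that reducing the shard count (for example, by combining many tiny indices into fewer, larger ones) is likely to help more than adding heap alone.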