JVM heap usage over 85% during snapshots

Hello :slight_smile:

I have an issue with Elasticsearch and JVM heap usage during snapshots.

My cluster has 7 nodes (2 CPUs and 7.5 GB of RAM per node, local SSD storage) holding about 220 million documents, for a total of 377 GB of data (as reported by Kibana).
I'm running version 5.5.2 (I know it is EOL, but that's what I'm stuck with).

I'm currently implementing hourly snapshots to an S3 bucket.

While everything is working fine, during the snapshot the JVM heap rises over 85%, triggering our monitoring system.
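For anyone who wants to check the same thing outside their monitoring system, here is a minimal sketch. It assumes you have already fetched and parsed the JSON returned by `GET _nodes/stats/jvm`, which reports `jvm.mem.heap_used_percent` per node. The function name, the sample node names, and the sample values are hypothetical. The 85% threshold is the one from this post.

```python
def nodes_over_threshold(stats, threshold_percent=85):
    """Return (node name, heap_used_percent) pairs at or above the threshold.

    `stats` is the parsed JSON body of `GET _nodes/stats/jvm`.
    """
    hot = []
    for node_id, node in stats.get("nodes", {}).items():
        percent = node["jvm"]["mem"]["heap_used_percent"]
        if percent >= threshold_percent:
            # Fall back to the node id if the response carries no name.
            hot.append((node.get("name", node_id), percent))
    # Highest heap usage first.
    return sorted(hot, key=lambda item: item[1], reverse=True)


# Hypothetical, heavily trimmed sample of a node stats response.
sample = {
    "nodes": {
        "a1": {"name": "es-node-1", "jvm": {"mem": {"heap_used_percent": 88}}},
        "b2": {"name": "es-node-2", "jvm": {"mem": {"heap_used_percent": 61}}},
    }
}

print(nodes_over_threshold(sample))
```

Polling this every few seconds while a snapshot runs would show whether all nodes spike together or only some of them.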

I have set the heap size to 50% of my available RAM, as the manual recommends, so I think the configuration is fine.
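To put numbers on that, here is a rough back-of-the-envelope calculation. It assumes the heap is exactly 50% of the 7.5 GB per node and that the alert fires at 85% of the heap. Treating 1 GB as 1024 MB is my simplification.

```python
ram_gb = 7.5           # RAM per node, from the cluster description
heap_fraction = 0.50   # heap set to 50% of RAM
alert_fraction = 0.85  # monitoring threshold on heap usage

heap_gb = ram_gb * heap_fraction
alert_gb = heap_gb * alert_fraction
# Heap left between the alert threshold and a completely full heap.
headroom_mb = (heap_gb - alert_gb) * 1024

print(f"heap size:       {heap_gb:.2f} GB")
print(f"alert fires at:  {alert_gb:.2f} GB")
print(f"headroom left:   {headroom_mb:.0f} MB")
```

So once the alert fires, each node has a bit under 600 MB of heap left before it is full.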

I attached a capture of one node during the snapshot.

I'm struggling to find a way to prevent these JVM heap spikes.

Is it dangerous (can it crash a node or the cluster)?
What can I do about this?

Thanks for your help :slight_smile:

Regards

Hi there,

Any advice? :slight_smile:

Hi,

I still have no clue what's happening. Has no one experienced this behavior?

I've run into issues like yours and I'm still reading up on it. I think we need to explicitly tell Elasticsearch to clean up the JVM heap after snapshots.

Yes, RAM usage returns to normal after the snapshot, but what worries me is the RAM usage while the snapshot is running.

Since it reaches more than 85%, my fear is that it will kill the node or slow down the whole cluster.
I have already seen our cluster slow to a halt because of a single unhealthy node, so I'm not very comfortable with the risk of this happening.

What are my options for hourly backups of our cluster if snapshotting is a risk?
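One knob that may be worth testing is the repository-level `max_snapshot_bytes_per_sec` setting, which throttles the per-node snapshot rate. The sketch below only builds the request body for registering an S3 repository with a lower throttle. Whether a lower rate actually reduces the heap spike is an untested assumption, and the helper name, the bucket name, and the `20mb` value are all placeholders.

```python
import json


def s3_repository_body(bucket, max_bytes_per_sec="20mb"):
    """Build the JSON body for registering an S3 snapshot repository.

    The throttle value is an arbitrary placeholder, not a recommendation.
    """
    return {
        "type": "s3",
        "settings": {
            "bucket": bucket,
            # Per-node snapshot throttle; lower means slower snapshots.
            "max_snapshot_bytes_per_sec": max_bytes_per_sec,
        },
    }


# "my-snapshot-bucket" is a hypothetical bucket name.
body = s3_repository_body("my-snapshot-bucket")
print(json.dumps(body, indent=2))
```

The resulting body would be sent with `PUT _snapshot/<repository name>`. Comparing heap graphs before and after the change would show whether it helps.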
