AWS network issue and data loss

We are using Elasticsearch 2.4. All our nodes use AWS ephemeral disks. During a network connectivity issue, nodes may drop out of the cluster, and this may cause data loss. Is there a setting that can be used to prevent this situation? Our cluster is set up with zone awareness, so the shards are spread evenly across 3 AZs.

Why would this cause data loss?

@warkolm When two nodes leave at the same time and they host the primary and the replica of the same shard.

Then the best way to prevent that is to have 2 replicas.
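As a rough sketch of what that looks like in practice, the replica count can be changed on a live index through the update-settings API. The snippet below only builds the HTTP request with the Python standard library; the node address and the index name `my-index` are placeholders, not anything from this thread.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: address of any node in the cluster


def build_replica_request(index, replicas=2, base_url=ES_URL):
    """Build a PUT /<index>/_settings request that sets number_of_replicas."""
    body = {"index": {"number_of_replicas": replicas}}
    return urllib.request.Request(
        f"{base_url}/{index}/_settings",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# To actually send it (hypothetical index name):
# with urllib.request.urlopen(build_replica_request("my-index")) as resp:
#     print(resp.read().decode("utf-8"))
```

With 2 replicas there are three copies of every shard, so with zone awareness across 3 AZs each zone should be able to hold one copy. Expect roughly 50% more disk usage than with a single replica.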

Instead, is there a setting to wait for x minutes before reallocating shards to another node? Will this prevent data loss?

If the nodes that contain both primary and replica are gone, you cannot reallocate the data.
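For reference, the "wait before reallocating" behaviour asked about does exist as the index setting `index.unassigned.node_left.delayed_timeout`. It only postpones the rebuilding of missing replicas elsewhere, which saves shard shuffling when a node comes back quickly; it does not protect data when every copy of a shard is offline. A minimal sketch of setting it, assuming a node reachable at the placeholder address below and an illustrative value of `5m`:

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: address of any node in the cluster


def build_delayed_allocation_request(timeout="5m", index="_all", base_url=ES_URL):
    """Build a PUT /<index>/_settings request that delays replica reallocation
    after a node leaves the cluster."""
    body = {"settings": {"index.unassigned.node_left.delayed_timeout": timeout}}
    return urllib.request.Request(
        f"{base_url}/{index}/_settings",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# To apply it to all existing indices:
# with urllib.request.urlopen(build_delayed_allocation_request("5m")) as resp:
#     print(resp.read().decode("utf-8"))
```

Indices created later do not inherit a value set this way, so it would also need to go into an index template or be applied again.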

In that scenario what happens when the nodes rejoin?

Then things should return to green.
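One way to confirm that is to ask the cluster health API to block until the status is green. This is a sketch only: the node address, the default timeout, and the helper names are illustrative assumptions.

```python
import json
import urllib.error
import urllib.parse
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: address of any node in the cluster


def build_health_url(status="green", timeout="60s", base_url=ES_URL):
    """Build the URL for GET /_cluster/health that waits for a given status."""
    query = urllib.parse.urlencode({"wait_for_status": status, "timeout": timeout})
    return f"{base_url}/_cluster/health?{query}"


def wait_for_status(status="green", timeout="60s", base_url=ES_URL):
    """Return (status, timed_out) as reported by the cluster health API."""
    try:
        with urllib.request.urlopen(build_health_url(status, timeout, base_url)) as resp:
            health = json.loads(resp.read().decode("utf-8"))
    except urllib.error.HTTPError as err:
        # Assumption: depending on the version, a timed-out wait may come back
        # as an HTTP error whose body is still the health JSON.
        health = json.loads(err.read().decode("utf-8"))
    return health["status"], health["timed_out"]
```

If `timed_out` is true, the cluster did not reach the requested status within the timeout and the returned `status` (for example `red` or `yellow`) shows where it currently stands.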

Will there be any data consistency issues after the nodes rejoin?

There will be no state changes to those shards during that time, as the nodes are not part of the cluster.

You should really upgrade :slight_smile:

I agree we have to upgrade our es version. Thanks for your help on this.

I would also recommend consistent and frequent backing up by using the snapshot utility.
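A minimal sketch of the two snapshot API calls involved: registering a repository and taking a snapshot into it. The repository name, bucket name, and snapshot name used here are hypothetical, and on 2.4 the `s3` repository type is only available if the cloud-aws plugin is installed on the nodes.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: address of any node in the cluster


def build_register_repo_request(repo, bucket, base_url=ES_URL):
    """Build a PUT /_snapshot/<repo> request that registers an S3 repository."""
    body = {"type": "s3", "settings": {"bucket": bucket}}
    return urllib.request.Request(
        f"{base_url}/_snapshot/{repo}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def build_snapshot_request(repo, snapshot, base_url=ES_URL):
    """Build a PUT /_snapshot/<repo>/<snapshot> request that snapshots all
    indices and waits for the snapshot to finish."""
    return urllib.request.Request(
        f"{base_url}/_snapshot/{repo}/{snapshot}?wait_for_completion=true",
        method="PUT",
    )


# Hypothetical names; send each request with urllib.request.urlopen(...):
# build_register_repo_request("s3_backup", "my-es-snapshots")
# build_snapshot_request("s3_backup", "snapshot-1")
```

Snapshots are incremental, so running the second call on a schedule (for example from cron) is usually cheap after the first full copy. With ephemeral disks, this is the only copy of the data that survives losing every node holding a shard.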
