Primary and Replica Shard Sync

Let us take a sample scenario in an Azure Elasticsearch setup, with a shard, node, and index allocation mapping something like the one below:

During a planned Azure update, here is the list of things that will happen:

A couple of questions:

  1. Will the contents of M2's shards that were updated while M1 was down be synced to M1's shards?
  2. Will the contents of M1's shards that were updated while M2 was down be synced to M2's shards?

Ping.

If you only have two nodes in your cluster and both are master eligible, you should have minimum_master_nodes set to 2. This means that when the first node goes down, Elasticsearch would stop accepting indexing requests as no master can be elected. The primary and the replica should therefore have the same data.
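To make the arithmetic behind that recommendation concrete, here is a minimal sketch. It assumes the usual majority rule for minimum_master_nodes, (master-eligible nodes / 2) + 1; the function names are hypothetical and only illustrate the calculation.

```python
def minimum_master_nodes(master_eligible: int) -> int:
    """Majority of master-eligible nodes: floor(n / 2) + 1 (assumed rule)."""
    if master_eligible < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible // 2 + 1


def can_elect_master(master_eligible: int, nodes_up: int) -> bool:
    """True if enough master-eligible nodes are still up to form a majority."""
    return nodes_up >= minimum_master_nodes(master_eligible)


# Two master-eligible nodes: losing one leaves no majority.
print(minimum_master_nodes(2), can_elect_master(2, 1))
# Three master-eligible nodes: losing one still leaves a majority.
print(minimum_master_nodes(3), can_elect_master(3, 2))
```

With two master-eligible nodes the majority is 2, so a single surviving node cannot elect a master, which is why writes stop in that setup.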

If, on the other hand, you had 3 (or more) nodes in the cluster, it would be possible to elect a master after the first node went offline, and Elasticsearch would then relocate the missing shard to a node that does not already contain it. It could then continue taking writes.

Thanks @Christian_Dahlqvist

I intentionally didn't talk about master nodes... Assume these are just 2 data nodes in the ES cluster and master/client nodes are different ones.

Assuming you only have 2 data nodes and separate dedicated master nodes so that a master is available at all times, I believe Elasticsearch still will not accept the write while one of the data nodes is down, as it requires a quorum of shards to be available.

In this case, the replica count is 1, so the quorum is 1 and not 2. If the primary is available, indexing will succeed.

Note, for the case where the number of replicas is 1 (total of 2 copies of the data), then the default behavior is to succeed if 1 copy (the primary) can perform the write.
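The rule quoted above can be sketched as a small calculation. This assumes the older write consistency behaviour with the default quorum setting, where the quorum is int((primary + number_of_replicas) / 2) + 1 and is only enforced when there are more than 2 copies in total. The function name is hypothetical.

```python
def required_active_copies(number_of_replicas: int) -> int:
    """Shard copies that must be active for a write under the assumed quorum rule."""
    if number_of_replicas < 0:
        raise ValueError("number_of_replicas cannot be negative")
    total_copies = 1 + number_of_replicas  # one primary plus its replicas
    if total_copies <= 2:
        # Special case from the quoted note: with 1 replica (2 copies),
        # the primary alone is enough.
        return 1
    return total_copies // 2 + 1


for replicas in range(4):
    print(replicas, "replica(s) ->", required_active_copies(replicas), "active copies required")
```

So with 1 replica the primary alone satisfies the check, while with 2 replicas two of the three copies must be active.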

If that is the case you would probably run the risk of losing some data if you do not let the cluster settle into a green state before taking the next data node down.
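One way to let the cluster settle before touching the next data node is to block on the cluster health API until it reports green. The following is only a sketch: the base URL and helper names are assumptions, and it relies on the `_cluster/health` endpoint with its `wait_for_status` and `timeout` parameters.

```python
import json
import urllib.error
import urllib.parse
import urllib.request


def health_url(base_url: str, timeout: str = "60s") -> str:
    """Build a cluster health URL that blocks until green or until the timeout."""
    query = urllib.parse.urlencode({"wait_for_status": "green", "timeout": timeout})
    return f"{base_url.rstrip('/')}/_cluster/health?{query}"


def is_green(health: dict) -> bool:
    """True only if the response reports green and the wait did not time out."""
    return health.get("status") == "green" and not health.get("timed_out", False)


def wait_for_green(base_url: str, timeout: str = "60s") -> bool:
    """Ask the cluster (base_url is an assumed address) to wait for green status."""
    try:
        with urllib.request.urlopen(health_url(base_url, timeout)) as resp:
            return is_green(json.load(resp))
    except urllib.error.HTTPError as err:
        # Some versions answer a timed-out wait with HTTP 408.
        if err.code == 408:
            return False
        raise
```

A maintenance script could call `wait_for_green("http://localhost:9200")` after the first node rejoins and only proceed with the second node once it returns True.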

Agreed. I would like to know how ES resolves the data inconsistency between shards in this case.

Elasticsearch will take one of the shards, so any data written only to the other will be lost.

Thanks @Christian_Dahlqvist !

FYI we are all volunteers here, even those that work for Elastic. If you want SLA based response times then you should look at a Subscription with Elastic.

Otherwise, please be patient and respect your fellow community members.

Point noted. Thanks.