Elasticsearch split-brain mechanism?

Hi,

I went through the split-brain documentation on elastic.co and have a few doubts about it:

Scenario 1:
Two master-eligible nodes, minimum_master_nodes = 1

  1. "Any data that has been written to the restarted node will be lost." What does this statement mean? Is the data already present on the node (i.e. cache, translog buffer, etc.) lost, or is the data that was sent while the node was restarting lost?

Scenario 2:
Three master-eligible nodes, minimum_master_nodes = 2

  2. "As soon as the network split is resolved, the single node will rejoin the cluster and start serving requests again." Does this mean that the single node also needs to be restarted to rejoin the cluster, so that the data present on that node will be lost as well? If so, how can we say that the above configuration avoids split brain?

Please correct me if I am wrong.

Thanks

In order to safely accept writes, a cluster must have a master node elected by a majority of master-eligible nodes (to prevent multiple masters from being elected). If you only have 2 master-eligible nodes, a majority decision requires both nodes to be present, which is why 3 master-eligible nodes are always recommended. If you have a split brain with 2 masters, each master will promote its local shard copies to primary and accept writes, which means the shard copies may start to diverge. When the cluster comes back together, one of these shard copies will be picked as primary and replicated, so writes accepted only by the other copy are lost.
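As a rough sketch of the majority rule described above, the number of master-eligible nodes needed to elect a master is floor(N / 2) + 1. The helper name `quorum` is made up for illustration and is not part of any Elasticsearch API:

```python
def quorum(master_eligible_nodes: int) -> int:
    """Smallest node count that is a strict majority of the master-eligible nodes."""
    if master_eligible_nodes < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible_nodes // 2 + 1


# Majority needed for the two cluster sizes discussed in this thread.
for n in (2, 3):
    print(f"{n} master-eligible nodes -> majority of {quorum(n)}")
```

With 2 master-eligible nodes the majority is 2, so losing either node blocks master election. With 3 the majority is still 2, so the cluster can tolerate losing one node.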

If a node goes down or is partitioned off in a cluster with 3 master-eligible nodes, the side of the cluster with only 1 master-eligible node will not be able to elect a master and will therefore not accept writes. The other side of the cluster can elect a master and continue serving reads and writes. Shards on this side will be promoted to primary, so as long as you have at least one replica shard you should not lose data. Once the cluster comes back together, primary shards will be copied over to the node that was separated, and it will be brought up to date.
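To make the two scenarios from the question concrete, here is a minimal sketch of which side of a network partition could elect a master. It is a simplified model that only counts master-eligible nodes per side, not how Elasticsearch discovery is actually implemented, and the function names are hypothetical:

```python
from typing import List


def partition_outcome(sides: List[int], minimum_master_nodes: int) -> List[bool]:
    """For each side of a partition, report whether it has enough
    master-eligible nodes to elect a master."""
    return [count >= minimum_master_nodes for count in sides]


def has_split_brain(sides: List[int], minimum_master_nodes: int) -> bool:
    """Split brain means more than one side can elect its own master."""
    return sum(partition_outcome(sides, minimum_master_nodes)) > 1


# Scenario 1: 2 master-eligible nodes, minimum_master_nodes = 1, split 1 | 1.
print(has_split_brain([1, 1], minimum_master_nodes=1))

# Scenario 2: 3 master-eligible nodes, minimum_master_nodes = 2, split 2 | 1.
print(has_split_brain([2, 1], minimum_master_nodes=2))
```

In scenario 1 both sides can elect a master, which is the split brain. In scenario 2 only the two-node side can, so the isolated node stops accepting writes and has nothing that diverges when it rejoins.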

  1. Both.
  2. It should keep trying to rejoin automatically, but you can force a restart. However, that node won't accept requests, so it cannot index new data.
