Hi,
I have done some reading about the split-brain scenario, but seem to have hit a dilemma.
E.g. for this structure:
machine1: 2 ES nodes (with rack id 1)
machine2: 2 ES nodes (with rack id 2)
In the above case, all the settings are default except for awareness, which I set so that the primary/replica shards are distributed across racks.
So if there is a network issue between the 2 machines, I will have a split-brain scenario, since min. master is 1.
I can set min. master to 3 to avoid split brain:
but ES will be inoperable if a network issue happens, or even
if 1 machine goes down, ES is also inoperable, thus losing HA.
If I set min. master to anything less than 3,
I will get split brain if there is a network issue between the 2 machines,
but if 1 machine goes down, I still have 1 machine which is working (HA).
I am not sure whether this is the right design or whether I am missing some better configs.
Any advice would be great!
You should set minimum_master_nodes to 3, as this will allow 1 node to go down. In order to build an HA cluster where a machine can be allowed to fail while still allowing the cluster to be fully operational, a minimum of 3 machines is required (this can however be much smaller, as it generally only needs to host a single, dedicated master node).
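To make the three-machine argument concrete, here is a minimal sketch that checks whether a layout still has a majority of master-eligible nodes after losing any single machine. It assumes every node is master-eligible and reads the reply above as meaning that the third machine hosts one dedicated master node; the name machine3 and the helper functions are hypothetical, for illustration only.

```python
def quorum(master_eligible: int) -> int:
    """Majority of master-eligible nodes: total / 2 + 1 (integer division)."""
    return master_eligible // 2 + 1


def survives_any_single_machine_loss(nodes_per_machine: dict) -> bool:
    """True if the remaining nodes still reach quorum whichever machine fails."""
    total = sum(nodes_per_machine.values())
    needed = quorum(total)
    return all(total - count >= needed for count in nodes_per_machine.values())


# Layout from the question: 2 machines with 2 ES nodes each.
two_machines = {"machine1": 2, "machine2": 2}

# Assumed layout: a small third machine with one dedicated master node.
three_machines = {"machine1": 2, "machine2": 2, "machine3": 1}

for name, layout in (("two machines", two_machines), ("three machines", three_machines)):
    total = sum(layout.values())
    print(name, "quorum:", quorum(total),
          "survives one machine down:", survives_any_single_machine_loss(layout))
```

With two machines the quorum is 3 of 4, so losing either machine leaves only 2 nodes. With the assumed third machine the quorum is 3 of 5, and losing any one machine leaves at least 3.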
# Prevent the "split brain" by configuring the majority of nodes (total number of master-eligible nodes / 2 + 1):
#
discovery.zen.minimum_master_nodes: 3
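The formula in the comment above can be checked with a few lines of arithmetic. This is only a sketch: it assumes the division is integer division, and the function name is made up for illustration.

```python
def quorum(master_eligible: int) -> int:
    """total number of master-eligible nodes / 2 + 1, using integer division."""
    return master_eligible // 2 + 1


# For each cluster size, print the quorum and how many node failures it tolerates.
for n in (2, 3, 4, 5):
    print(f"{n} master-eligible nodes: quorum {quorum(n)}, "
          f"can lose {n - quorum(n)} node(s)")
```

For the 4 nodes in this thread the result is 3, which matches the setting above and leaves room for exactly 1 node to fail.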
The total number of master-eligible nodes means all ES nodes that have the below config.
"You should set minimum_master_nodes to 3, as this will allow 1 node to go down. "
Yes, this is the best it can do I guess, because if one more node goes down, ES will be inoperable since the min. master of 3 is not met.
"In order to build an HA cluster where a machine can be allowed to fail while still allowing the cluster to be fully operational, a minimum of 3 machines is required (this can however be much smaller, as it generally only needs to host a single, dedicated master node)."
So basically, for these 2 machines with 2 nodes each, it's not really an ideal implementation if we need HA and want to avoid split brain at the same time? What do you mean by (much smaller)?
Hi, yes, I understand that variable, but I will lose ES if one machine goes down, since min. master is not met. That avoids split brain, but causes ES to stop working.