I have an ES cluster split across two racks.
I want to use dedicated master-eligible nodes. I know the recommendation is to set minimum_master_nodes to N/2 + 1, where N is the number of master-eligible nodes, and I want to pick an adequate value for N. I've also read multiple recommendations to make N an odd number, but that doesn't seem to be an option with two racks.
What will happen if one of the racks is completely lost (or there's a network split)?
Scenario 1: The master is a node in the first rack. There's a network connectivity error between rack 1 and rack 2, or rack 2 crashes completely.
If I set N=4, the minimum will be 3. I think (correct me if I'm wrong) that this prevents split brain, because a master election on the second rack will not have quorum (only 2 master-eligible nodes). However, I'm not sure whether the nodes on the first rack will initiate a master election (there's a master alive, but no quorum).
If N=5, I would have 3 + 2 master-eligible nodes. The minimum would still be 3, so the same problem remains: only one of the two racks (the one with 3 nodes) can form a cluster. An odd number is not an option here.
If N=6 I would have 3 + 3, minimum would be 4 and there are the same problems... you get the idea.
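To make the arithmetic above concrete, here is a minimal Python sketch. The function names are mine (not an Elasticsearch API), and it assumes the N/2 + 1 rule uses integer division and that a rack cut off from the other can only count its own master-eligible nodes.

```python
def quorum(n):
    """minimum_master_nodes for n master-eligible nodes: floor(n / 2) + 1."""
    return n // 2 + 1


def racks_with_quorum(rack_sizes):
    """Indices of racks that still hold a quorum when isolated from the others."""
    need = quorum(sum(rack_sizes))
    return [i for i, size in enumerate(rack_sizes) if size >= need]


# The three layouts discussed above: 2 + 2, 3 + 2 and 3 + 3.
for layout in [(2, 2), (3, 2), (3, 3)]:
    n = sum(layout)
    print(layout, "N =", n, "minimum =", quorum(n),
          "racks with quorum alone:", racks_with_quorum(layout))
```

With two racks, at most one rack can ever hold a quorum by itself, and with an even split neither does.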
Scenario 2: The master is a node in the first rack. There's a complete outage of rack 1.
The surviving rack will never be able to elect a master because it will never have quorum.
What's the recommendation for N in a multiple-rack scenario? (having 3 racks maybe?)
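As a rough check of the three-rack idea, this Python sketch (the helper name is hypothetical, and it only does the quorum arithmetic, nothing Elasticsearch-specific) tests whether the remaining racks together still reach N/2 + 1 after any single rack is lost. It assumes the surviving racks can still talk to each other.

```python
def survives_single_rack_loss(rack_sizes):
    """True if the remaining master-eligible nodes reach quorum
    no matter which single rack is lost."""
    total = sum(rack_sizes)
    need = total // 2 + 1  # the N/2 + 1 rule with integer division
    return all(total - lost >= need for lost in rack_sizes)


# Two-rack layouts from the question, then two three-rack layouts.
for layout in [(2, 2), (3, 2), (3, 3), (1, 1, 1), (2, 2, 2)]:
    print(layout, "survives loss of any one rack:",
          survives_single_rack_loss(layout))
```

Under these assumptions, none of the two-rack layouts survives the loss of an arbitrary rack, while one or two master-eligible nodes in each of three racks does.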