Is there some reason why you can't use a 3rd master only node? The whole point is to not get into a split brain situation because you'll probably end up losing data once the cluster re-joins.
As to why the cluster hasn't reformed, you'll need to provide the log files about any errors. Telnet to port 9300 to each node from the other and check that this port is open.
If minimum_master_nodes was set to 2, then there would be no cluster and
not two individual one node clusters.
It appears that the nodes are not broadcasting discovery events since it
always has been online. It waits for network chatter from other nodes
joining the cluster.
Perhaps the problem is related to the fact that unicast discovery does not
attempt to resolve IPs after startup?
Uhm, isn't that by design? As soon as your cluster falls apart into separate clusters because minimum_master_nodes is too low, it will never become one single cluster again. How could it, there might be conflicts. At least that is the behavior I observed when I was experimenting with it, which was version 1.7 or less, though.
Unfortunately we have to make this work in the event that only one node is operational as well. So I can't set min_master = 2. I'll have to make-do.
Precisely. Node1 thinks node2 has died (because Node1's iptables dropped all packets going to Node2) and vice-versa.
This is just simulating the case where one of the nodes became unreachable (say due to routing issues or such). I'd tend to think there HAS to be a retry -- perhaps with exponential backoff?
Does anyone know or have a link to relevant parts of the code base that I can explore? It seems wrong that there is no retry in my case.
If you have two nodes with min masters set as one and you simulate a network partition, they will each elect themselves as masters, hence creating two clusters with the same name.
Given they both see themselves as masters they will never try to reconnect to any other nodes.
When you run with only 2 nodes, minimum_master_nodes need to be set to 2 in order to avoid separate clusters forming in the event of a network partition. With this setting, the cluster will be masterless until both nodes are available and will serve reads but reject writes. The reason for this is to avoid the risk of data loss.
As with most systems that rely on consensus based master election, 3 nodes is the magic number. If you introduce a small, third node that a dedicated master node, this will act as a tiebreaker and allow one of the sides of the partition to elect a master in the case of a network partition. This side of the partition would continue to take writes, while the single partitioned node would only serve reads.
This is e.g. what Found does. If you provision a cluster in two availability zones, it automatically provides a free third dedicated master node that acts as a tiebreaker as this helps improve availability and stability.
Apache, Apache Lucene, Apache Hadoop, Hadoop, HDFS and the yellow elephant
logo are trademarks of the
Apache Software Foundation
in the United States and/or other countries.