If Zone B loses connectivity with Zone A, it cannot tell which of two things has happened:
either the network failed and Zone A is actually still operating, in which case Zone B should stand down, or
all of Zone A failed, in which case Zone B should take over control of the cluster.
There is no safe configuration with only two zones, because there is no way for the software to distinguish between the two scenarios.
For this to work you need a 3rd zone with at least 1 node in it to act as a tie-breaker. The safe option is 2 nodes in Zone A, 2 nodes in Zone B, and 1 node in Zone C. Then, as long as any 2 zones are available and can communicate with one another, they will have a majority and will control the cluster.
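To make the arithmetic concrete, here is a rough Python sketch of the majority rule. It is only a toy model, not how Elasticsearch implements elections: the function names are made up for illustration, and it assumes every node counted is master-eligible and that a strict majority of all such nodes is needed to control the cluster.

```python
def has_majority(available: int, total: int) -> bool:
    """True if `available` nodes form a strict majority of `total` nodes."""
    return available > total // 2


def survives_any_single_zone_loss(layout: dict[str, int]) -> bool:
    """Check whether the remaining zones keep a majority whichever zone is lost.

    `layout` maps a zone name to its number of master-eligible nodes
    (hypothetical representation, chosen for this example).
    """
    total = sum(layout.values())
    return all(has_majority(total - lost, total) for lost in layout.values())


two_zones = {"A": 2, "B": 2}
three_zones = {"A": 2, "B": 2, "C": 1}

print(survives_any_single_zone_loss(two_zones))    # False
print(survives_any_single_zone_loss(three_zones))  # True
```

With two zones, losing either one leaves 2 of 4 nodes, which is not a majority. With the 2/2/1 layout, losing any one zone still leaves at least 3 of 5.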
The cluster may stop working until the connection is restored, but will not lose data. Some people call this "split brain". The manual covers how to set up your cluster for high availability in considerable depth.
The side with the majority of the master-eligible nodes will carry on working; the other side will not. If neither side has a majority of the master-eligible nodes then they may both stop working.
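As a quick illustration of that rule, the sketch below takes the number of master-eligible nodes on each side of a network partition and reports which side, if any, keeps a majority. It is a simplified model with a hypothetical helper name, and it ignores the details of how a real cluster manages its voting configuration.

```python
def sides_with_majority(sides: list[int]) -> list[bool]:
    """For each side of a partition, report whether it holds a strict
    majority of all master-eligible nodes (simplified model)."""
    total = sum(sides)
    return [count > total // 2 for count in sides]


print(sides_with_majority([3, 2]))  # [True, False]
print(sides_with_majority([2, 2]))  # [False, False]
```

In the 3/2 split only the larger side carries on. In the 2/2 split neither side has a majority, so in this model both stop until the connection is restored.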
Sounds fine to me, although running so many nodes on a single server is pretty risky as you will lose them all when the server breaks. Better to have more/smaller servers.