I have a 5-node cluster with 3 master-eligible nodes and 2 dedicated data nodes.
The current master node left the cluster owing to long GC pauses.
After re-election, another master-eligible node became master. I now get the following exception in the Elasticsearch logs on one of the dedicated slave nodes:
java.lang.IllegalStateException: cluster state from a different master than the current one, rejecting (received {cls-es-slave1}{Bh9_gR2jRqiLn3IjNfHYpA}{10.240.0.18}{10.240.0.18:9300}{master=true}, current {cls-es-master}{Z1O52E-fRIu2itHXL3l1Xg}{10.240.0.15}{10.240.0.15:9300}{master=true})
No, but I would expect the cluster to become healthy again automatically once a new master is elected.
Say I had minimum master nodes set to 2, with 3 master-eligible nodes, one of which is the current master. If the current master goes down and a new one is elected, the cluster should automatically be up and healthy again.
Well, it looks like you have multiple masters sending out conflicting cluster state updates. If you had min masters (discovery.zen.minimum_master_nodes) set, only one master would ever be active and there wouldn't be a conflict.
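To illustrate why the setting prevents two active masters, here is a rough Python sketch. It is not Elasticsearch code: the function name and the node names are made up for the example, and it assumes the old master gets cut off from the other two master-eligible nodes (for instance during a long GC pause). A side of the split can only elect a master if it sees at least min masters master-eligible nodes.

```python
def partitions_that_can_elect(partitions, minimum_master_nodes):
    """Return the partitions that see enough master-eligible nodes to elect a master.

    `partitions` is a list of lists of master-eligible node names (hypothetical).
    """
    return [p for p in partitions if len(p) >= minimum_master_nodes]


# Assumed scenario: one master-eligible node is isolated from the other two.
split = [["master-a"], ["master-b", "master-c"]]

# With a threshold of 1, both sides can elect a master: two masters at once.
print(partitions_that_can_elect(split, 1))

# With a threshold of 2, only the majority side can elect a master.
print(partitions_that_can_elect(split, 2))
```

With a threshold of 1 both sides qualify, which is the situation where a node can receive cluster state from a master other than the one it currently follows.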
But if I had min masters set, wouldn't the cluster have been inoperable while it waited for that many masters to join? How is that fault tolerant?
If you have 3 master-eligible nodes then min masters should be 2 (a majority), so you can still lose one of them and maintain availability.
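The arithmetic behind that, as a small Python sketch. It uses the usual quorum rule of (master-eligible nodes / 2) + 1 with integer division; the helper names are just for illustration and are not part of any Elasticsearch API.

```python
def minimum_master_nodes(master_eligible):
    """Quorum size: a strict majority of the master-eligible nodes."""
    return master_eligible // 2 + 1


def tolerated_failures(master_eligible):
    """How many master-eligible nodes can be lost while a quorum remains."""
    return master_eligible - minimum_master_nodes(master_eligible)


for n in (1, 2, 3, 5):
    print(
        f"{n} master-eligible: quorum={minimum_master_nodes(n)}, "
        f"can lose {tolerated_failures(n)}"
    )
```

For 3 master-eligible nodes this gives a quorum of 2 with one tolerated loss, so the cluster does not wait for all three to be present. Two master-eligible nodes tolerate no loss at all, which is why 3 is the usual minimum.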
If you don't set it then you risk data loss and corruption.
It's a balance for sure, but I'd prefer consistency over availability myself, because what's the point of having access to the data if it's wrong?