We had configured a 2-node cluster on Elasticsearch 6.x with both nodes master-eligible and the minimum number of master nodes set to 1. This worked for our requirements: if either node was temporarily removed from the cluster (for an upgrade, or when simulating a node failure), the other node was automatically elected as master. Following an upgrade to Elasticsearch 7.0.1 this is no longer the case. When the currently elected master node is removed from the cluster, the other master-eligible node is not elected as the new master. The error being reported is as follows:
[2019-07-05T14:38:31,895][WARN ][o.e.c.c.ClusterFormationFailureHelper] [eyn-dev-es1] master not discovered or elected yet, an election requires a node with id [eHJ9LGJxQDKj7Kqkwpq_OA], have discovered which is not a quorum; discovery will continue using [10.2.0.32:9300] from hosts providers and [{eyn-dev-es1}{dju6L50FSxKSPkB6-NyMCA}{P8qAO9iuR3alI-OEpg6wPQ}{10.2.0.31}{10.2.0.31:9300}{ml.machine_memory=8339152896, xpack.installed=true, ml.max_open_jobs=20}] from last-known cluster state; node term 15, last-accepted version 2320 in term 15
I understand that the voting process for electing a new master node has changed to guard against half or more of the nodes being removed from the cluster, which I assume is what is causing this issue. If so, is there any way to configure the cluster to run with just 2 nodes and elect a new master when the master fails, or will this now only work with 3 or more nodes? If you need any further information, please let me know.
There is fundamentally no way to build a distributed system with only 2 nodes that can tolerate the loss of either one of them. This isn't something that's changed between version 6.x and 7.x because it's not really a limitation imposed by Elasticsearch at all. You can at least safely do a rolling restart of a 2-node cluster in 7.x, but you simply cannot build a fault-tolerant 2-node system.
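To make the arithmetic behind this concrete, here is a minimal sketch of generic majority-quorum voting. It is a simplified model, not Elasticsearch's actual election code, and the function names `quorum_size` and `tolerated_failures` are made up for illustration. It assumes every voter counts equally and that an election needs a strict majority of voters.

```python
def quorum_size(voters: int) -> int:
    """Smallest number of votes that is a strict majority of `voters`."""
    if voters < 1:
        raise ValueError("need at least one voter")
    return voters // 2 + 1


def tolerated_failures(voters: int) -> int:
    """How many voters can be lost while a majority is still reachable."""
    return voters - quorum_size(voters)


# Hypothetical cluster sizes, purely for illustration.
for n in (1, 2, 3):
    print(f"{n} voters: quorum={quorum_size(n)}, "
          f"tolerated failures={tolerated_failures(n)}")
```

Under this model 2 voters need both votes, so losing either one blocks an election, while 3 voters can lose any one. Requiring only 1 vote out of 2, as in the 6.x setup described in the question, is not a majority: if the two nodes merely lose contact with each other, each can elect itself and the cluster splits in two.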
We expect that the 7.3 release will allow you to build a system from 2 proper nodes plus a less-powerful voting-only master-eligible node. In earlier versions you will need 3 full master-eligible nodes.
Would you be able to point me towards any clear documentation that outlines a minimum recommended configuration for a 3-node cluster? It looks like we will need to rethink our original plan of having just 2 nodes.
I'm afraid it depends enormously on how you will be using your cluster. The only way to answer this kind of sizing question accurately is by measurement with a realistic workload.
I understand. Is there a roadmap available anywhere so we can see when to expect new releases? We are considering going the route of having a third node for voting only. Our implementation of Elasticsearch uses the NEST API, so if there is a roadmap for that as well, we can work out some timescales for when we might be able to go ahead with this change.