Elasticsearch cluster redundancy


We are facing a challenge with the redundancy of our Elasticsearch cluster. The cluster consists of 2 VNFs, each with 3 master, 6 data and 3 ingest Elasticsearch VMs. One requirement is VNF redundancy: the cluster must keep operating normally when 1 VNF goes down for any reason. The other requirement is to avoid a split-brain situation in case of unexpected networking issues between the VNFs.

So, in this cluster we have 6 master nodes configured as eligible for master election. To avoid split brain, we configured master election to require a minimum of 4 master nodes to be available. But this doesn't allow us to fulfill the first requirement of cluster redundancy: if we lose a VNF, the remaining VNF has only 3 master nodes, which is fewer than the minimum of 4 required for master election.

What would be a good practice for solving this issue? Having more Elasticsearch master nodes eligible for master election?

All thoughts and ideas are welcome and appreciated.


To have a highly available system that can continue operating automatically, without manual intervention or risk of data loss, when you lose a full VNF/Availability Zone, you need to spread the cluster across at least 3 VNFs/Availability Zones so that a majority of master-eligible nodes can always be formed. With only 2 this is not possible. You can do this either by splitting the cluster evenly across the three VNFs/Availability Zones or by simply putting a single master-eligible node in a third VNF/Availability Zone to act as a tiebreaker.
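
To make the majority arithmetic concrete, here is a minimal Python sketch. It is not Elasticsearch code; the function names `quorum` and `survives_zone_loss` and the example layouts are illustrative assumptions. It checks whether the remaining master-eligible nodes still form a strict majority after any single VNF/AZ is lost.

```python
def quorum(total_masters: int) -> int:
    """Smallest strict majority of the master-eligible nodes."""
    return total_masters // 2 + 1


def survives_zone_loss(masters_per_zone: list[int]) -> bool:
    """True if losing any single zone still leaves a majority of masters."""
    total = sum(masters_per_zone)
    needed = quorum(total)
    return all(total - lost >= needed for lost in masters_per_zone)


# Hypothetical layouts: number of master-eligible nodes in each VNF/AZ.
layouts = {
    "2 zones, 3+3": [3, 3],
    "3 zones, 2+2+2": [2, 2, 2],
    "3 zones, 3+3+1 (tiebreaker)": [3, 3, 1],
}

for name, layout in layouts.items():
    total = sum(layout)
    print(f"{name}: quorum={quorum(total)}, "
          f"survives one zone loss={survives_zone_loss(layout)}")
```

The 3+3 layout from the question fails because losing either VNF leaves 3 of 6 nodes, below the quorum of 4, while both three-zone layouts keep at least 4 nodes after any single zone is lost.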

Adding to this, there is only a single active master in an Elasticsearch cluster at any given time. So a total of 3 master-eligible nodes is typically sufficient, and spreading them across 3 different VNFs/AZs ensures fault tolerance.
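
As a rough illustration of why 3 master-eligible nodes are usually enough, the sketch below compares 1, 2 and 3 master-eligible nodes per zone across 3 zones. The helper `survives_losing_zones` is a hypothetical name, and the model assumes simple majority voting with an equal number of masters in every zone.

```python
def survives_losing_zones(masters_per_zone: int, zones: int, zones_lost: int) -> bool:
    """True if the masters in the surviving zones still form a strict majority."""
    total = masters_per_zone * zones
    remaining = masters_per_zone * (zones - zones_lost)
    return remaining >= total // 2 + 1


ZONES = 3  # assumption: three VNFs/AZs, as suggested above

for per_zone in (1, 2, 3):
    one = survives_losing_zones(per_zone, ZONES, zones_lost=1)
    two = survives_losing_zones(per_zone, ZONES, zones_lost=2)
    print(f"{per_zone} per zone ({per_zone * ZONES} total): "
          f"survives 1 zone lost={one}, survives 2 zones lost={two}")
```

In this model every layout survives the loss of one zone and none survives the loss of two, so adding more master-eligible nodes per zone does not improve zone-level fault tolerance.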

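For data nodes, you can spread them across 2 different VNFs/AZs so that each VNF/AZ holds a replica of the other's data, provided every index has at least one replica and shard allocation awareness keeps the primary and replica copies in different VNFs/AZs. Hope this helps.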
