Multi-availability-zone HA cluster on AWS EC2 instances


I am more of a developer than an ops person, so I am wondering how to deploy an HA cluster on AWS EC2 instances. The thing is, I want this cluster to survive a zone failure...

Here's what I have planned so far:

  • Elasticsearch 7.4.0
  • Across 2 availability zones (say, A and B)
  • Each node will be configured with an attribute that indicates which zone it belongs to
  • The cluster and indices will be aware of that attribute, so replicas can be allocated to the other zone
  • I was thinking of having 3 nodes in each zone, so that if one zone completely fails, the election among the three remaining nodes is successful
  • Each node will be both master-eligible and a data node
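
As a rough illustration of the plan above, here is a minimal Python sketch that renders per-node `elasticsearch.yml` settings for a 7.x cluster. The node names (`es-a1` and so on) and the attribute name `zone` are assumptions made up for this example; the setting keys themselves (`node.attr.*`, `cluster.routing.allocation.awareness.attributes`, `node.master`, `node.data`, `cluster.initial_master_nodes`) are standard 7.x settings, but check them against the documentation for your exact version.

```python
import json

# Hypothetical layout: three nodes per zone, names invented for this sketch.
LAYOUT = {
    "A": ["es-a1", "es-a2", "es-a3"],
    "B": ["es-b1", "es-b2", "es-b3"],
}


def node_settings(name, zone, master_names):
    """Build the settings for one node that is master-eligible and holds data."""
    return {
        "node.name": name,
        "node.master": True,
        "node.data": True,
        # Custom attribute; "zone" is an arbitrary name chosen for this sketch.
        "node.attr.zone": zone,
        # Tell the allocator to spread shard copies across values of "zone".
        "cluster.routing.allocation.awareness.attributes": "zone",
        # Only consulted the very first time the cluster bootstraps.
        "cluster.initial_master_nodes": list(master_names),
    }


def to_yaml_lines(settings):
    """Render settings as YAML lines (JSON values are valid YAML flow style)."""
    return [f"{key}: {json.dumps(value)}" for key, value in settings.items()]


all_names = [name for names in LAYOUT.values() for name in names]
for zone, names in LAYOUT.items():
    for name in names:
        print(f"# --- {name} ---")
        print("\n".join(to_yaml_lines(node_settings(name, zone, all_names))))
```

Discovery settings are left out of this sketch on purpose, since they depend on whether the EC2 discovery plugin or a static host list is used.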

The questions I have now:

  • Do we still need at least 3 nodes to avoid split brain? (I haven't kept up to date with the changes in 7.x that I think solved that)
  • In this scenario all 6 nodes must be running all the time. Is there a more cost-effective way to ensure HA across availability zones?
  • How should I set up the network configuration for the discovery phase? Install and use the EC2 plugin?
  • And what about the clients? How should they be configured to access this cluster in an HA fashion? Should I list all the nodes' IPs in the "hosts" attribute, so that in case of a failure it tries the next one?
  • Or should I have some balancer in front of the cluster?
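
To make the "list all hosts and try the next one" idea concrete, here is a small Python sketch of client-side failover. It is not how any particular Elasticsearch client implements it; the `first_available` helper, the `fake_request` stand-in, and the IP addresses are all hypothetical and only illustrate the behaviour being asked about.

```python
from typing import Callable, Sequence


def first_available(hosts: Sequence[str], request: Callable[[str], str]) -> str:
    """Try each host in order and return the first successful response."""
    last_error = None
    for host in hosts:
        try:
            return request(host)
        except OSError as err:  # covers refused connections and timeouts
            last_error = err
    raise ConnectionError(f"no host reachable out of {len(hosts)}") from last_error


def fake_request(host: str) -> str:
    """Stand-in for a real HTTP call: pretend zone A (10.0.1.x) is down."""
    if host.startswith("10.0.1."):
        raise ConnectionRefusedError(host)
    return f"answered by {host}"


# Hypothetical addresses: one node in zone A, one in zone B.
hosts = ["10.0.1.10:9200", "10.0.2.10:9200"]
print(first_available(hosts, fake_request))
```

A load balancer in front of the cluster moves the same retry logic out of the client, at the cost of one more component that must itself be highly available.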

I'd appreciate any info you can give me. If there's already a post, an Elastic blog entry or an article giving guidelines for this scenario, please send it my way.

As I said, I'm more used to writing queries than to deploying clusters...

Thank you!

Try HAProxy with Layer 7 load balancing; you can check here.

To make your cluster highly available in case of an AZ failure, you need to deploy it across 3 AZs so that a strict majority of master-eligible nodes survives the failure.

Hello @Christian_Dahlqvist

I am curious about the need for 3 AZs. When it comes to being protected from the failure of one AZ, my reasoning is:

  • A minimal production cluster needs 3 master-eligible nodes to have a successful voting process
  • In the event of an AZ failure, I'd need another 3 master-eligible nodes in the remaining zone for a new master to take over
  • Beyond that, I would need these 3 nodes (which will also be data nodes) to hold the necessary primary and replica shards to ensure green status

So the way I see it, I would need 6 nodes, 3 in each zone, so that if I completely lose a zone, the other one has everything it needs to take over. While both zones are healthy I would have a 6-node cluster, which would probably help performance, but seems totally unnecessary and expensive for my use case...

Am I missing something?
Do you guys know some article that gives guidelines for deploying a cluster in a multi-zone architecture?

By the way, thanks for having interest and taking the time to answer my question.

All nodes across both AZs form a single cluster. In any failure scenario, a strict majority of master-eligible nodes must still be available for the cluster to remain healthy and functional. With 6 nodes you need at least 4 (the majority of 6), which does not allow a full AZ to go down, since losing either zone leaves only 3. If, however, you have at least one master-eligible node in a third AZ, you get around this, as the majority of 7 is still 4.
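
The arithmetic in this reply can be checked with a short Python sketch. It uses a plain strict-majority model, exactly as described above, and does not attempt to model how Elasticsearch manages its voting configuration internally; the function names and zone labels are made up for the example.

```python
from collections import Counter


def quorum(master_eligible_count):
    """Strict majority of the master-eligible nodes."""
    return master_eligible_count // 2 + 1


def survives_zone_loss(zone_of_node):
    """For each zone, report whether a majority remains if that zone is lost.

    zone_of_node holds one zone label per master-eligible node.
    """
    total = len(zone_of_node)
    needed = quorum(total)
    per_zone = Counter(zone_of_node)
    return {zone: (total - count) >= needed for zone, count in per_zone.items()}


two_zones = ["A"] * 3 + ["B"] * 3            # 3 + 3 nodes, quorum is 4
three_zones = ["A"] * 3 + ["B"] * 3 + ["C"]  # 3 + 3 + 1 nodes, quorum is still 4

print(survives_zone_loss(two_zones))    # {'A': False, 'B': False}
print(survives_zone_loss(three_zones))  # {'A': True, 'B': True, 'C': True}
```

With the 3 + 3 layout, losing either zone leaves 3 of 6 nodes, one short of the majority; adding a single master-eligible node in a third zone leaves at least 4 of 7 after any one zone is lost.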
