Elasticsearch on AWS - Disaster Recovery?

We're in the process of productionizing our US-EAST ELK stack. However, for disaster recovery we also need to set up a US-WEST ELK stack. Once WEST is set up (and assuming all traffic currently goes to EAST), how do we synchronize the ES indices so that WEST stays in sync with EAST?
Naturally, when we shift traffic to WEST, we want all the data in EAST to be available in WEST. How would one achieve this?

If you're going to set up your cluster in two DCs (west + east), you can assign half of your data nodes to one DC and the other half to the other DC. Then, using shard allocation awareness, you can split your indices' primary and replica shards across the two DCs in such a way that a replica shard never ends up in the same DC as its primary shard. That way each DC has a complete copy of your data and if one DC goes down, you don't lose any data and your cluster can still operate.
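To make that concrete, here is a minimal sketch of the per-node settings that shard allocation awareness relies on. It assumes the Elasticsearch 5.x/6.x setting names and an awareness attribute called zone; the helper name awareness_settings and the zone names are purely illustrative and not from this thread.

```python
from typing import Dict


def awareness_settings(zone: str, attribute: str = "zone") -> Dict[str, str]:
    """Return the elasticsearch.yml entries for one data node.

    `zone` is the location label of this node (hypothetical values below).
    `attribute` is the custom node attribute the allocator will look at.
    """
    return {
        # Tag the node with its location.
        f"node.attr.{attribute}": zone,
        # Tell the allocator to spread copies of a shard across distinct
        # values of that attribute, so a primary and its replica do not
        # share a location when more than one location is available.
        "cluster.routing.allocation.awareness.attributes": attribute,
    }


# Hypothetical layout: half of the data nodes in each location.
for node_zone in ("us-east-1a", "us-east-1b"):
    for key, value in awareness_settings(node_zone).items():
        print(f"{key}: {value}")
    print()
```

Each index still needs at least one replica for this to give every location a full copy of the data.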

For full HA, you also need one master-eligible node in each data center plus another master-eligible node in a third DC (as a tie-breaker), and you need to configure discovery.zen.minimum_master_nodes: 2
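The value 2 comes from the usual quorum rule for zen discovery: a majority of the master-eligible nodes. A small sketch of that arithmetic (the function name is just for illustration):

```python
def minimum_master_nodes(master_eligible: int) -> int:
    """Majority quorum: (number of master-eligible nodes / 2) + 1."""
    if master_eligible < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible // 2 + 1


# Three master-eligible nodes (one per location plus a tie-breaker) -> 2
print(minimum_master_nodes(3))
```

With three master-eligible nodes, losing any one location still leaves two nodes, which is enough to elect a master.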

With such a setup, you don't need to synchronize anything; ES does it by default. Your ES clients need to be configured so they can reach either DC.

I invite you to read the following link which contains great info on this subject: https://www.elastic.co/guide/en/cloud-enterprise/current/ece-planning.html#ece-ha

Deploying a single cluster across multiple data centres is not supported unless they are very close together and have very good connectivity; otherwise it is likely to result in instability and poor performance. If your data centres are as far apart as in your example, it is generally recommended to set up two separate clusters and ensure that you feed all changes to both clusters in parallel. Exactly how best to do this will depend on the use case and on how you are ingesting and/or updating data.
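As a rough illustration of feeding two independent clusters in parallel, here is a minimal sketch. It deliberately avoids any real client library: each cluster is represented by a plain callable, and the names feed_all, east_docs and west_docs are hypothetical. In practice the callables would wrap whatever you use to index into each cluster, and failed writes would need to be queued and replayed so the clusters do not drift apart.

```python
from typing import Callable, Dict, List, Tuple

# A sink is anything that accepts one document and indexes it somewhere.
Sink = Callable[[dict], None]


def feed_all(doc: dict, sinks: Dict[str, Sink]) -> List[Tuple[str, Exception]]:
    """Send `doc` to every sink and return the (name, error) pairs that failed.

    A failure in one cluster does not stop the write to the other one.
    """
    failures: List[Tuple[str, Exception]] = []
    for name, send in sinks.items():
        try:
            send(doc)
        except Exception as exc:  # keep going; the caller decides how to retry
            failures.append((name, exc))
    return failures


# Usage with in-memory stand-ins for the two clusters.
east_docs: List[dict] = []
west_docs: List[dict] = []
failures = feed_all(
    {"message": "hello"},
    {"east": east_docs.append, "west": west_docs.append},
)
```

Many setups push this responsibility into the ingest pipeline instead (for example a queue that both clusters consume from), which is one reason the best approach depends on how the data arrives.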

Thanks for pointing that out @Christian_Dahlqvist. I should indeed have used the term "availability zone" instead of "DC", to be in sync with the ES Cloud lingo.

Thank you - this is very helpful. What I currently have is 3 nodes (1:1 replica), all running as masters, in EAST. If we don't do full HA, are you suggesting I set up EAST and WEST with 3 nodes each, using shard allocation awareness? Apologies, I couldn't understand what is meant by "half" here.

PS: I understand you mean AZs. I just wanted to clarify that I'm in completely different regions (us-east-1 as PRIMARY and us-west-2 as DR). Would shard allocation awareness work across regions? I looked at the link you listed above, but it uses "zones" as its terminology, with nothing specific to regions.

No, you should never split a cluster across regions. Independent clusters that you feed in parallel are the way to go.

It's more of a mandate from my organization to have running clusters in two separate regions for disaster recovery.
