ElasticSearch and geo-redundancy

yeppers · February 25, 2016, 6:25pm

We have a requirement from our operations team that all solutions have to be geo-redundant with automatic failover. We have been forbidden from using ElasticSearch because of this (even though it is the best technical solution). What do other people do for geo-redundancy? Because of the automatic failover requirement, what we really need is a cluster that does two-way replication over a WAN.

thn · February 25, 2016, 6:54pm

Let's say you have one cluster spanning two zones, and you can afford 1 master and 2 data nodes per zone. If one master goes down, the other one keeps the cluster going (actually, 3 masters are recommended). With proper shard and replica settings, primaries and replicas can be distributed across the two zones, so if one zone goes down the cluster keeps functioning without downtime or data loss.
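
To make the "3 masters" remark concrete, here is a small arithmetic sketch. It assumes the usual majority rule for master election (in Elasticsearch releases of this era that is what the `discovery.zen.minimum_master_nodes` setting is meant to enforce); the function names are made up for illustration.

```python
def quorum(master_eligible: int) -> int:
    """Smallest majority of the master-eligible nodes."""
    return master_eligible // 2 + 1


def survives_loss(master_eligible: int, lost: int) -> bool:
    """True if the remaining master-eligible nodes still form a majority."""
    return master_eligible - lost >= quorum(master_eligible)


for n in (2, 3):
    print(f"{n} masters: quorum={quorum(n)}, "
          f"survives losing one={survives_loss(n, 1)}")
```

With two master-eligible nodes the quorum is also two, so losing either one leaves no majority. With three, one can be lost. Note that with only two zones, the zone holding two of the three masters is still a single point of failure for the quorum.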

Check out this document on allocating shards across different zones:
https://www.elastic.co/guide/en/elasticsearch/reference/current/allocation-awareness.html
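
As a rough illustration of what that page describes, the sketch below builds the request body for `PUT _cluster/settings` that turns on forced allocation awareness. The attribute name `zone` and the values `zone1`/`zone2` are placeholders, and `awareness_settings` is a hypothetical helper. Each node also has to be started with a matching custom attribute; the exact setting name for that depends on the Elasticsearch version, so check the linked page.

```python
import json


def awareness_settings(attribute, values):
    """Build a cluster-settings body enabling forced allocation awareness."""
    prefix = "cluster.routing.allocation.awareness"
    return {
        "persistent": {
            # Spread copies of each shard across values of this node attribute.
            f"{prefix}.attributes": attribute,
            # Forced awareness: do not pile all copies into the surviving
            # zone when the other zone is unavailable.
            f"{prefix}.force.{attribute}.values": ",".join(values),
        }
    }


body = awareness_settings("zone", ["zone1", "zone2"])
print(json.dumps(body, indent=2))
```

The printed JSON can be sent to the cluster with any HTTP client.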

yeppers · February 25, 2016, 7:12pm

That still sounds like it is in the same data center. What about nodes in different data centers geographically far apart?

thn · February 25, 2016, 7:35pm

When you put the two zones in two different countries, what's holding you back is the communication link. That's not an ES issue, it's a network issue. ES will replicate shards properly across the data nodes in the cluster, at least from what I've seen. The cluster can have data nodes in one or more locations, assuming they can talk to each other.

warkolm · February 25, 2016, 8:03pm

Check out a few options here - https://www.elastic.co/blog/scaling_elasticsearch_across_data_centers_with_kafka

tinle · February 25, 2016, 8:13pm

Another solution is to use a queue, such as Kafka with MirrorMaker, to replicate your data between DCs. The ES cluster in each DC can then consume the same data from Kafka, which gives you redundancy.

This is what we do, as we have many DCs around the world. Each ES cluster is self-contained.
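
A minimal sketch of that pattern, using in-memory stand-ins only: a Python list plays the role of the mirrored Kafka topic, and the hypothetical `Cluster` class plays the role of one DC's consumer plus its ES index. A real setup would use a Kafka consumer and the ES bulk API instead; nothing here reflects the actual deployment described above.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Change:
    doc_id: str
    doc: Optional[dict]  # None means "delete this document"


@dataclass
class Cluster:
    name: str
    index: dict = field(default_factory=dict)
    next_offset: int = 0  # position in the log this DC has consumed up to

    def consume(self, log, max_records=None):
        """Apply unconsumed changes from the log, in order."""
        end = len(log)
        if max_records is not None:
            end = min(end, self.next_offset + max_records)
        for change in log[self.next_offset:end]:
            if change.doc is None:
                self.index.pop(change.doc_id, None)
            else:
                self.index[change.doc_id] = change.doc
        self.next_offset = end


log = [
    Change("1", {"title": "a"}),
    Change("2", {"title": "b"}),
    Change("1", {"title": "a2"}),
    Change("2", None),
]

dc_east = Cluster("east")
dc_west = Cluster("west")

dc_east.consume(log)                 # east is fully caught up
dc_west.consume(log, max_records=2)  # west lags behind
dc_west.consume(log)                 # ...and catches up later

print(dc_east.index == dc_west.index)
```

Because both clusters apply the same ordered log, keyed by document id, a lagging DC ends up in the same state once it catches up.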

Tin

nik9000 · February 25, 2016, 8:27pm

For now there isn't anything built in to handle replication across data centers. Some folks form a cluster across data centers but that is fraught. The best solution out there now is to build a thing that replicates all of your changes to both data centers using your favorite tools.
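
One possible shape for "a thing that replicates all of your changes to both data centers", sketched with in-memory stand-ins. `DataCenter` and `DualWriter` are hypothetical names; the `available` flag simulates a WAN outage, and a real version would keep the pending queue somewhere durable (for example the Kafka approach mentioned earlier in the thread) rather than in process memory.

```python
from collections import deque


class DataCenter:
    """Stand-in for one DC's cluster; `available` simulates reachability."""

    def __init__(self, name):
        self.name = name
        self.index = {}
        self.available = True

    def write(self, doc_id, doc):
        if not self.available:
            raise ConnectionError(f"{self.name} unreachable")
        self.index[doc_id] = doc


class DualWriter:
    """Send every change to all DCs, queueing per DC while one is down."""

    def __init__(self, data_centers):
        self.data_centers = data_centers
        self.pending = {dc.name: deque() for dc in data_centers}

    def write(self, doc_id, doc):
        for dc in self.data_centers:
            # Always go through the queue so per-DC ordering is preserved.
            self.pending[dc.name].append((doc_id, doc))
            self.flush(dc)

    def flush(self, dc):
        """Replay queued changes in order; stop at the first failure."""
        queue = self.pending[dc.name]
        while queue:
            doc_id, doc = queue[0]
            try:
                dc.write(doc_id, doc)
            except ConnectionError:
                return False
            queue.popleft()
        return True


primary, standby = DataCenter("dc1"), DataCenter("dc2")
writer = DualWriter([primary, standby])

standby.available = False
writer.write("1", {"v": 1})
writer.write("1", {"v": 2})

standby.available = True
writer.flush(standby)
print(primary.index == standby.index)
```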
