Elasticsearch HA and geocluster

Hello,
How do I set up a highly available (HA) Elasticsearch cluster in a geocluster environment? We have 2 datacenter locations, and in each datacenter we have 1 hardware virtualization server.

I know that the best setup uses an odd number of Elasticsearch nodes, so should we use two ES nodes in DC 1 and one in DC 2, or is there a better approach?

We don't really support having a cluster split across different regions.

The best thing to do is to have 2 different clusters, one in each region.
In each region I'd recommend at least 2 data nodes plus one master-only node.

To synchronize data between zones, you can:

  • Send your data to both clusters
  • Send your data to a message queue system like Kafka and use a Logstash instance in each zone to read the queue and index documents locally
  • Use the Cross Cluster Replication feature (available with a platinum license or a trial). See https://www.elastic.co/subscriptions
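
To illustrate the first option (sending the data to both clusters), here is a minimal sketch. It does not use a real Elasticsearch client: `dual_write` and the `Sink` callables are hypothetical names, and each sink stands in for whatever indexing call you make against one cluster. The point is only that the writer has to track which cluster missed a document.

```python
from typing import Callable, Dict, List

Doc = Dict[str, object]
# Hypothetical: a Sink wraps the indexing call for a single cluster.
Sink = Callable[[Doc], None]


def dual_write(doc: Doc, sinks: Dict[str, Sink]) -> List[str]:
    """Send doc to every cluster and return the names of clusters that failed."""
    failed: List[str] = []
    for name, send in sinks.items():
        try:
            send(doc)
        except Exception:
            # Keep going so one unreachable region does not block the other.
            failed.append(name)
    return failed


# In-memory stand-ins for the two clusters (assumption, for the demo only).
dc1_store: List[Doc] = []
dc2_store: List[Doc] = []


def dc2_down(doc: Doc) -> None:
    raise ConnectionError("DC 2 unreachable")


failed = dual_write({"id": 1, "msg": "hello"}, {"dc1": dc1_store.append, "dc2": dc2_down})
print(failed)  # ['dc2']
```

Anything returned in `failed` has to be retried later, otherwise the two clusters drift apart. Handling that retry is essentially what the message queue option does for you.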

Thank you @dadoonet for your tips.
I'm a little bit surprised that Elastic does not support this out-of-the-box. Is it on the roadmap?

Not really, at least not within a single cluster. Each cluster relies on the node-to-node network connections being (a) reliable, (b) low-latency and (c) high-bandwidth, and none of these are really true of connections between different regions. Multi-region deployments are supported out-of-the-box using federation: cross-cluster search and/or cross-cluster replication.

I mean technically you can split a cluster across geographically separated regions, and it'll do the right thing even if the connection is unreliable and/or slow, but it won't necessarily perform very well.

If you only have two failure domains (e.g. regions) then it is not possible to achieve high availability. At least, you cannot build a system that can tolerate the loss of either domain. This isn't a limitation of Elasticsearch, it's a fundamental property of distributed systems: with two domains one or other of them will always be critical to the health of your cluster. The recommended approach is to have at least three failure domains. Fortunately with Elasticsearch you only need a single, small, dedicated master node in the third domain and you can leave all the heavy machinery in the other two.
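
The arithmetic behind this can be sketched as follows. This is a simplified model that assumes a master can only be elected by a strict majority of the master-eligible nodes; it is not Elasticsearch's actual election code, and the function names and layouts are made up for illustration.

```python
from typing import Dict


def has_quorum(alive: int, total: int) -> bool:
    """A strict majority of master-eligible nodes is still reachable."""
    return alive > total // 2


def survives_loss_of(domain: str, layout: Dict[str, int]) -> bool:
    """Can the remaining domains still elect a master if `domain` goes down?"""
    total = sum(layout.values())
    return has_quorum(total - layout[domain], total)


def tolerates_any_single_loss(layout: Dict[str, int]) -> bool:
    """True only if no single failure domain is critical."""
    return all(survives_loss_of(domain, layout) for domain in layout)


# Master-eligible nodes per failure domain (hypothetical layouts).
two_domains = {"dc1": 2, "dc2": 1}            # the 2 + 1 split from the question
three_domains = {"dc1": 1, "dc2": 1, "dc3": 1}  # small tiebreaker in a third domain

print(survives_loss_of("dc2", two_domains))      # True
print(survives_loss_of("dc1", two_domains))      # False
print(tolerates_any_single_loss(two_domains))    # False
print(tolerates_any_single_loss(three_domains))  # True
```

With two domains, whichever one holds the majority is critical, and an even split makes both critical. The third domain only needs to contribute one vote, which is why a single small dedicated master node there is enough.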

