Elastic Clusters across Datacenters

Dear All,

We have 2 separate clusters in 2 different data centers, and they are currently active-active.
With Elasticsearch 6.0, I understand that synchronization will be in place.

So if data ingestion is enabled for one Elastic cluster/Data Center, with syncing to the other Data Center (from A to B) enabled, can I run from one Data Center and stop maintaining two distinct, parallel clusters? At the time of a switchover, the sync process would be enabled from B to A (after first making sure the A to B sync has completed).

Is this the right approach? Has anyone tried this, and what were the caveats?

Thanks.
Cengiz

How are you going to synchronise? How far apart are the data centres?

They are just 2 different buildings in the same city (on-premise Data Centers of an enterprise) with a good enough link/bandwidth between the two. I was thinking of using ES features to keep the 2 clusters in sync.

OK, so you are going to set up a single cluster that spans the 2 data centres and use shard allocation filtering to ensure shards are distributed correctly?
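
For illustration only, here is a minimal sketch of the cluster settings such a single stretched cluster might use. It relies on shard allocation awareness (with forced awareness) rather than plain allocation filtering, since awareness is the mechanism that spreads the copies of each shard across attribute values. The attribute name `datacentre` and the values `dc_a` and `dc_b` are assumptions, not names from this thread, and each node would also need a matching `node.attr.datacentre` line in its `elasticsearch.yml`.

```python
import json


def awareness_settings(attribute, zones):
    """Build a body for PUT _cluster/settings that enables forced
    shard allocation awareness on the given node attribute."""
    return {
        "persistent": {
            # Spread primary and replica copies across values of this attribute.
            "cluster.routing.allocation.awareness.attributes": attribute,
            # Forced awareness: if one zone is down, do not pile all
            # copies onto the surviving zone.
            f"cluster.routing.allocation.awareness.force.{attribute}.values": ",".join(zones),
        }
    }


# Hypothetical attribute name and zone values, chosen for this example.
body = awareness_settings("datacentre", ["dc_a", "dc_b"])
print(json.dumps(body, indent=2))
```

The printed JSON would be sent to the cluster settings API; the sketch deliberately stops short of making any HTTP call.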

Yes. Would that be the recommended approach, assuming only one ES cluster will be in use (accepting ingestion)? Any issues during switchovers?

In this scenario all indexing requests will go to both data centres, as both should hold a copy of every shard, so it may have an impact on performance and lead to increased traffic between the data centres. Queries will also be executed across both data centres. Depending on your data and query volumes, I would recommend benchmarking this to make sure the connection between the data centres is fast enough and has sufficient bandwidth.

Given that you have 2 data centres, it is also very difficult, if not impossible, to make this highly available with respect to a full data centre failing. The reason for this is that it is impossible to evenly distribute an odd number of master-eligible nodes across the 2 data centres, so if the data centre holding the majority fails, the remaining nodes cannot elect a master.
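
The arithmetic behind this can be sketched in a few lines. This is a rough illustration, assuming the usual majority rule for master election (quorum = total master-eligible nodes // 2 + 1, the value normally used for `discovery.zen.minimum_master_nodes` in 6.x); the function names are made up for this example.

```python
def quorum(master_eligible):
    """Majority of master-eligible nodes needed to elect a master."""
    return master_eligible // 2 + 1


def survives_datacentre_loss(nodes_in_a, nodes_in_b):
    """For a given split, report whether the surviving data centre
    still holds a quorum after the other one fails completely."""
    q = quorum(nodes_in_a + nodes_in_b)
    return {
        "lose_a": nodes_in_b >= q,  # only B's nodes remain
        "lose_b": nodes_in_a >= q,  # only A's nodes remain
    }


def any_split_survives_both(total):
    """Check every way of splitting `total` nodes across 2 data centres."""
    for a in range(total + 1):
        result = survives_datacentre_loss(a, total - a)
        if result["lose_a"] and result["lose_b"]:
            return True
    return False


# 3 master-eligible nodes split 2 + 1: losing the larger side loses quorum.
print(survives_datacentre_loss(2, 1))
# No split of up to 10 nodes tolerates the loss of either data centre.
print(any(any_split_survives_both(n) for n in range(1, 11)))
```

Whichever side holds the majority becomes the single point of failure, which is why a third location for a tie-breaking master-eligible node is usually needed.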

This is therefore something we generally neither recommend nor support. If you are looking for a DR setup, the architecture you currently have in place is, in my opinion, better.

I am not sure I understand what you mean by switchover. Could you please explain?

Thanks, Christian. My understanding is that even if ES 6.0 has a sync feature, it should not be used unless the other Data Center is for DR. So we will need to continue maintaining 2 separate ES clusters, each ingesting the same data separately. Thanks.

I am not sure what sync feature you are referring to. Sequence IDs were introduced in Elasticsearch 6.0, and they are the foundation for building cross-datacenter replication, but such a feature is not yet available.

Yes, that is what I meant: "Sequence IDs were introduced in Elasticsearch 6.0."
