Merge two Elasticsearch clusters

Hi All,

Currently we have two clusters. The first has 4 nodes (1 coordinating node and 3 data nodes, two of which are master-eligible) and the second has 1 coordinating node and 5 data nodes, three of which are master-eligible. Can someone suggest how to merge these two clusters, as the current setup is causing performance issues on our end?

We need to meet the following two conditions:

  1. No data loss.
  2. No downtime.

Thanks in advance

Because Elasticsearch clusters cannot simply be merged, it will not be possible to merge them seamlessly without downtime.

Before trying to determine the migration process that offers the least amount of downtime it would be useful to have the following questions answered:

  • Which version(s) of Elasticsearch are the two clusters using?
  • How much data does each cluster hold, in terms of size on disk as well as number of indices and shards?
  • What is the hardware specification of the clusters? How much storage do they have?
  • Do the clusters hold disparate sets of indices, so that there are no name clashes?
  • How frequently is data updated/deleted and/or added?
  • Do both clusters have the same indexing/query profile, or do they serve completely different use cases?
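
If it helps with gathering these numbers, here is a minimal Python sketch. It assumes you have already saved the JSON output of `GET _cat/indices?format=json&bytes=b` from each cluster; the function names and the sample rows are made up for illustration and are not taken from either of the clusters discussed here.

```python
def summarize(rows):
    """Summarise rows returned by GET _cat/indices?format=json&bytes=b.

    The cat API returns every value as a string, so convert before adding.
    """
    primaries = sum(int(r["pri"]) for r in rows)
    # Each primary has `rep` replica copies, so copies = pri * (1 + rep).
    copies = sum(int(r["pri"]) * (1 + int(r["rep"])) for r in rows)
    # store.size covers primaries and replicas; it can be null for closed indices.
    size = sum(int(r.get("store.size") or 0) for r in rows)
    return {
        "indices": len(rows),
        "primary_shards": primaries,
        "shard_copies": copies,
        "store_bytes": size,
    }


def name_clashes(rows_a, rows_b):
    """Return the index names that exist in both clusters."""
    return sorted({r["index"] for r in rows_a} & {r["index"] for r in rows_b})


# Hypothetical sample rows, only to show the expected shape of the input.
cluster1_rows = [
    {"index": "app-logs-2020.06.01", "pri": "3", "rep": "1", "store.size": "1200"},
    {"index": "audit-2020.06", "pri": "1", "rep": "2", "store.size": "300"},
]
cluster2_rows = [
    {"index": "audit-2020.06", "pri": "1", "rep": "1", "store.size": "500"},
]

print(summarize(cluster1_rows))
print(name_clashes(cluster1_rows, cluster2_rows))
```

An empty list from `name_clashes` would suggest the two sets of indices can coexist in one cluster without renaming.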

Hi @Christian_Dahlqvist,

  1. We are using version 7.7.
  2. Cluster 1 has around 20 TB of data and cluster 2 has around 12 TB.
  3. 64 GB of RAM is available on every node.
  4. Cluster 2 has 54 indices with around 167 shards and 240 replicas.
  5. Cluster 1 has 78 indices with around 215 shards and 584 replicas.
  6. Yes, the clusters hold disparate sets of indices.
  7. They have the same query profile.
     We have around 5 to 6 billion transactions per week.

Thanks in advance

Are you inserting immutable data only or are you also updating data?

We are not updating data. We fetch logs from a queue and insert them into ELK.

OK. Then I suspect merging will need to be done as follows:

  1. Take a full snapshot of cluster 2 using the snapshot API.
  2. Redirect writing of new data from cluster 2 to cluster 1.
  3. Take a second full snapshot of cluster 2 in order to capture the latest data written.
  4. Add new empty nodes to cluster 1 so that it has the capacity to import the snapshot.
  5. Restore the last cluster 2 snapshot to cluster 1. This may require some indices (or all) to be renamed to avoid clashes with the new indices being written to.
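
As a rough illustration of the snapshot and restore calls in these steps, the Python sketch below only builds the REST requests (method, path and JSON body) so you can review them before sending them with curl or Kibana Dev Tools. It assumes a snapshot repository, here given the hypothetical name `merge_repo`, is already registered on both clusters; the snapshot names, index names and the `c2-` prefix are also made up. Setting `include_global_state` to false on restore is my own choice, so that cluster 2's templates and settings do not overwrite those on cluster 1.

```python
import json


def snapshot_request(repo, snapshot):
    """Build the request that snapshots all indices of the source cluster."""
    path = f"/_snapshot/{repo}/{snapshot}?wait_for_completion=true"
    return "PUT", path, {}


def restore_request(repo, snapshot, indices, prefix=None):
    """Build the request that restores selected indices on the target cluster.

    If `prefix` is given, every restored index is renamed to prefix + name,
    which avoids clashes with indices already being written to.
    """
    body = {"indices": ",".join(indices), "include_global_state": False}
    if prefix:
        body["rename_pattern"] = "(.+)"
        body["rename_replacement"] = prefix + "$1"
    return "POST", f"/_snapshot/{repo}/{snapshot}/_restore", body


# Hypothetical names, for illustration only.
requests_to_send = [
    snapshot_request("merge_repo", "cluster2-initial"),
    snapshot_request("merge_repo", "cluster2-final"),
    restore_request("merge_repo", "cluster2-final",
                    ["app-logs-2020.06.01"], prefix="c2-"),
]
for method, path, body in requests_to_send:
    print(method, path, json.dumps(body))
```

The two snapshot requests would be sent to cluster 2 and the restore request to cluster 1. Registering the repository as read-only on cluster 1 is a common precaution so that only one cluster writes to it.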

While the snapshot is being restored, indexing of new data will work, but the old data will be temporarily unavailable.

If you can expand cluster 1 without decommissioning cluster 2, you should be able to speed up this process by migrating older indices before redirecting traffic to cluster 1.
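
To pick out the older indices that could be migrated early, something like the Python sketch below might work. It assumes, purely for illustration, that index names end in a `YYYY.MM.DD` date suffix, as is common for log indices; nothing in this thread confirms that your indices are named this way, so adjust the pattern to your own convention. The index names and cutoff date are made up.

```python
import re
from datetime import date

# Assumed naming convention: the index name ends in YYYY.MM.DD.
DATE_SUFFIX = re.compile(r"(\d{4})\.(\d{2})\.(\d{2})$")


def indices_older_than(names, cutoff):
    """Return index names whose date suffix is strictly before `cutoff`.

    Names without a valid date suffix are skipped, since their age
    cannot be determined from the name alone.
    """
    older = []
    for name in names:
        match = DATE_SUFFIX.search(name)
        if match is None:
            continue
        year, month, day = map(int, match.groups())
        try:
            index_date = date(year, month, day)
        except ValueError:
            continue  # suffix looks like a date but is not a real one
        if index_date < cutoff:
            older.append(name)
    return sorted(older)


# Hypothetical index names, for illustration only.
names = ["app-logs-2020.05.30", "app-logs-2020.06.02", "settings"]
print(indices_older_than(names, date(2020, 6, 1)))
```

Indices selected this way are no longer being written to, so they could be snapshotted and restored ahead of the cutover, leaving only the most recent indices for the final snapshot.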
