Large data migration failure

Background:
I need to migrate my data from one cloud service provider to another, and the approach I picked is: add new nodes to the current cluster, wait for the data to copy and rebalance, then separate the nodes into 2 clusters.

But after I took the steps below, I hit a [CoordinationStateRejectedException]. Did I mess up some steps here?

Source cluster:
Sys: 8C16G CentOS 7 * 8
ES: 7.4.0
Cluster:
2 master nodes (master: true, data: false)
6 data nodes (master: false, data: true)
Data:
3 billion docs, over 300 GB
Indices: 6 (12 shards, 1 replica each, about 50~60 GB per index)

Dest cluster:
Sys: 8C16G CentOS 7 * 8
ES: 7.8.0
Cluster:
2 master nodes (master: true, data: false)
6 data nodes (master: false, data: true)

My steps:

  1. Put the master nodes in discovery.seed_hosts on each node of the dest cluster:
discovery.seed_hosts: ["source cluster master1", "source cluster master2", "dest cluster master1", "dest cluster master2", "dest cluster data1"...]
cluster.initial_master_nodes: ["dest cluster master1", "dest cluster master2"]
  2. Start the nodes in the dest cluster and wait for them to join successfully.
  3. Change the index setting number_of_replicas to 3.
  4. Wait for all replicas to be ready.
  5. Remove the source cluster's master IPs from discovery.seed_hosts and add the dest cluster's master nodes' IPs to cluster.initial_master_nodes on each node of the dest cluster:
discovery.seed_hosts: ["dest cluster master1", "dest cluster master2", "dest cluster data1"...]
cluster.initial_master_nodes: ["dest cluster master1", "dest cluster master2"]
  6. Restart the nodes in the dest cluster.
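The replica bump and the "wait for replicas ready" part can be sketched against the settings and cluster-health APIs. This is a minimal sketch, assuming a reachable cluster (the endpoint and index name are placeholders), not a production script:

```python
import json
import urllib.request

ES = "http://localhost:9200"  # hypothetical endpoint

def put_replicas(index: str, replicas: int) -> None:
    """PUT /<index>/_settings with the new replica count."""
    body = json.dumps({"index": {"number_of_replicas": replicas}}).encode()
    req = urllib.request.Request(
        f"{ES}/{index}/_settings",
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    urllib.request.urlopen(req)

def replicas_ready(health: dict) -> bool:
    """True once GET /_cluster/health reports green, i.e. every replica is assigned."""
    return health.get("status") == "green" and health.get("unassigned_shards", 1) == 0

# Usage sketch: put_replicas("my-index", 3), then poll
# GET /_cluster/health until replicas_ready(response) is True.
print(replicas_ready({"status": "green", "unassigned_shards": 0}))   # True
print(replicas_ready({"status": "yellow", "unassigned_shards": 12})) # False
```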

But then I hit the [CoordinationStateRejectedException]:

org.elasticsearch.transport.RemoteTransportException: [dest-node-33][dest-node:9300][internal:cluster/coordination/join/validate]
Caused by: org.elasticsearch.cluster.coordination.CoordinationStateRejectedException: join validation on cluster state with a different cluster uuid HAlwyQ3lQ8CDIDj1KpUF1Q than local cluster uuid gqStYPHkRQqKhUGdsfRFuA, rejecting

One of the masters in the dest cluster started successfully, but none of the data nodes could join the "new" cluster after that. Do we have solutions for this scenario?

I do not believe this is a supported method of migrating data. As far as I know you can not split a single cluster once it has been formed.


cheers Christian,

As I mentioned before, I assigned 4 nodes to be master-only (2 in the source and 2 in the dest cluster), and I suppose each of these 4 nodes can play the master role (only 1 at a time).

Then my plan is:

  1. Let source_master_1 be the master of all 16 nodes (8 in source and 8 in dest).
  2. After all the replicas are allocated across all 12 data nodes, remove source_master_1 from the dest cluster's nodes (which would make them lose their master).
  3. But the dest cluster still has 2 master-eligible nodes (dest_master_1 & 2).
  4. After the node-left delay, I suppose the nodes in the dest cluster would begin to vote for a new master, and of course we have 2 candidates.
  5. No matter which dest_master wins, the dest cluster should then move into the data recovery step.

But I found that the dest cluster didn't recover, as the data nodes in the dest cluster couldn't join the cluster held by dest_master_1.

cheers Steve

Actually the data in the big cluster had been evenly distributed, so the latency seems fine to me so far.

I've tried the snapshot route: register the repository, create the snapshot, and so on... but during the restore process, there seems to be no way to monitor the restore's progress?

We do have an API to get the snapshot creation status ("IN_PROGRESS", "SUCCESS", "FAILURE"...), and we can check the size of the snapshot files...

But there isn't a clear response from the restore API, just an "accepted".

As we all know, the restore process is silent, which makes me call $index/_stats again and again to check whether the number of docs/deleted has changed.
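For what it's worth, a snapshot restore is implemented as a shard recovery, so it does show up in the index recovery API (GET /&lt;index&gt;/_recovery) with a per-shard stage and bytes percentage. A small sketch of parsing that response follows; the index name and sample payload are made up, field paths are per the 7.x recovery API:

```python
def restore_progress(recovery: dict, index: str) -> list:
    """Extract (shard id, stage, bytes percent) for each shard in a
    GET /<index>/_recovery response."""
    out = []
    for shard in recovery.get(index, {}).get("shards", []):
        out.append((shard["id"], shard["stage"], shard["index"]["size"]["percent"]))
    return out

# Abridged shape of a GET /my-index/_recovery response during a restore:
sample = {"my-index": {"shards": [
    {"id": 0, "stage": "INDEX", "type": "SNAPSHOT",
     "index": {"size": {"percent": "42.0%"}}},
]}}
print(restore_progress(sample, "my-index"))  # [(0, 'INDEX', '42.0%')]
```

Polling this instead of `_stats` gives a per-shard percentage rather than just a moving doc count.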

Hmm, you saw my reply? I thought I had deleted it. I had a nice long procedure written for what you wanted to do, but I was worried about the risk of data loss, which might make you feel you got bad advice on the Elastic forum (I don't work for Elastic) and be mad at everyone. So, cool as the idea is, I was wary of outlining how I'd do it (quite a few steps, and you'd lose your source cluster as soon as you split).

On restore progress, I don't know. I'd think you'd see the indices created as it goes, the datastore size being used, etc. It's a segment restore, so you can compare segment counts in your source & destination clusters to get some idea of how far it's gotten (segments can be different sizes, so it's not a true data %, but it would give an idea, I'd think).

Yeah, actually your answer will be automatically deleted soon.

(post withdrawn by author, will be automatically deleted in 24 hours unless flagged)

No worries about the data loss; I have several plans for this migration: snapshot, elasticsearch-dump, and so on. I've actually written a blog post introducing the ways to do data migration (on a Chinese tech blog site :smiley:).

This migration plan (new nodes join, then split into 2 clusters) is one of my options, and the one I prefer because of its convenience and observability.

The checkpoints in this strategy, for me, are:

  1. All nodes in both clusters are temporarily in the same cluster.
  2. The shards have been allocated evenly.
  3. (After the split) the 2 clusters start to recover the data.
  4. (After changing the number of replicas back to 1) the indices' status goes back to green.
  5. The doc counts and storage in both clusters are the same.
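Checkpoint 5 can be automated by comparing per-index doc counts (e.g. from GET /&lt;index&gt;/_count) on both sides. A toy sketch, with made-up index names and counts:

```python
def counts_match(src_counts: dict, dst_counts: dict) -> bool:
    """True when both clusters report the same indices with the same doc counts."""
    return src_counts.keys() == dst_counts.keys() and all(
        src_counts[i] == dst_counts[i] for i in src_counts
    )

# Maps built by calling GET /<index>/_count on each cluster, per index:
src = {"logs-1": 500_000_000, "logs-2": 480_000_000}
dst = {"logs-1": 500_000_000, "logs-2": 480_000_000}
print(counts_match(src, dst))  # True
```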

If I went with elasticsearch-dump (scroll + bulk insert) or snapshot (+ restore), I would need to pay more attention to the whole process and troubleshoot any errors that happened; and in that case it would be quite difficult for co-workers to take this work over if they weren't familiar with ES.
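For reference, the core of the scroll + bulk approach is just reshaping scroll hits into _bulk NDJSON. A minimal sketch (the sample hit is made up; a real tool also handles scroll paging, retries, and errors):

```python
import json

def hits_to_bulk(hits: list) -> str:
    """Turn a page of scroll hits into _bulk index actions (NDJSON):
    one action line followed by one source line per document."""
    lines = []
    for h in hits:
        lines.append(json.dumps({"index": {"_index": h["_index"], "_id": h["_id"]}}))
        lines.append(json.dumps(h["_source"]))
    return "\n".join(lines) + "\n"

# One hit as returned in a scroll response's hits.hits array:
page = [{"_index": "logs-1", "_id": "1", "_source": {"msg": "hi"}}]
print(hits_to_bulk(page))  # two NDJSON lines: action, then source
```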

So I would be quite grateful for some tips on this migration plan :grin:

If you are okay with the data loss risks (i.e. you have a good snapshot) AND the fact that once you split, the source cluster will die & not be restartable due to the way cluster quorums work (this is a big requirement and the only way I know to do this), I'll post my procedure for this; yours is not enough and there are some problems that will prevent it from working.

Yep, it's okay for me to handle the risk of data loss, 'cause I've backed up the data and have emergency plans for up to 30% data loss.

So maybe you can post some guidance, or just give me some tips and point out my mistakes?

Your step 3 will not work, as you cannot split a single cluster into 2. If you want to end up with 2 clusters you need to start a separate cluster and restore a snapshot from the first cluster.

The process you describe might have worked (not at all sure) in earlier versions, but will not work from Elasticsearch 7.x onwards.


As Christian notes, you cannot split a cluster. If you want to do this process AND keep a cluster on both sides, the only way is to create the new one from snapshots; V7 simply will not allow this, on purpose.

My 15+ step procedure works by creating an HA cluster across your clouds (if latency doesn't kill it) and breaking the cross-cloud connection, which will instantly & permanently kill your old cluster as it won't have enough masters to operate.

It's also a complex process involving starting / stopping masters, allocation awareness, etc. and while I think it's cool; it also has lots of important steps and is risky to your current data/cluster due to inter-cloud bandwidth & latency.

Much as I like it, seems not a good idea, and if you want to retain your old cluster, impossible anyway. Suggest you try with the snapshots since you have them anyway :wink:

I think my steps are similar to this scenario, though there are some differences.

  1. I started a cluster with 2 master and 4 data nodes.
  2. These nodes are in two groups, hosted by 2 clouds.
  3. The shards & replicas are evenly distributed across all the data nodes.
  4. One of the groups dropped out, maybe because the cross-cloud connection went down or something.

So shouldn't the survivors (1 master, 2 data) on each side try to recover the whole cluster?

It looks like ES might be unable to handle a split brain :frowning:

No. The correct way to handle a split cluster is to ensure that at most one side continues to operate and take new data. If both were allowed to continue operating you would have a split-brain problem and silently lose data, as Elasticsearch cannot merge shards once the cluster comes back together.

In order for any part of the cluster to stay available you need an odd number of master eligible nodes so that one side can have a majority of master nodes and therefore elect a master and continue operating while the other side is not able to form a majority.

Please read this section in the docs for further details.

Even worse (and Christian can correct me if I'm wrong), when you 'separate' a two-master cluster, you'll lose BOTH sides, as neither has >50% of the original set of 2 masters; this is why having two masters is not a good idea on any system, as losing one can kill the cluster.

So for any 'split' to work, you need, for example, 3 on one side and 2 on the other, then the one with 3 will keep working and the one with 2 will stop. There is no mathematical way in ES V7 to split and keep both sides as for any 'piece' to keep working it must have > 50% of the masters; by definition two pieces cannot do that, hence one or both dies.
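The quorum arithmetic above can be made concrete with a toy calculation (a sketch of the majority rule only, not of how ES 7's voting-configuration logic is actually implemented):

```python
def majority(master_eligible: int) -> int:
    """Smallest number of master-eligible nodes that forms a quorum."""
    return master_eligible // 2 + 1

def side_survives(side_masters: int, total_masters: int) -> bool:
    """A partition keeps operating only if it holds a strict majority
    of the original master-eligible nodes."""
    return side_masters >= majority(total_masters)

# 2 masters split 1/1: neither side survives
print(side_survives(1, 2))  # False
# 4 masters split 2/2 (the thread's 2+2 setup): neither side survives
print(side_survives(2, 4))  # False
# 5 masters split 3/2: only the 3-node side survives
print(side_survives(3, 5), side_survives(2, 5))  # True False
```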


I have to say I find this entire discussion enlightening and fascinating.... :slight_smile:

And at the same time, we are talking about 300 GB of data across 6 indices... you should be able to migrate that in a day or two with Logstash and a decent network pipe. It's a little more complicated than that (a delta pass at the end, etc.), I'm sure, but I just helped someone move 50 GB with no problem... with 6 indices, you could move them 1 at a time, or even 2 or more in parallel...
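For the Logstash route, an elasticsearch-to-elasticsearch pipeline per index might look like the following sketch; the hosts and index name are placeholders, and it assumes the elasticsearch input plugin's docinfo metadata fields for preserving index and document IDs:

```
input {
  elasticsearch {
    hosts   => ["http://source-cluster:9200"]
    index   => "my-index"
    docinfo => true    # keep _index/_id in [@metadata]
  }
}
output {
  elasticsearch {
    hosts       => ["http://dest-cluster:9200"]
    index       => "%{[@metadata][_index]}"
    document_id => "%{[@metadata][_id]}"
  }
}
```

Run one pipeline per index (or a few in parallel), then do a delta pass at the end for documents written during the copy.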
