Force joining cluster after network error

Three of my nodes were on a different switch and had a network problem, and now I have two clusters. What should I do to make these three nodes join the real cluster again?

How many master-eligible nodes do you have? Have you set discovery.zen.minimum_master_nodes correctly according to these guidelines?

All 31 nodes are master-eligible, and the replica count is 0.
Oops, I didn't set minimum_master_nodes.

In that scenario it should be set to 16, which is a strict majority of the 31 master-eligible nodes (31 / 2, rounded down, plus 1). I would, however, probably recommend instead introducing 3 dedicated master nodes, leaving the rest as dedicated data nodes, and setting it to 2. This generally makes it a lot easier to scale out a cluster.
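As a rough sketch of where 16 and 2 come from: the value is the smallest strict majority of the master-eligible nodes. The helper names below (`minimum_master_nodes`, `settings_body`) are made up for illustration. The second function only builds the JSON body you would send with `PUT _cluster/settings`, assuming your version still uses Zen discovery and the `discovery.zen.minimum_master_nodes` setting.

```python
def minimum_master_nodes(master_eligible: int) -> int:
    """Smallest strict majority of the master-eligible nodes."""
    if master_eligible < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible // 2 + 1


def settings_body(master_eligible: int) -> dict:
    """Body for PUT _cluster/settings (hypothetical helper, Zen discovery only)."""
    return {
        "persistent": {
            "discovery.zen.minimum_master_nodes": minimum_master_nodes(master_eligible)
        }
    }


print(minimum_master_nodes(31))  # all 31 nodes master-eligible -> 16
print(minimum_master_nodes(3))   # 3 dedicated master nodes -> 2
print(settings_body(31))
```

With 31 master-eligible nodes, a partition of 3 nodes can never reach 16 votes, so it cannot elect its own master.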

Now I have two separate clusters. How do I make the nodes in the new, fake cluster join the real cluster?
Should I restart Elasticsearch on the three nodes?
I just unplugged the new fake master, and the other nodes joined the old cluster.
I will change minimum_master_nodes as you said.
Thanks in advance.

Tread carefully here. Any data you wrote while there were two clusters might be lost when you try and merge them together again. There's no guarantee that the data in the "real" cluster will override that in the "fake" one during the merge. It could even result in corrupt indices.

If your "real" cluster is healthy (yellow or green health) then I'd stop all the nodes that formed the "fake" cluster, wipe them, and restart them as empty nodes. At least that way they can't override the "real" data, and then you just have to work out what data needs indexing again because it was sent to the now-wiped "fake" cluster.
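Before wiping anything, it may help to confirm the "real" cluster's health from one of its own nodes. Here is a minimal sketch using only the Python standard library. It assumes the node answers plain HTTP without authentication; the base URL in the usage comment is a placeholder, and both function names are hypothetical.

```python
import json
from urllib.request import urlopen


def fetch_health(base_url: str) -> dict:
    """Call GET _cluster/health on one node and decode the JSON reply."""
    with urlopen(f"{base_url}/_cluster/health", timeout=10) as resp:
        return json.load(resp)


def real_cluster_is_healthy(health: dict) -> bool:
    """True when the reported status is yellow or green, as suggested above."""
    return health.get("status") in ("yellow", "green")


# Usage against a node of the "real" cluster (placeholder address):
#   health = fetch_health("http://localhost:9200")
#   print(health["status"], health["number_of_nodes"])
#   print(real_cluster_is_healthy(health))
```

If this reports red, some primary shards are missing from the "real" cluster, and wiping the other nodes would discard the only copies, so stop and investigate first.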
