Elasticsearch node reads previous cluster UUID from data folder even when it is re-configured to join another cluster

Here is the scenario.

In brief:
When a single node is bootstrapped, it forms a single-node cluster and stores the cluster UUID in its data folder. Even if its YAML file is later re-configured to join a multi-node cluster, it still refers to the old cluster UUID in the data folder and does not join the new cluster.

In detail:
Create a single node (say node-x) with the configuration below and start it:

cluster.name: elasticsearch_test
node.name: node-x
path.data: C:\ProgramData\Elastic\Elasticsearch\data
path.logs: C:\ProgramData\Elastic\Elasticsearch\logs
http.port: 9200
network.host: 127.0.0.1
transport.tcp.port: 9300

Now set up two more nodes (node-2 and node-3) with configurations like the one below to form a cluster:

cluster.name: elasticsearch_test
node.name: node-2
path.data: C:\ProgramData\Elastic\Elasticsearch_node_2\data
path.logs: C:\ProgramData\Elastic\Elasticsearch_node_2\logs
http.port: 9201
network.host: 127.0.0.1
discovery.seed_hosts: ["127.0.0.1:9302","127.0.0.1:9300","127.0.0.1:9301"]
cluster.initial_master_nodes: ["node-x", "node-2", "node-3"]
transport.tcp.port: 9301

You will observe that node-x is not yet included in the elasticsearch_test cluster:
http://localhost:9201/_cat/nodes

Now add the properties below to the configuration of node-x and restart node-x to make it join the elasticsearch_test cluster.

discovery.seed_hosts: ["127.0.0.1:9302","127.0.0.1:9300","127.0.0.1:9301"]
cluster.initial_master_nodes: ["node-x", "node-2", "node-3"]

You will observe that node-x has formed its own cluster, referring to the UUID of its previous cluster in the data folder, and has not joined the elasticsearch_test cluster.
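One way to confirm that the nodes really are in different clusters is to compare the `cluster_uuid` each node reports on its root HTTP endpoint (`GET /`). The following is only a minimal sketch: the helper names `fetch_cluster_uuid` and `group_by_cluster_uuid` are made up for illustration, and it assumes the nodes are reachable over plain HTTP without authentication, as in the local setup above.

```python
import json
from collections import defaultdict
from urllib.request import urlopen


def fetch_cluster_uuid(base_url, timeout=5):
    """Return the cluster_uuid reported by a node's root endpoint (GET /).

    Assumes plain HTTP with no authentication (hypothetical local setup).
    """
    with urlopen(base_url, timeout=timeout) as response:
        return json.load(response)["cluster_uuid"]


def group_by_cluster_uuid(uuid_by_node):
    """Group node names by the cluster UUID each one reports.

    More than one key in the result means more than one cluster has formed.
    """
    groups = defaultdict(list)
    for node, uuid in uuid_by_node.items():
        groups[uuid].append(node)
    return {uuid: sorted(nodes) for uuid, nodes in groups.items()}


# Example usage (needs the nodes to be running; ports taken from the post):
# nodes = {"node-x": "http://127.0.0.1:9200", "node-2": "http://127.0.0.1:9201"}
# uuids = {name: fetch_cluster_uuid(url) for name, url in nodes.items()}
# print(group_by_cluster_uuid(uuids))
```

If node-x ends up under a different UUID from node-2, it bootstrapped its own cluster.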

Solution: delete the contents of node-x's data folder and restart node-x. It will then join the elasticsearch_test cluster.

As per the Elasticsearch 7.x documentation:
Bootstrapping a cluster | Elasticsearch Guide [7.16] | Elastic

You must set cluster.initial_master_nodes to the same list of nodes on each node on which it is set in order to be sure that only a single cluster forms during bootstrapping and therefore to avoid the risk of data loss.

I think you can use this tool: https://www.elastic.co/guide/en/elasticsearch/reference/7.x/node-tool.html#node-tool-detach-cluster

Never tried it yet though :wink:

This is not the right time to use elasticsearch-node. It's only appropriate to use it like this when you have experienced a disaster and have no other options, such as restoring from a snapshot.

The process described above is broken. It might have seemed to work in earlier versions, but there was a risk of data loss when doing what you describe. You are forming multiple clusters, and Elasticsearch is now rightly refusing to merge them together later. The solution is quoted above:

You must set cluster.initial_master_nodes to the same list of nodes on each node on which it is set in order to be sure that only a single cluster forms during bootstrapping and therefore to avoid the risk of data loss.

You are not setting cluster.initial_master_nodes to the same list of nodes on each node, because the first time you start up node-x it is not set.

Sorry, it's actually a little more subtle than that. You are starting this first node up in development mode, which causes it to set cluster.initial_master_nodes itself. However, the fix is still the same: if you want to be sure to form a single cluster, you should set cluster.initial_master_nodes explicitly on every master-eligible node until the cluster has formed.
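To check this before the very first startup, the setting can be compared across the nodes' config files. This is a rough sketch, not an official tool: the function names are hypothetical, and the parser assumes the setting is written on one line as a flow-style list with double-quoted names (as in the configs in this thread), so it is not a general YAML reader.

```python
import json

SETTING = "cluster.initial_master_nodes"


def initial_master_nodes(config_text):
    """Return the setting as a frozenset of node names, or None if it is not set.

    Assumes a one-line value such as ["node-x", "node-2", "node-3"].
    """
    for line in config_text.splitlines():
        stripped = line.strip()
        if stripped.startswith(SETTING + ":"):
            value = stripped[len(SETTING) + 1:].strip()
            return frozenset(json.loads(value))
    return None


def group_by_initial_master_nodes(configs):
    """Map each distinct value of the setting to the nodes that use it.

    configs maps a node name to the text of its elasticsearch.yml.
    More than one key in the result means the nodes do not agree.
    """
    by_value = {}
    for node, text in configs.items():
        by_value.setdefault(initial_master_nodes(text), []).append(node)
    return by_value
```

With the original configs from this thread, node-x would fall under `None` (not set) while node-2 falls under the three-node set, which is exactly the mismatch described above.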

Many thanks @DavidTurner for fixing my lack of knowledge on this. :hugs:

This is the first time I want node-x to join another cluster, so why would I have set

cluster.initial_master_nodes

before that? My question here is very simple: why does node-x refer to the cluster details in the data folder rather than reading the configuration file again?

It would be a very bad idea to read the config file again, because this would let you perform the unsafe sequence of steps you describe in your original post and potentially lose data as a result.

cluster.initial_master_nodes is only needed the first time you start up a cluster. After that, it is ignored. You might be interested in the section entitled Safety First in this blog post for more information.
