Why is the node uuid replaced by {bootstrap-placeholder}-{NodeName}?

ES version: 7.6.0

I have 3 master nodes and 3 data nodes in my cluster.

This is the first time I have started the cluster. I configured discovery.seed_hosts and cluster.initial_master_nodes as follows:

discovery.seed_hosts: ipOfMaster1:9300,ipOfMaster2:9300,ipOfMaster3:9300
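For reference, a minimal sketch of the relevant elasticsearch.yml settings (the host addresses and node names below are hypothetical placeholders; the key point is that cluster.initial_master_nodes must be set to exactly the same node names on all three master nodes the first time they start, and removed or left unchanged afterwards):

```yaml
# Hypothetical example; substitute your own hosts and node names.
discovery.seed_hosts: ["ipOfMaster1:9300", "ipOfMaster2:9300", "ipOfMaster3:9300"]
cluster.initial_master_nodes: ["nameOfMaster1", "nameOfMaster2", "nameOfMaster3"]
```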
curl -X GET "localhost:9200/_cluster/state?filter_path=metadata.cluster_coordination.last_committed_config&pretty"

"metadata" : {
    "cluster_coordination" : {
      "last_committed_config" : [
vFexMAR6T32k3qUC2ZBahg: the uuid of master1
ZMZgi3_qSX-soQLwkQZZfA: the uuid of master2
{bootstrap-placeholder}-nameOfMaster3 : {bootstrap-placeholder}-nameOfMaster3

I used GET / to check the uuid of each node:

master1 and master2 report the same cluster_uuid, but master3's cluster_uuid is _na_.
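One way to make this check systematic is to group nodes by the cluster_uuid each reports in its GET / response; more than one non-_na_ UUID means separate clusters have formed. A small illustrative sketch (the responses and UUID strings below are hard-coded, hypothetical stand-ins for real GET / output):

```python
from collections import defaultdict

def group_by_cluster_uuid(responses):
    """Group node names by the cluster_uuid each node reported via GET /."""
    groups = defaultdict(list)
    for node, body in responses.items():
        groups[body.get("cluster_uuid", "_na_")].append(node)
    return dict(groups)

# Hypothetical GET / responses mirroring the symptom in this thread:
responses = {
    "master1": {"cluster_uuid": "uuid-of-the-formed-cluster"},
    "master2": {"cluster_uuid": "uuid-of-the-formed-cluster"},
    "master3": {"cluster_uuid": "_na_"},
}
# master1 and master2 share one UUID; master3 has not joined any cluster yet
print(group_by_cluster_uuid(responses))
```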

I found the following exception in master3's log:

Caused by: org.elasticsearch.cluster.coordination.CoordinationStateRejectedException: join validation on cluster state with a different cluster uuid d5OTAb8FTe-5iIZeHQYYVw than local cluster uuid oP_otesSTv2qItFqBHPPVw, rejecting
  1. Why was the uuid of master3 replaced by {bootstrap-placeholder}-nameOfMaster3?
  2. Why does master3 have a different cluster uuid? Is this a split-brain?
  3. Is there something wrong with my configuration?

At the time the cluster bootstrapped, it did not know the ID of master 3, so it used a placeholder that will be replaced with master 3's ID later. This is quite a low-level implementation detail that almost certainly isn't relevant to your problem.

Either you did not set cluster.initial_master_nodes to the same value on all three nodes, or you changed it after the first time you started these nodes.

Not obviously, but if you changed cluster.initial_master_nodes then that was wrong.

@DavidTurner Thanks for your reply.

Unfortunately, I did set cluster.initial_master_nodes to the same value on all three nodes, and I never changed it.

I found in master3's log that when master3 started, a third-party plugin threw an exception and the TCP connection was closed. Could this be the root cause of the problem?

 exception caught on transport layer [Netty4TcpChannel] , closing connection

No, I don't think a connection failure in a plugin could be related.

One other possibility is that your master nodes do not have storage that persists across restarts.

I saw the description of auto-bootstrapping in development mode:

If you start an Elasticsearch node without configuring these settings then it will start up in development mode and auto-bootstrap itself into a new cluster. If you start some Elasticsearch nodes on different hosts then by default they will not discover each other and will form a different cluster on each host. Elasticsearch will not merge separate clusters together after they have formed, even if you subsequently try and configure all the nodes into a single cluster. This is because there is no way to merge these separate clusters together without a risk of data loss. You can tell that you have formed separate clusters by checking the cluster UUID reported by GET / on each node. If you intended to form a single cluster then you should start again:

  • Shut down all the nodes.
  • Completely wipe each node by deleting the contents of their data folders.
  • Configure cluster.initial_master_nodes as described above.
  • Restart all the nodes and verify that they have formed a single cluster.

master3's data path is on a mounted data disk. I deleted nodes/0/_state on master3 and restarted it, and then master3 joined the cluster.

Was master3 auto-bootstrapping in development mode?

No, auto-bootstrapping is disabled if cluster.initial_master_nodes is set, which you are saying has always been the case.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.