Unable to join nodes to cluster

I'm trying to set up a cluster of 3 Elasticsearch nodes:
1 master and two data nodes.
I'm able to start all 3 of them, yet they each seem to set up their own cluster instead of joining together.

node1 (the master): 192.168.56.114
node2: 192.168.56.113
node3: 192.168.56.115

Below is the config of each node:

# node1
cluster.name: TestCluster
node.name: node1
node.master: true
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
network.host: 192.168.56.114
discovery.seed_hosts: ["192.168.56.113", "192.168.56.115"]
cluster.initial_master_nodes: ["192.168.56.114"]
xpack.security.enabled: false

# node2
cluster.name: TestCluster
node.name: node2
node.data: true
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
network.host: 192.168.56.113
discovery.seed_hosts: ["192.168.56.114", "192.168.56.115"]
cluster.initial_master_nodes: ["192.168.56.114"]
xpack.security.enabled: false

# node3
cluster.name: TestCluster
node.name: node3
node.data: true
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
network.host: 192.168.56.115
discovery.seed_hosts: ["192.168.56.113", "192.168.56.114"]
cluster.initial_master_nodes: ["192.168.56.114"]
xpack.security.enabled: false

When I run curl -X GET "192.168.56.114:9200/_cluster/health?pretty":

{
  "cluster_name" : "TestCluster",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 1,
  "number_of_data_nodes" : 1,
  "active_primary_shards" : 0,
  "active_shards" : 0,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 100.0
}

curl -X GET "192.168.56.113:9200/_cluster/health?pretty"

{
  "cluster_name" : "TestCluster",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 1,
  "number_of_data_nodes" : 1,
  "active_primary_shards" : 0,
  "active_shards" : 0,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 100.0
}

Only one node is active in the cluster. The result is the same if I run the curl against a different node: they all seem to be setting up their own cluster with one node.
If I set the two data nodes to node.master: false they still fail to join the cluster of the master node.
If I run an nmap scan I can see that 9200 is open on all 3 servers.

Is anyone able to tell me what I missed?
If you need any more info, please let me know.

Welcome to the community!

I'm guessing these are three VMs (not Docker or some other funny network situation). For my own sanity I'd assign IPs in ascending order by node (not 114, 113, 115), but it doesn't really matter.

Note 1: Why do your seed hosts list only the other two nodes (e.g. 113/115 on node1), while the initial master is just 114? Why not list all 3 IPs everywhere? As the doc says, "This setting should be a list of the addresses of all the master-eligible nodes in the cluster." So set it to the full IPs of 113, 114, 115.

Initial master nodes is usually a list of node NAMES, not IPs (yes, it's confusing, and IPs will work, but most examples use names). So ideally set it to node1, node2, node3.

Also, be sure to set node.master and node.data explicitly so it's clear: they both default to true! You are not setting node.master on node2/3, so they will both also be master-eligible. And node1 will also be a data node, as node.data is true by default.
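
To make the defaults concrete, here is a minimal sketch that resolves a node's effective roles from the legacy node.master / node.data settings used in this thread. The function name `effective_roles` and the dictionaries are hypothetical, written only for illustration; they are not part of Elasticsearch.

```python
# Sketch: resolve effective roles from the legacy node.master / node.data
# settings, assuming (as stated above) that both default to true when omitted.

def effective_roles(settings):
    """Return the roles a node ends up with, given its explicit settings."""
    return {
        "master_eligible": settings.get("node.master", True),
        "data": settings.get("node.data", True),
    }

# Role lines as written in the original configs; everything else is omitted.
node1 = {"node.master": True}   # node.data not set
node2 = {"node.data": True}     # node.master not set

for name, settings in (("node1", node1), ("node2", node2)):
    print(name, effective_roles(settings))
```

Both nodes come out as master-eligible data nodes, which is why setting the two options explicitly on every node avoids surprises.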

And nodes talk to each other on port 9300, not 9200 (which is for the REST APIs, not node-to-node transport), so verify that 9300 is open on the machine's IP (not localhost only). All nodes generally listen on both 9200 and 9300.
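
If nmap is not handy, a plain TCP connect from one node to the others gives a rough answer. This is only a sketch using the Python standard library; `port_open` and `report` are hypothetical helper names, and a successful connect shows the port is reachable, not that the nodes can actually form a cluster.

```python
import socket

def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False

def report(hosts, ports):
    """Print one line per host/port pair with the connect result."""
    for host in hosts:
        for port in ports:
            state = "open" if port_open(host, port) else "unreachable"
            print(f"{host}:{port} {state}")
```

Run it from each VM against the other two, for example `report(["192.168.56.113", "192.168.56.115"], [9200, 9300])` on the 114 machine, so the check goes over the real network interface rather than localhost.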

Hi Steve,

Thanks for your reply.
Yes, I'm using VirtualBox running three Debian servers for my test.

Based on your feedback I've changed the config file to the following:

# node1
cluster.name: TestCluster
node.name: node1
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
network.host: 192.168.56.113
discovery.seed_hosts: ["192.168.56.113", "192.168.56.114", "192.168.56.115"]
cluster.initial_master_nodes: ["node1","node2","node3"]
xpack.security.enabled: false

# node2
cluster.name: TestCluster
node.name: node2
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
network.host: 192.168.56.114
discovery.seed_hosts: ["192.168.56.113", "192.168.56.114", "192.168.56.115"]
cluster.initial_master_nodes: ["node1","node2","node3"]
xpack.security.enabled: false

# node3
cluster.name: TestCluster
node.name: node3
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
network.host: 192.168.56.115
discovery.seed_hosts: ["192.168.56.113", "192.168.56.114", "192.168.56.115"]
cluster.initial_master_nodes: ["node1","node2","node3"]
xpack.security.enabled: false

However, they still seem to set up their own individual clusters instead of joining into one cluster.

results of curl:

curl -X GET "192.168.56.113:9200/_cluster/health?pretty"

{
  "cluster_name" : "TestCluster",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 1,
  "number_of_data_nodes" : 1,
  "active_primary_shards" : 0,
  "active_shards" : 0,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 100.0
}

curl -X GET "192.168.56.114:9200/_cluster/health?pretty"

{
  "cluster_name" : "TestCluster",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 1,
  "number_of_data_nodes" : 1,
  "active_primary_shards" : 6,
  "active_shards" : 6,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 100.0
}

curl -X GET "192.168.56.115:9200/_cluster/health?pretty"

{
  "cluster_name" : "TestCluster",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 1,
  "number_of_data_nodes" : 1,
  "active_primary_shards" : 0,
  "active_shards" : 0,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 100.0
}

An nmap scan shows me that all the servers have port 9200 and 9300 open.

I also tried a setup where I explicitly state that one node is an eligible master and the other is not.
If I understand it correctly I only need to specify the master node in the discovery.seed_hosts by ip and in cluster.initial_master_nodes by name?

# node1
cluster.name: TestCluster
node.name: node1
node.master: true
node.data: false
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
network.host: 192.168.56.113
discovery.seed_hosts: ["192.168.56.113"]
cluster.initial_master_nodes: ["node1"]
xpack.security.enabled: false

# node2
cluster.name: TestCluster
node.name: node2
node.master: false
node.data: true
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
network.host: 192.168.56.114
discovery.seed_hosts: ["192.168.56.113"]
cluster.initial_master_nodes: ["node1"]
xpack.security.enabled: false

This results in the same issue: after the second node is up and running, it doesn't join the cluster of the master node.

I'm lost then; it makes no sense for them to create their own clusters with this initial setting, since preventing that is the whole point of it. BUT if they have already formed clusters, you can't undo that. This setting is only used for brand-new clusters, i.e. you have to erase their data directories so they start fresh, then bring them all up at the same time, and they should vote to form a new cluster as soon as two are up.
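
One way to confirm that the nodes really bootstrapped separate clusters is to compare the cluster_uuid each node reports at its root endpoint (GET / on port 9200). The sketch below assumes plain HTTP with security disabled, as in the configs in this thread; `fetch_root_info`, `group_by_cluster_uuid` and `check` are hypothetical helper names.

```python
import json
from collections import defaultdict
from urllib.request import urlopen

def fetch_root_info(host, port=9200, timeout=5):
    """Fetch the JSON document a node serves at its root endpoint (GET /)."""
    with urlopen(f"http://{host}:{port}/", timeout=timeout) as resp:
        return json.load(resp)

def group_by_cluster_uuid(infos):
    """Map each cluster_uuid to the sorted list of node names reporting it."""
    groups = defaultdict(list)
    for info in infos:
        groups[info["cluster_uuid"]].append(info["name"])
    return {uuid: sorted(names) for uuid, names in groups.items()}

def check(hosts):
    """Print which nodes share a cluster_uuid; more than one group is a split."""
    groups = group_by_cluster_uuid(fetch_root_info(h) for h in hosts)
    for uuid, names in groups.items():
        print(uuid, names)
    return len(groups) == 1
```

Calling `check(["192.168.56.113", "192.168.56.114", "192.168.56.115"])` would return False if the three nodes report different UUIDs, which matches the "each node is its own cluster" symptom even though cluster_name is identical everywhere.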

cluster.initial_master_nodes: ["node1","node2","node3"]

If I understand it correctly I only need to specify the master node in the discovery.seed_hosts by ip and in cluster.initial_master_nodes by name?

Correct. In this case I'd think all three go in both (if master is true on all three).

For your last case, of node2 not joining, what does it do, and what do the logs say? As above, if it has ALREADY formed its own cluster you must stop it, empty /var/lib/elasticsearch, then start it, and it should join or tell you why. With master false it sure should join that node1 cluster, from what I can see.
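
The "empty the data directory" step can be sketched as below. This is an illustration only: `empty_directory` is a hypothetical helper, it is destructive when dry_run is False, and it must only be run while Elasticsearch is stopped on that node (how you stop and start the service depends on your install and is not shown).

```python
import shutil
from pathlib import Path

def empty_directory(path, dry_run=True):
    """Remove everything inside path but keep the directory itself.

    Returns the entries that were (or, with dry_run, would be) removed.
    """
    root = Path(path)
    entries = sorted(root.iterdir())
    if not dry_run:
        for entry in entries:
            if entry.is_dir() and not entry.is_symlink():
                shutil.rmtree(entry)   # recursive delete for real directories
            else:
                entry.unlink()         # files and symlinks
    return entries
```

A dry run such as `empty_directory("/var/lib/elasticsearch")` lists what would go; passing `dry_run=False` actually deletes it. Keeping the directory itself, rather than removing and recreating it, leaves its ownership and permissions untouched.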

Sorry I can't help more, as it seems simple enough.

Thanks, removing the data in /var/lib/elasticsearch and re-installing fixed the issue.
After starting all 3 nodes at the same time they joined together as one cluster :)

Great, and yeah, once a cluster forms like that, even on one node, you have to purge and restart it from scratch, as it remembers and won't join another cluster.

Glad it's working now.
