New node is not getting added to the cluster

Hello there

I have a test cluster with one node on an AWS AMI, and I'm trying to add a new node to the same cluster. I have installed the same version of Elasticsearch as on node 1 on another AWS instance.
Below are my node 1 and node 2 configs:

yml file of node 1:
cluster.name: Newcluster
node.name: ThisIsNode1
network.host: 0.0.0.0
http.port: 9200
node.master: true

yml file of node 2 (newly added node):
cluster.name: Newcluster
node.name: ThisIsNode2
network.host: 0.0.0.0
http.port: 9200
discovery.zen.ping.unicast.hosts: ["Private ip of node1"]

Cluster health detail when running curl from each node:

Node1:
curl -X GET "private_ip_of_node1:9200/_cluster/health"
{"cluster_name":"Newcluster","status":"yellow","timed_out":false,"number_of_nodes":1,"number_of_data_nodes":1,"active_primary_shards":66,"active_shards":66,"relocating_shards":0,"initializing_shards":0,"unassigned_shards":66,"delayed_unassigned_shards":0,"number_of_pending_tasks":0,"number_of_in_flight_fetch":0,"task_max_waiting_in_queue_millis":0,"active_shards_percent_as_number":50.0}

Node2:
curl -X GET "private_ip_of_node2:9200/_cluster/health"
{"cluster_name":"Newcluster","status":"yellow","timed_out":false,"number_of_nodes":1,"number_of_data_nodes":1,"active_primary_shards":66,"active_shards":66,"relocating_shards":0,"initializing_shards":0,"unassigned_shards":66,"delayed_unassigned_shards":0,"number_of_pending_tasks":0,"number_of_in_flight_fetch":0,"task_max_waiting_in_queue_millis":0,"active_shards_percent_as_number":50.0}

Telnet from node 1 to node 2 and vice versa works fine on both ports 9200 and 9300.

Both nodes individually look good, but my concern is why the second node is not joining the cluster. Can anyone please help me out with this? I'm new to Elasticsearch.

Thanks

Since each node is master-eligible (the default for node.master is true) and has discovery.zen.minimum_master_nodes unset, each can form a cluster on its own. You should either set one of the nodes not to be master-eligible or else increase discovery.zen.minimum_master_nodes to 2.
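
As a sketch, the two options would look like this in elasticsearch.yml (pick one):

# Option A: make node 2 data-only, so node 1 is the only possible master
node.master: false                       # on node 2 only

# Option B: keep both nodes master-eligible and require both for an election
discovery.zen.minimum_master_nodes: 2    # on both nodes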

I'm also suspicious that each node already has 66 active shards on it. A newly-added node should have no shards. Did you copy the data from one node to the other? This won't work. The newly-added node must start with an empty data directory.
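
One way to check (a sketch, assuming the default DEB/RPM data path of /var/lib/elasticsearch) is to look at the new node's data directory before starting it:

# Run on the new node before starting Elasticsearch
sudo ls /var/lib/elasticsearch/nodes
# On a genuinely new node this directory should be missing or empty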

Hi David, thank you for the response.

I will change the setting as you mentioned.

Node 1 has the ELK stack installed and I'm pushing CloudTrail logs to it. I had taken an AMI of node 1 and launched the second node using this AMI, except that I uninstalled Logstash and Kibana on node 2. Apart from this I haven't done any data copy. Do I need to create a new node without using the AMI?
Thanks

I think that taking an AMI could indeed explain how the data has been copied. Ideally you should avoid taking a copy of the data like this (e.g. keep it on a separate EBS volume) but if you have done so then you should delete the data directory from the newly-launched instance before starting Elasticsearch.
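
A minimal sketch of that cleanup, assuming a DEB/RPM install with the default data path:

# On the newly-launched instance, before it joins anything
sudo systemctl stop elasticsearch
sudo rm -rf /var/lib/elasticsearch/nodes    # removes the data copied in via the AMI
sudo systemctl start elasticsearch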

Note the warning about taking filesystem snapshots: the copy in the AMI might be inconsistent or corrupt, and might be undetectably so.

Thank you!

I will create node 2 on a new machine, change the settings and update this thread.

Thanks again

Okay, I have now configured Elasticsearch on a new node; below are the details:

yml file of node 1:
cluster.name: NewCluster
node.name: ThisIsNode1
network.host: 0.0.0.0
http.port: 9200
discovery.zen.minimum_master_nodes: 2

yml file of node 2:
cluster.name: NewCluster
node.name: ThisIsNode2
network.host: 0.0.0.0
http.port: 9200
discovery.zen.ping.unicast.hosts: ["private_ip_of_node1"]
discovery.zen.minimum_master_nodes: 2

Telnet (on ports 9200 and 9300) and ping work fine between the two nodes.

curl -X GET "private_ip_of_node1:9200/_cluster/health"
{"error":{"root_cause":[{"type":"master_not_discovered_exception","reason":null}],"type":"master_not_discovered_exception","reason":null},"status":503}

curl -X GET "private_ip_of_node2:9200/_cluster/health"
{"error":{"root_cause":[{"type":"master_not_discovered_exception","reason":null}],"type":"master_not_discovered_exception","reason":null},"status":503}

Node1 logs:
[2018-10-31T10:10:56,663][WARN ][o.e.d.z.ZenDiscovery ] [ThisIsNode1] not enough master nodes discovered during pinging (found [[Candidate{node={ThisIsNode1}{OJHv2uneTbGYvYfI_u3v2A}{q97ISha1Q4e3WzIe2Gea1g}{PRIVATE_IP_OF_NODE1}{PRIVATE_IP_OF_NODE1:9300}, clusterStateVersion=-1}]], but needed [2]), pinging again
[2018-10-31T10:10:59,671][WARN ][o.e.d.z.ZenDiscovery ] [ThisIsNode1] not enough master nodes discovered during pinging (found [[Candidate{node={ThisIsNode1}{OJHv2uneTbGYvYfI_u3v2A}{q97ISha1Q4e3WzIe2Gea1g}{PRIVATE_IP_OF_NODE1}{PRIVATE_IP_OF_NODE1:9300}, clusterStateVersion=-1}]], but needed [2]), pinging again

Node2 logs:
[2018-10-31T10:07:10,099][WARN ][o.e.d.z.ZenDiscovery ] [ThisIsNode2] not enough master nodes discovered during pinging (found [[Candidate{node={ThisIsNode2}{zu6mVxXmTJue0--D1GmfNw}{7BD594t7S86x-oNrR-NVkw}{PRIVATE_IP_OF_NODE2}{PRIVATE_IP_OF_NODE2:9300}, clusterStateVersion=-1}]], but needed [2]), pinging again
[2018-10-31T10:07:13,100][WARN ][o.e.d.z.ZenDiscovery ] [ThisIsNode2] not enough master nodes discovered during pinging (found [[Candidate{node={ThisIsNode2}{zu6mVxXmTJue0--D1GmfNw}{7BD594t7S86x-oNrR-NVkw}{PRIVATE_IP_OF_NODE2}{PRIVATE_IP_OF_NODE2:9300}, clusterStateVersion=-1}]], but needed [2]), pinging again

What am I missing here? Please help.

Thanks

This should mean that node 2 will be attempting to connect to node 1, but the output you're showing indicates that these attempts are not succeeding. Are there any log messages about why this is failing?

If not, maybe try setting logger.org.elasticsearch.discovery.zen: TRACE so we can see what's going on at a lower level.
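
For reference, that would be a one-line addition to elasticsearch.yml on each node (it takes effect after a restart, and should be removed again once the problem is diagnosed):

logger.org.elasticsearch.discovery.zen: TRACE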

I just stopped the old instance (the second node that was created using the AMI of the first one), as I could see it interfering in the logs even after changing its cluster name in the yml, and the nodes have now joined the cluster:

Below is the result of curl command for checking cluster health:

Node1:
{"cluster_name":"name","status":"green","timed_out":false,"number_of_nodes":2,"number_of_data_nodes":2,"active_primary_shards":66,"active_shards":132,"relocating_shards":0,"initializing_shards":0,"unassigned_shards":0,"delayed_unassigned_shards":0,"number_of_pending_tasks":0,"number_of_in_flight_fetch":0,"task_max_waiting_in_queue_millis":0,"active_shards_percent_as_number":100.0}

Node2:
{"cluster_name":"name","status":"green","timed_out":false,"number_of_nodes":2,"number_of_data_nodes":2,"active_primary_shards":66,"active_shards":132,"relocating_shards":0,"initializing_shards":0,"unassigned_shards":0,"delayed_unassigned_shards":0,"number_of_pending_tasks":0,"number_of_in_flight_fetch":0,"task_max_waiting_in_queue_millis":0,"active_shards_percent_as_number":100.0}

I'm wondering why/how the number of shards was updated on node 2. Is this expected after a new node joins the cluster?

Thanks!

Great, it looks to be working.

Yes, Elasticsearch will allocate replicas to the new node and copy the data across automatically.
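
If you want to watch this, the _cat/shards API shows where each shard copy lives; run it against either node:

curl -X GET "private_ip_of_node1:9200/_cat/shards?v"
# each index should show its primary (p) on one node and its replica (r) on the other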

Thanks for your help, David 🙂
