New node is not getting added to the cluster


(Eva) #1

Hello there

I have a test cluster with one node running on an AWS AMI, and I'm trying to add a new node to the same cluster. I have installed the same version of Elasticsearch as on node 1 on another AWS instance.
Below are my node 1 and node 2 configs:

yml file of node 1:
cluster.name: Newcluster
node.name: ThisIsNode1
network.host: 0.0.0.0
http.port: 9200
node.master: true

yml file of node 2 (newly added node):
cluster.name: Newcluster
node.name: ThisIsNode2
network.host: 0.0.0.0
http.port: 9200
discovery.zen.ping.unicast.hosts: ["Private ip of node1"]

Cluster health detail when running curl from each node:

Node1:
curl -X GET "private_ip_of_node1:9200/_cluster/health"
{"cluster_name":"Newcluster","status":"yellow","timed_out":false,"number_of_nodes":1,"number_of_data_nodes":1,"active_primary_shards":66,"active_shards":66,"relocating_shards":0,"initializing_shards":0,"unassigned_shards":66,"delayed_unassigned_shards":0,"number_of_pending_tasks":0,"number_of_in_flight_fetch":0,"task_max_waiting_in_queue_millis":0,"active_shards_percent_as_number":50.0}

Node2:
curl -X GET "private_ip_of_node2:9200/_cluster/health"
{"cluster_name":"Newcluster","status":"yellow","timed_out":false,"number_of_nodes":1,"number_of_data_nodes":1,"active_primary_shards":66,"active_shards":66,"relocating_shards":0,"initializing_shards":0,"unassigned_shards":66,"delayed_unassigned_shards":0,"number_of_pending_tasks":0,"number_of_in_flight_fetch":0,"task_max_waiting_in_queue_millis":0,"active_shards_percent_as_number":50.0}

telnet from node 1 to node 2 and vice versa works fine on both ports 9200 and 9300.

Each node looks fine individually, but my concern is why the second node is not joining the cluster. Can anyone please help me out with this? I'm new to Elasticsearch.

Thanks


(David Turner) #2

Since each node is master-eligible (the default for node.master is true) and has discovery.zen.minimum_master_nodes unset, they each can form their own cluster. You should either set one of the nodes not to be master-eligible or else increase discovery.zen.minimum_master_nodes to 2.
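To illustrate the two options as config fragments (a sketch against the zen-discovery settings shown in this thread; pick one approach, don't mix them):

```yaml
# Option 1: make node 2 not master-eligible, so only node 1 can become master
# (goes in node 2's elasticsearch.yml)
node.master: false

# Option 2: keep both nodes master-eligible and require both to elect a master
# (goes in BOTH nodes' elasticsearch.yml)
discovery.zen.minimum_master_nodes: 2
```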

I'm also suspicious that each node already has 66 active shards on it. A newly-added node should have no shards. Did you copy the data from one node to the other? This won't work. The newly-added node must start with an empty data directory.


(Eva) #3

Hi David, thank you for the response.

I will change the setting as you mentioned.

Node 1 has the ELK stack installed and I'm pushing CloudTrail logs to it. I took an AMI of node 1 and launched the second node from that AMI, except that I uninstalled Logstash and Kibana on node 2. Apart from that I haven't copied any data. Do I need to create a new node without using the AMI?
Thanks


(David Turner) #4

I think that taking an AMI could indeed explain how the data has been copied. Ideally you should avoid taking a copy of the data like this (e.g. keep it on a separate EBS volume) but if you have done so then you should delete the data directory from the newly-launched instance before starting Elasticsearch.
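For example, on the newly-launched instance you could clear out the copied data before the first start. This is only a sketch: the path below assumes the default package layout, so check the path.data setting in your elasticsearch.yml first.

```shell
sudo systemctl stop elasticsearch          # make sure Elasticsearch is not running
sudo rm -rf /var/lib/elasticsearch/nodes   # delete the copied data (default path.data)
sudo systemctl start elasticsearch         # start with an empty data directory
```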

Note the warning about taking filesystem snapshots: the copy in the AMI might be inconsistent or corrupt, and might be undetectably so.


(Eva) #5

Thank you!

I will create node 2 on a new machine, change the settings and update this thread.

thanks again


(Eva) #6

Okay, I have now configured ES on the new node, and below are the details:

Node 1:
cluster.name: NewCluster
node.name: ThisIsNode1
network.host: 0.0.0.0
http.port: 9200
discovery.zen.minimum_master_nodes: 2

Node 2:
cluster.name: NewCluster
node.name: ThisIsNode2
network.host: 0.0.0.0
http.port: 9200
discovery.zen.ping.unicast.hosts: ["private_ip_of_node1"]
discovery.zen.minimum_master_nodes: 2

telnet (on 9200 and 9300) and ping work fine between both nodes.

curl -X GET "private_ip_of_node1:9200/_cluster/health"
{"error":{"root_cause":[{"type":"master_not_discovered_exception","reason":null}],"type":"master_not_discovered_exception","reason":null},"status":503}

curl -X GET "private_ip_of_node2:9200/_cluster/health"
{"error":{"root_cause":[{"type":"master_not_discovered_exception","reason":null}],"type":"master_not_discovered_exception","reason":null},"status":503}

Node1 logs:
[2018-10-31T10:10:56,663][WARN ][o.e.d.z.ZenDiscovery ] [ThisIsNode1] not enough master nodes discovered during pinging (found [[Candidate{node={ThisIsNode1}{OJHv2uneTbGYvYfI_u3v2A}{q97ISha1Q4e3WzIe2Gea1g}{PRIVATE_IP_OF_NODE1}{PRIVATE_IP_OF_NODE1:9300}, clusterStateVersion=-1}]], but needed [2]), pinging again
[2018-10-31T10:10:59,671][WARN ][o.e.d.z.ZenDiscovery ] [ThisIsNode1] not enough master nodes discovered during pinging (found [[Candidate{node={ThisIsNode1}{OJHv2uneTbGYvYfI_u3v2A}{q97ISha1Q4e3WzIe2Gea1g}{PRIVATE_IP_OF_NODE1}{PRIVATE_IP_OF_NODE1:9300}, clusterStateVersion=-1}]], but needed [2]), pinging again

Node2 logs:
[2018-10-31T10:07:10,099][WARN ][o.e.d.z.ZenDiscovery ] [ThisIsNode2] not enough master nodes discovered during pinging (found [[Candidate{node={ThisIsNode2}{zu6mVxXmTJue0--D1GmfNw}{7BD594t7S86x-oNrR-NVkw}{PRIVATE_IP_OF_NODE2}{PRIVATE_IP_OF_NODE2:9300}, clusterStateVersion=-1}]], but needed [2]), pinging again
[2018-10-31T10:07:13,100][WARN ][o.e.d.z.ZenDiscovery ] [ThisIsNode2] not enough master nodes discovered during pinging (found [[Candidate{node={ThisIsNode2}{zu6mVxXmTJue0--D1GmfNw}{7BD594t7S86x-oNrR-NVkw}{PRIVATE_IP_OF_NODE2}{PRIVATE_IP_OF_NODE2:9300}, clusterStateVersion=-1}]], but needed [2]), pinging again

What am I missing here? Please help.

Thanks


(David Turner) #7

That setting should mean that node 2 is attempting to connect to node 1, but the output you're showing indicates that these attempts are not succeeding. Are there any log messages about why this is failing?

If not, maybe try setting logger.org.elasticsearch.discovery.zen: TRACE so we can see what's going on at a lower level.
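For reference, that logger setting is just another line in elasticsearch.yml; a fragment for either node might look like the following (TRACE is very verbose, so revert it to the default once the problem is diagnosed):

```yaml
logger.org.elasticsearch.discovery.zen: TRACE
```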


(Eva) #8

I just stopped the old instance (the second node that was created using the AMI of the first one), as I could see it interfering in the logs even after changing its cluster name in the yml, and the nodes have now joined the cluster:

Below is the result of curl command for checking cluster health:

node1:
{"cluster_name":"name","status":"green","timed_out":false,"number_of_nodes":2,"number_of_data_nodes":2,"active_primary_shards":66,"active_shards":132,"relocating_shards":0,"initializing_shards":0,"unassigned_shards":0,"delayed_unassigned_shards":0,"number_of_pending_tasks":0,"number_of_in_flight_fetch":0,"task_max_waiting_in_queue_millis":0,"active_shards_percent_as_number":100.0}

node 2:
{"cluster_name":"name","status":"green","timed_out":false,"number_of_nodes":2,"number_of_data_nodes":2,"active_primary_shards":66,"active_shards":132,"relocating_shards":0,"initializing_shards":0,"unassigned_shards":0,"delayed_unassigned_shards":0,"number_of_pending_tasks":0,"number_of_in_flight_fetch":0,"task_max_waiting_in_queue_millis":0,"active_shards_percent_as_number":100.0}

I'm wondering why/how the number of shards was updated on node 2. Is this expected after a new node joins the cluster?

Thanks!


(David Turner) #9

Great, it looks to be working.

Yes, Elasticsearch will allocate replicas to the new node and copy the data across automatically.
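If you want to watch this happening, the _cat APIs show where each shard ended up. A sketch, substituting one of your own nodes' addresses:

```shell
# List every shard with its type (p = primary, r = replica) and the node holding it
curl -s "private_ip_of_node1:9200/_cat/shards?v"

# Summarize shard counts and disk usage per node
curl -s "private_ip_of_node1:9200/_cat/allocation?v"
```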


(Eva) #10

Thanks for your help David :slight_smile:


(system) #11

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.