Not able to join a node in cluster

Hi, I am not able to join a simple data node (running on one server) to an existing cluster (master running on another server) within the same network. Please check my config files below.

elasticsearch.yml (for the master node)

cluster.name: kafka-connector
node.name: node-1
node.master : true
node.data : false
path.data: /home/user/Kafka-ELK/KafkaElkConnector/data/elasticsearch
path.logs: /home/user/Kafka-ELK/KafkaElkConnector/data/elasticsearch-logs
bootstrap.memory_lock: true
network.host: "10.0.0.103"
http.port: 9200
transport.host: localhost
transport.tcp.port: 9300
discovery.zen.ping.unicast.hosts: ["10.0.0.103","10.0.0.114"]

elasticsearch.yml (on another server, for the data node)

cluster.name: kafka-connector

node.name: node-2
node.master : false
node.data : true

path.data: /home/user/Kafka-ELK/KafkaElkConnector/data/elasticsearch

path.logs: /home/user/Kafka-ELK/KafkaElkConnector/data/elasticsearch-logs
bootstrap.memory_lock: true
network.host: "10.0.0.114"
http.port: 9201

transport.host: localhost
transport.tcp.port: 9301
discovery.zen.ping.unicast.hosts: ["10.0.0.103"]

My cluster health does not show any data nodes:

{
"cluster_name" : "kafka-connector",
"status" : "red",
"timed_out" : false,
"number_of_nodes" : 1,
"number_of_data_nodes" : 0,
"active_primary_shards" : 0,
"active_shards" : 0,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 12,
"delayed_unassigned_shards" : 0,
"number_of_pending_tasks" : 0,
"number_of_in_flight_fetch" : 0,
"task_max_waiting_in_queue_millis" : 0,
"active_shards_percent_as_number" : 0.0
}

discovery.zen.ping.unicast.hosts: ["10.0.0.103","10.0.0.114"]
This needs to be added on node-2 as well.
Also, transport.host should not be set to localhost.

Do you mean that the transport.host property is no longer needed?

How would that be possible without transport.host? The node shuts down automatically if I start it with anything other than localhost in transport.host. Please check out the logs below:

[2017-11-09T11:44:37,142][INFO ][o.e.b.BootstrapChecks ] [node-1] bound or publishing to a non-loopback or non-link-local address, enforcing bootstrap checks
[2017-11-09T11:44:37,144][ERROR][o.e.b.Bootstrap ] [node-1] node validation exception
[3] bootstrap checks failed
[1]: max file descriptors [4096] for Elasticsearch process is too low, increase to at least [65536]
[2]: memory locking requested for Elasticsearch process but memory is not locked
[3]: max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]
[2017-11-09T11:44:37,145][INFO ][o.e.n.Node ] [node-1] stopping ...
[2017-11-09T11:44:37,159][INFO ][o.e.n.Node ] [node-1] stopped
[2017-11-09T11:44:37,159][INFO ][o.e.n.Node ] [node-1] closing ...
[2017-11-09T11:44:37,165][INFO ][o.e.n.Node ] [node-1] closed

[1]: max file descriptors [4096] for elasticsearch process is too low, increase to at least [65536]
[2]: memory locking requested for elasticsearch process but memory is not locked
[3]: max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]
These are three separate errors.
You need to change these settings at the system level...
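Before restarting the node, the three limits behind those errors can be checked with a small read-only shell sketch (Linux only; the thresholds are taken from the error messages above, and nothing is changed by running it):

```shell
# Read-only check of the three limits behind the bootstrap errors above.
nofile=$(ulimit -n)                          # max open file descriptors
map_count=$(cat /proc/sys/vm/max_map_count)  # max virtual memory areas
memlock=$(ulimit -l)                         # max locked-in-memory size

echo "nofile=$nofile (Elasticsearch needs >= 65536)"
echo "vm.max_map_count=$map_count (Elasticsearch needs >= 262144)"
echo "memlock=$memlock (needs to be unlimited for bootstrap.memory_lock)"
```

If any of the three printed values is below its threshold, the corresponding bootstrap check will fail again on the next start.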

You have different problems:

If you change transport.tcp.port: 9301, then you need to change discovery.zen.ping.unicast.hosts: ["10.0.0.103","10.0.0.114:9301"].

But keep as many default values as possible. I'd simplify the configuration like this:

Node 1:

cluster.name: kafka-connector
node.name: node-1
node.master : true
node.data : false
path.data: /home/user/Kafka-ELK/KafkaElkConnector/data/elasticsearch
path.logs: /home/user/Kafka-ELK/KafkaElkConnector/data/elasticsearch-logs
bootstrap.memory_lock: true
network.host: "10.0.0.103"
discovery.zen.ping.unicast.hosts: ["10.0.0.103"]

And Node2:

cluster.name: kafka-connector
node.name: node-2
node.master : false
node.data : true
path.data: /home/user/Kafka-ELK/KafkaElkConnector/data/elasticsearch
path.logs: /home/user/Kafka-ELK/KafkaElkConnector/data/elasticsearch-logs
bootstrap.memory_lock: true
network.host: "10.0.0.114"
discovery.zen.ping.unicast.hosts: ["10.0.0.103"]

Note that the data-only node does not have to be declared in discovery.zen.ping.unicast.hosts; only master-eligible nodes need to be listed there.

Then the last problem is bootstrap.

Please read https://www.elastic.co/guide/en/elasticsearch/reference/current/bootstrap-checks.html

@dadoonet @zqc0512 Oops, after a lot of debugging I got the cluster working and was able to add the nodes.

See the three errors. These are configured at the system level: edit /etc/security/limits.conf and use the sysctl command to change them.

Hi Ranjith,
I'm experiencing the same issue you already resolved.
I just started using ES recently on two RHEL servers.
Could you please share your two final yml files, and whatever else you did, maybe at the system level, to make your two-node cluster work properly?
I appreciate your help.

What is the issue you are facing? That will make it clear to me.

I'm running a two-node cluster on separate RHEL servers. One node has the master and data roles, the second one has the data role. The nodes start but do not join the cluster. When I commented out "transport.host: localhost", the first node failed to start, and the ES log file shows that it started and then failed the bootstrap checks:
- max file descriptor 4096, increase to 65536
- max virtual memory vm.max_map_count 65530, increase to 262144
- system call filters failed to install, fix or disable it
Could you tell me please:
* What exactly did you do to pass the ES bootstrap checks and make these three messages disappear?
* Or maybe you changed something else?
* My two yml files are now like your initial yml files, but to my understanding you changed them too, so what are your modified yml files that allow the ES cluster to work correctly?
* Which operating system are your ES instances running on?
Thanks in advance

Add bootstrap.system_call_filter: false to both of your yml files, list only the master node's host in discovery.zen.ping.unicast.hosts in both files, and do not change the default transport port; leave it at 9300. transport.host must point to the IP address of your system (for example 10.0.0.101).
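Put together, that advice amounts to a data-node elasticsearch.yml along these lines (the IP addresses are the ones used earlier in this thread, so adjust them to your hosts; note that bootstrap.system_call_filter: false disables a safety check and should only be used when the filter cannot be installed):

```yaml
cluster.name: kafka-connector
node.name: node-2
node.master: false
node.data: true
bootstrap.memory_lock: true
bootstrap.system_call_filter: false
network.host: "10.0.0.114"
transport.host: "10.0.0.114"
# only the master-eligible node is listed here
discovery.zen.ping.unicast.hosts: ["10.0.0.103"]
```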

increase the limits by running the following command as root
sysctl -w vm.max_map_count=262144
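Note that sysctl -w only changes the value until the next reboot. To make it permanent, the same key can be added to /etc/sysctl.conf (and applied immediately with sysctl -p):

```
# /etc/sysctl.conf — read at boot; apply without rebooting via `sysctl -p`
vm.max_map_count=262144
```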

This is configured at the system level in /etc/security/limits.conf (use the sysctl command for the kernel setting above). Add the lines below at the end of the file:

* soft nofile 64000
* hard nofile 64000
* soft nproc 63000
* hard nproc 63000

yourusername soft nofile 65536
yourusername hard nofile 65536
* nproc 2048

Make sure to start the Elasticsearch node as the username given in /etc/security/limits.conf.

Exit the CLI session and log back in if you are accessing a virtual machine; otherwise reboot the system to apply the changes.

Thanks.

Thanks for the response and help, Ranjith.
Your guidelines are very valuable.
I will apply them and test the ES multi-node cluster.

As far as I know, changes in /etc/security/limits.conf need a machine reboot.

Will do it. Thanks.

Following your advice guys, I'm able to run a two-node cluster on separate RHEL servers. The only thing I'm not sure about is bootstrap.memory_lock: true in the elasticsearch.yml files. If I comment it out entirely, my multi-node cluster works well. If I uncomment it, the node does not pass the bootstrap check validation. The error is: bootstrap checks failed [1]: memory locking requested for elasticsearch process but memory is not locked ... stopped ... closed.
Should I comment bootstrap.memory_lock: true out in prod, or is it needed and does something else need to be done at the system level?
I appreciate your advice.
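For reference, the memory-lock bootstrap check usually passes once the account running Elasticsearch is allowed to lock unlimited memory, for example with entries like these in /etc/security/limits.conf (the username elasticsearch here is an assumption; substitute whatever account starts the node, and re-login or reboot afterwards as discussed above):

```
# /etc/security/limits.conf — allow the Elasticsearch user to lock memory
elasticsearch soft memlock unlimited
elasticsearch hard memlock unlimited
```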

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.