Getting failed to send join request to master


(TheRebel) #1

Hi All,

I have installed Elasticsearch on three machines ["10.135.144.XX","10.182.197.XX","10.182.197.XX"], where 10.135.144.XX is the master node and the two 10.182.197.XX machines are data nodes.
Whenever I try to form an Elasticsearch cluster, I get the connection error shown in the console output below. Please let me know what I am missing here.

Master elasticsearch.yml (10.135.144.XX)

bootstrap.system_call_filter: false
cluster.name: swd_cluster
node.name: swd_master
node.master: true
node.data: false
action.auto_create_index: ".security*,.monitoring*,.watches,.triggered_watches,.watcher-history*,.ml*"
network.host: 0.0.0.0
transport.tcp.compress: true
transport.tcp.port: 9300
discovery.zen.minimum_master_nodes: 1
network.publish_host: 10.135.144.XX
network.bind_host: 10.135.144.XX
path.data: /home/elasticsearch-6.3.2/data
path.logs: /home/elasticsearch-6.3.2/logs
discovery.zen.ping.unicast.hosts: ["10.135.144.XX","10.182.197.XX","10.182.197.XX"]

Node 1 elasticsearch.yml (10.182.197.XX)

cluster.name: swd_cluster
node.name: swd_node1
node.master: false
node.data: true
path.data: /var/work/elasticsearch-6.3.2/data
path.logs: /var/work/elasticsearch-6.3.2/logs
discovery.zen.ping.unicast.hosts: ["10.135.144.XX","10.182.197.XX","10.182.197.XX"]
bootstrap.system_call_filter: false
action.auto_create_index: ".security*,.monitoring*,.watches,.triggered_watches,.watcher-history*,.ml*"
network.host: 0.0.0.0
transport.tcp.compress: true
transport.tcp.port: 9300
discovery.zen.minimum_master_nodes: 1
network.publish_host: 10.182.197.XX
network.bind_host: 10.182.197.XX

Node 2 elasticsearch.yml (10.182.197.XX)

cluster.name: swd_cluster
node.name: swd_node2
node.master: false
node.data: true
path.data: /var/work/elasticsearch-6.3.2/data
path.logs: /var/work/elasticsearch-6.3.2/logs
discovery.zen.ping.unicast.hosts: ["10.135.144.XX", "10.182.197.XX", "10.182.197.XX"]
bootstrap.system_call_filter: false
action.auto_create_index: ".security*,.monitoring*,.watches,.triggered_watches,.watcher-history*,.ml*"
network.host: 0.0.0.0
transport.tcp.compress: true
transport.tcp.port: 9300
discovery.zen.minimum_master_nodes: 1
network.publish_host: 10.182.197.XX
network.bind_host: 10.182.197.XX

Master Node console

[2018-08-06T15:45:44,188][INFO ][o.e.c.s.MasterService ] [swd_master] zen-disco-elected-as-master ([0] nodes joined)[, ], reason: new_master {swd_master}{o_wGv_i1RRuKyD04AHKAWg}{x6a0-Wl_TE29B7KFMu7TrQ}{10.135.144.XX}{10.135.144.XX:9300}{ml.machine_memory=8253767680, xpack.installed=true, ml.max_open_jobs=20, ml.enabled=true}
[2018-08-06T15:45:44,196][INFO ][o.e.c.s.ClusterApplierService] [swd_master] new_master {swd_master}{o_wGv_i1RRuKyD04AHKAWg}{x6a0-Wl_TE29B7KFMu7TrQ}{10.135.144.XX}{10.135.144.XX:9300}{ml.machine_memory=8253767680, xpack.installed=true, ml.max_open_jobs=20, ml.enabled=true}, reason: apply cluster state (from master [master {swd_master}{o_wGv_i1RRuKyD04AHKAWg}{x6a0-Wl_TE29B7KFMu7TrQ}{10.135.144.XX}{10.135.144.XX:9300}{ml.machine_memory=8253767680, xpack.installed=true, ml.max_open_jobs=20, ml.enabled=true} committed version [1] source [zen-disco-elected-as-master ([0] nodes joined)[, ]]])
[2018-08-06T15:45:44,214][INFO ][o.e.x.s.t.n.SecurityNetty4HttpServerTransport] [swd_master] publish_address {10.135.144.XX:9200}, bound_addresses {10.135.144.XX:9200}
[2018-08-06T15:45:44,214][INFO ][o.e.n.Node ] [swd_master] started
[2018-08-06T15:45:44,457][WARN ][o.e.x.s.a.s.m.NativeRoleMappingStore] [swd_master] Failed to clear cache for realms [[]]
[2018-08-06T15:45:44,474][INFO ][o.e.l.LicenseService ] [swd_master] license [b1695c81-aa84-490f-a81b-8f8b3ff9a16c] mode [basic] - valid
[2018-08-06T15:45:44,483][INFO ][o.e.g.GatewayService ] [swd_master] recovered [0] indices into cluster_state

NOTE: The master node's console shows nothing about connections from the data nodes node1 and node2.

Node 1 Console

[2018-08-06T15:48:42,590][INFO ][o.e.d.z.ZenDiscovery ] [swd_node1] failed to send join request to master [{swd_master}{o_wGv_i1RRuKyD04AHKAWg}{x6a0-Wl_TE29B7KFMu7TrQ}{10.135.144.XX}{10.135.144.XX:9300}{ml.machine_memory=8253767680, ml.max_open_jobs=20, xpack.installed=true, ml.enabled=true}], reason [RemoteTransportException[[swd_master][10.135.144.XX:9300][internal:discovery/zen/join]]; nested: ConnectTransportException[[swd_node1][10.182.197.XX:9300] connect_exception]; nested: IOException[Connection timed out: 10.182.197.XX/10.182.197.XX:9300]; nested: IOException[Connection timed out]; ]
[2018-08-06T15:49:06,801][INFO ][o.e.d.z.ZenDiscovery ] [swd_node1] failed to send join request to master [{swd_master}{o_wGv_i1RRuKyD04AHKAWg}{x6a0-Wl_TE29B7KFMu7TrQ}{10.135.144.XX}{10.135.144.XX:9300}{ml.machine_memory=8253767680, ml.max_open_jobs=20, xpack.installed=true, ml.enabled=true}], reason [RemoteTransportException[[swd_master][10.135.144.XX:9300][internal:discovery/zen/join]]; nested: ConnectTransportException[[swd_node1][10.182.197.XX:9300] connect_exception]; nested: IOException[Connection timed out: 10.182.197.XX/10.182.197.XX:9300]; nested: IOException[Connection timed out]; ]

Node 2 Console

[2018-08-06T15:49:20,615][INFO ][o.e.d.z.ZenDiscovery ] [swd_node2] failed to send join request to master [{swd_master}{o_wGv_i1RRuKyD04AHKAWg}{x6a0-Wl_TE29B7KFMu7TrQ}{10.135.144.XX}{10.135.144.XX:9300}{ml.machine_memory=8253767680, ml.max_open_jobs=20, xpack.installed=true, ml.enabled=true}], reason [RemoteTransportException[[swd_master][10.135.144.XX:9300][internal:discovery/zen/join]]; nested: ConnectTransportException[[swd_node2][10.182.197.XX:9300] connect_exception]; nested: IOException[Connection timed out: 10.182.197.XX/10.182.197.XX:9300]; nested: IOException[Connection timed out]; ]
[2018-08-06T15:49:44,675][INFO ][o.e.d.z.ZenDiscovery ] [swd_node2] failed to send join request to master [{swd_master}{o_wGv_i1RRuKyD04AHKAWg}{x6a0-Wl_TE29B7KFMu7TrQ}{10.135.144.XX}{10.135.144.XX:9300}{ml.machine_memory=8253767680, ml.max_open_jobs=20, xpack.installed=true, ml.enabled=true}], reason [RemoteTransportException[[swd_master][10.135.144.XX:9300][internal:discovery/zen/join]]; nested: ConnectTransportException[[swd_node2][10.182.197.XX:9300] connect_exception]; nested: IOException[Connection timed out: 10.182.197.XX/10.182.197.XX:9300]; nested: IOException[Connection timed out]; ]
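
One way to read the log lines above is to pull out the transport exceptions in the reason chain. The sketch below is only an illustration: the regular expression and the helper name `parse_transport_exceptions` are assumptions made for this example, not part of Elasticsearch. On these lines it suggests that the error was reported by swd_master (the RemoteTransportException) and that the address that timed out was the data node's own publish address (the ConnectTransportException), i.e. the master could not connect back to the data node.

```python
import re

# Matches "<Type>Exception[[<node>][<host:port>]" as it appears in the
# reason chain of the log lines above.
_TRANSPORT_EXC = re.compile(r"(\w+Exception)\[\[([^\]]+)\]\[([^\]]+)\]")


def parse_transport_exceptions(reason: str) -> list[tuple[str, str, str]]:
    """Return (exception type, node name, address) for each transport exception."""
    return _TRANSPORT_EXC.findall(reason)


# Reason chain copied from the swd_node1 console above (addresses are masked).
reason = (
    "RemoteTransportException[[swd_master][10.135.144.XX:9300]"
    "[internal:discovery/zen/join]]; nested: "
    "ConnectTransportException[[swd_node1][10.182.197.XX:9300] connect_exception]; "
    "nested: IOException[Connection timed out: 10.182.197.XX/10.182.197.XX:9300]; "
    "nested: IOException[Connection timed out]; "
)

for exc_type, node, address in parse_transport_exceptions(reason):
    print(f"{exc_type}: node={node} address={address}")
```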

NOTE: I have tried all the solutions I could find on this forum and on Google, but I am still facing the issue. Please help.


(TheRebel) #2

Is anyone active on this forum?


(David Pilato) #3

Read this and specifically the "Also be patient" part.

It's fine to answer on your own thread after 2 or 3 days (not including weekends) if you don't have an answer.


(TheRebel) #4

Thanks for your kind reply. Do you mean people should wait 2-3 days to get an answer?


(David Pilato) #5

Sometimes more. Maybe you have not yet read the link I pasted, but here we go:

Also be patient when waiting for an answer to your questions. This is a community forum and as such it may take some time before someone replies to your question. Not everyone on the forum is an expert in every area so you may need to wait for someone who knows about the area you are asking about to come online and have the time to look into your problem.

Please see the code of conduct for more details (in particular the "be patient" section).

There are no SLAs on responses to questions posted on this forum. If you require help with an SLA on responses, you should look into purchasing a subscription package that includes support with an SLA, such as those offered by Elastic.


(TheRebel) #6

Thanks a lot. It was my mistake; I was so excited about Elastic that I ran out of patience.


(TheRebel) #7

Can anyone tell me how to delete the question?


(David Pilato) #8

I can delete your post. You can just "flag it" and I'll do that.

Coming back to your question, most likely there is a network communication problem between your nodes on port 9300. Most of the time there is a firewall somewhere or network rules that don't allow the traffic on port 9300.
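
A quick way to test this is to try a plain TCP connection to port 9300 from each machine to every other machine. The following is a minimal sketch using only the Python standard library; the helper names `can_connect` and `check_nodes` are made up for this example, and the 3-second timeout is an arbitrary assumption.

```python
import socket


def can_connect(host: str, port: int = 9300, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_nodes(hosts: list[str], port: int = 9300) -> dict[str, bool]:
    """Check every host in the list and report which ones are reachable."""
    return {host: can_connect(host, port) for host in hosts}
```

Run it on the master against the data nodes' publish addresses and on each data node against the master, for example `check_nodes(["<master-ip>", "<data-node-ip>"])` with your real addresses filled in. Traffic has to work in both directions; a `False` in either direction points to a firewall or routing rule rather than to the Elasticsearch configuration.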


(TheRebel) #9

Thanks @dadoonet. I was frustrated, so I messed up. I have somehow solved the issue and Elasticsearch is running fine. Sorry for the inconvenience. We can close this topic.


(David Pilato) #10

No problem. Feel free to open new topics for your next questions.
This thread will auto-close in a few days.

It would be interesting if you could share what the error was, so other readers can benefit from it in the future. Thanks!


(TheRebel) #11

In my case, the OpenStack cloud instance could not connect to the master node because the data node was not accessible from outside its Linux server. I changed the data node's configuration file as shown below, and it is now working fine for me.

Data Node yml file

cluster.name: first_cluster
node.name: node1
node.master: false
node.data: true

path.data: /var/fpwork/elasticsearch-6.3.2/data
path.logs: /var/fpwork/elasticsearch-6.3.2/logs
discovery.zen.ping.unicast.hosts: ["10.182.197.XX","10.135.144.XX"]

bootstrap.system_call_filter: false
action.auto_create_index: ".security*,.monitoring*,.watches,.triggered_watches,.watcher-history*,.ml*"
network.host: 0.0.0.0
transport.tcp.compress: true
http.port: 9200
transport.tcp.port: 9300
discovery.zen.minimum_master_nodes: 1
network.publish_host: 0.0.0.0
network.bind_host: 0.0.0.0

(David Pilato) #12

Note that you can simplify the configuration a bit:

cluster.name: first_cluster
node.name: node1
node.master: false

path.data: /var/fpwork/elasticsearch-6.3.2/data
path.logs: /var/fpwork/elasticsearch-6.3.2/logs
discovery.zen.ping.unicast.hosts: ["10.182.197.XX","10.135.144.XX"]

bootstrap.system_call_filter: false
action.auto_create_index: ".security*,.monitoring*,.watches,.triggered_watches,.watcher-history*,.ml*"
network.host: 0.0.0.0
transport.tcp.compress: true
discovery.zen.minimum_master_nodes: 1

Also note that on a small cluster (i.e. fewer than 10 nodes), you should keep the default values and not set this at all:

node.master: false

Also, for production, you need a third master-eligible node. It can be a small one with:

node.data: false

And then set:

discovery.zen.minimum_master_nodes: 2

That will avoid split brain issues.
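
The value 2 comes from the usual quorum rule for this setting: (number of master-eligible nodes / 2) + 1, rounded down. As a small sketch of the arithmetic (the function name `minimum_master_nodes` is just chosen for this example):

```python
def minimum_master_nodes(master_eligible: int) -> int:
    """Quorum of master-eligible nodes: (N / 2) + 1, using integer division."""
    if master_eligible < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible // 2 + 1


for n in (1, 2, 3, 5):
    print(n, "master-eligible ->", minimum_master_nodes(n))
```

With three master-eligible nodes the quorum is 2, so the cluster can lose one of them and still elect a master, while two isolated halves can never both reach quorum.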


(TheRebel) #13

Thanks a lot. I will take care of this from now on.


(system) closed #14

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.