Hey!
I'm setting up an Elasticsearch 6.0.0 cluster and have run into a problem that I hope you can help me with.
The cluster has two master-eligible nodes (master-1 and master-2) running on the same server, plus a data node (data-1) on another server.
The two master nodes form the cluster and elect a master without problems, but the data node logs "failed to send join request to master". It logs the name of the elected master, so I guess it can reach the other server?
Have I missed something obvious here, or could it still be a network issue? I'd be grateful for any help or pointers in the right direction!
I've included the relevant configs and the data-1 log below:
master-1 config:
cluster.name: cluster1
node.name: master-1
network.host: ["_local_", 145.xxx.yyy.z2]
network.publish_host: 145.xxx.yyy.z2
http.port: 9200
discovery.zen.ping.unicast.hosts: ["145.xxx.yyy.z2:9300", "145.xxx.yyy.z2:9301"]
discovery.zen.minimum_master_nodes: 2
node.data: true
node.master: true
master-2 config:
cluster.name: cluster1
node.name: master-2
network.host: ["_local_", 145.xxx.yyy.z2]
network.publish_host: 145.xxx.yyy.z2
http.port: 9201
discovery.zen.ping.unicast.hosts: ["145.xxx.yyy.z2:9300", "145.xxx.yyy.z2:9301"]
discovery.zen.minimum_master_nodes: 2
node.data: true
node.master: true
data-1 config:
cluster.name: cluster1
node.name: data-1
network.host: ["_local_", 145.zzz.yyy.x6]
network.publish_host: 145.zzz.yyy.x6
http.port: 9200
discovery.zen.ping.unicast.hosts: ["145.xxx.yyy.z2:9300", "145.xxx.yyy.z2:9301"]
discovery.zen.minimum_master_nodes: 2
node.data: true
node.master: false
data-1 log:
[2017-11-16T16:15:52,304][INFO ][o.e.n.Node ] [data-1] initializing ...
[2017-11-16T16:15:52,414][INFO ][o.e.e.NodeEnvironment ] [data-1] using [1] data paths, mounts [[(F:)]], net usable_space [24.8gb], net total_space [24.9gb], types [NTFS]
[2017-11-16T16:15:52,414][INFO ][o.e.e.NodeEnvironment ] [data-1] heap size [990.7mb], compressed ordinary object pointers [true]
[2017-11-16T16:15:52,414][INFO ][o.e.n.Node ] [data-1] node name [data-1], node ID [EiurCDyGRS6IrhmaXI6Pjw]
[2017-11-16T16:15:52,414][INFO ][o.e.n.Node ] [data-1] version[6.0.0], pid[884], build[8f0685b/2017-11-10T18:41:22.859Z], OS[Windows Server 2012 R2/6.3/amd64], JVM[Oracle Corporation/Java HotSpot(TM) 64-Bit Server VM/1.8.0_144/25.144-b01]
[2017-11-16T16:15:52,414][INFO ][o.e.n.Node ] [data-1] JVM arguments [-Xms1g, -Xmx1g, -XX:+UseConcMarkSweepGC, -XX:CMSInitiatingOccupancyFraction=75, -XX:+UseCMSInitiatingOccupancyOnly, -XX:+AlwaysPreTouch, -Xss1m, -Djava.awt.headless=true, -Dfile.encoding=UTF-8, -Djna.nosys=true, -XX:-OmitStackTraceInFastThrow, -Dio.netty.noUnsafe=true, -Dio.netty.noKeySetOptimization=true, -Dio.netty.recycler.maxCapacityPerThread=0, -Dlog4j.shutdownHookEnabled=false, -Dlog4j2.disable.jmx=true, -XX:+HeapDumpOnOutOfMemoryError, -Delasticsearch, -Des.path.home=F:\elasticsearch-6.0.0, -Des.path.conf=F:\elasticsearch-6.0.0\config, exit, -Xms1024m, -Xmx1024m, -Xss1024k]
[2017-11-16T16:15:53,429][INFO ][o.e.p.PluginsService ] [data-1] loaded module [aggs-matrix-stats]
...
[2017-11-16T16:15:53,429][INFO ][o.e.p.PluginsService ] [data-1] no plugins loaded
[2017-11-16T16:15:54,961][INFO ][o.e.d.DiscoveryModule ] [data-1] using discovery type [zen]
[2017-11-16T16:15:55,695][INFO ][o.e.n.Node ] [data-1] initialized
[2017-11-16T16:15:55,695][INFO ][o.e.n.Node ] [data-1] starting ...
[2017-11-16T16:15:55,992][INFO ][o.e.t.TransportService ] [data-1] publish_address {145.zzz.yyy.x6:9300}, bound_addresses {127.0.0.1:9300}, {[::1]:9300}, {145.zzz.yyy.x6:9300}
[2017-11-16T16:15:56,007][INFO ][o.e.b.BootstrapChecks ] [data-1] bound or publishing to a non-loopback or non-link-local address, enforcing bootstrap checks
[2017-11-16T16:16:20,196][INFO ][o.e.d.z.ZenDiscovery ] [data-1] failed to send join request to master [{master-1}{JNyN2w61QOaaPNXgxPc0eQ}{S8Ugwd7kSeS-9xUSL_SiDg}{145.xxx.yyy.z2}{145.xxx.yyy.z2:9300}], reason [RemoteTransportException[[master-1][145.xxx.yyy.z2:9300][internal:discovery/zen/join]]; nested: ConnectTransportException[[data-1][145.zzz.yyy.x6:9300] connect_timeout[30s]]; nested: IOException[Connection timed out: no further information: 145.zzz.yyy.x6/145.zzz.yyy.x6:9300]; nested: IOException[Connection timed out: no further information]; ]
[2017-11-16T16:16:26,055][WARN ][o.e.n.Node ] [data-1] timed out while waiting for initial discovery state - timeout: 30s
[2017-11-16T16:16:26,118][INFO ][o.e.h.n.Netty4HttpServerTransport] [data-1] publish_address {145.zzz.yyy.x6:9200}, bound_addresses {127.0.0.1:9200}, {[::1]:9200}, {145.zzz.yyy.x6:9200}
[2017-11-16T16:16:26,118][INFO ][o.e.n.Node ] [data-1] started
[2017-11-16T16:16:44,243][INFO ][o.e.d.z.ZenDiscovery ] [data-1] failed to send join request to master [{master-1}{JNyN2w61QOaaPNXgxPc0eQ}{S8Ugwd7kSeS-9xUSL_SiDg}{145.xxx.yyy.z2}{145.xxx.yyy.z2:9300}], reason [RemoteTransportException[[master-1][145.xxx.yyy.z2:9300][internal:discovery/zen/join]]; nested: ConnectTransportException[[data-1][145.zzz.yyy.x6:9300] connect_timeout[30s]]; nested: IOException[Connection timed out: no further information: 145.zzz.yyy.x6/145.zzz.yyy.x6:9300]; nested: IOException[Connection timed out: no further information]; ]
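Reading the nested exception above, the timeout seems to be reported by master-1 while it tries to connect back to data-1 on 145.zzz.yyy.x6:9300, so the transport port may need to be reachable in both directions. Below is a minimal sketch for checking that, assuming Python 3 is available on both servers. The script name check_port.py and the function can_connect are my own hypothetical names, not part of Elasticsearch.

```python
import socket
import sys


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port opens within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


# Hypothetical usage: python check_port.py <host> <port>
if __name__ == "__main__" and len(sys.argv) == 3:
    host, port = sys.argv[1], int(sys.argv[2])
    state = "reachable" if can_connect(host, port) else "NOT reachable"
    print(f"{host}:{port} is {state}")
```

The idea would be to run it on the master server against the data node's address and port 9300, and on the data server against the master's address and ports 9300 and 9301. If only the master-to-data direction fails, a firewall rule on the data server blocking inbound 9300 would be my first suspect, though that is a guess on my part.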
Thanks!