Nodes don't see each other on Google Container Engine

Hello,

I'm trying to run a 3-node Elasticsearch cluster on Google Container Engine (GKE). I'm using Docker to run the images directly, not Kubernetes. The image is based on the one provided by Google Cloud Launcher. Each node works correctly on its own, but the nodes are unable to see each other to form a cluster.

The command I run is docker run -p 9200-9400:9200-9400 image_name. The Google firewall is configured to allow traffic on ports 9100-9400.

The current config file is:

node.name: ${HOSTNAME}
network.host: _gce:hostname_
discovery.zen.minimum_master_nodes: 2
discovery.zen.ping.unicast.hosts: ["10.240.0.10:9300", "10.240.0.11:9300", "10.240.0.12:9300"]
network.bind_host: localhost
http.host: 0.0.0.0
readonlyrest:
  enable: false

I also tried using discovery with the GCE plugin, and setting transport.host to the host IP instead of localhost, but that results in the following error:

org.elasticsearch.bootstrap.StartupException: BindTransportException[Failed to bind to [9300-9400]]; nested: BindException[Cannot assign requested address];

I tried all combinations of GCE-provided hostnames and external and internal IPs for discovery, as well as leaving the transport and host parameters unset, but each of those resulted in a crash. I guess the problem lies somewhere with the transport.host parameter, but I've exhausted all options and tried every configuration I could find online.

The output of running with the current elasticsearch.yml file is:

[2017-08-18T12:37:20,099][INFO ][o.e.n.Node               ] [daa04be6c3b3] initializing ...
[2017-08-18T12:37:20,314][INFO ][o.e.e.NodeEnvironment    ] [daa04be6c3b3] using [1] data paths, mounts [[/usr/share/elasticsearch/data (/dev/sda1)]], net usable_space [23.2gb], net total_space [27.3gb], spins? [possibly], types [ext4]
[2017-08-18T12:37:20,316][INFO ][o.e.e.NodeEnvironment    ] [daa04be6c3b3] heap size [6.3gb], compressed ordinary object pointers [true]
[2017-08-18T12:37:20,318][INFO ][o.e.n.Node               ] [daa04be6c3b3] node name [daa04be6c3b3], node ID [ZrndiTk0SEyi6TSrTwI_pA]
[2017-08-18T12:37:20,319][INFO ][o.e.n.Node               ] [daa04be6c3b3] version[5.4.3], pid[1], build[eed30a8/2017-06-22T00:34:03.743Z], OS[Linux/4.4.52+/amd64], JVM[Oracle Corporation/OpenJDK 64-Bit Server VM/1.8.0_131/25.131-b11]
[2017-08-18T12:37:20,320][INFO ][o.e.n.Node               ] [daa04be6c3b3] JVM arguments [-Xms6500m, -Xmx6500m, -XX:+UseConcMarkSweepGC, -XX:CMSInitiatingOccupancyFraction=75, -XX:+UseCMSInitiatingOccupancyOnly, -XX:+DisableExplicitGC, -XX:+AlwaysPreTouch, -Xss1m, -Djava.awt.headless=true, -Dfile.encoding=UTF-8, -Djna.nosys=true, -Djdk.io.permissionsUseCanonicalPath=true, -Dio.netty.noUnsafe=true, -Dio.netty.noKeySetOptimization=true, -Dio.netty.recycler.maxCapacityPerThread=0, -Dlog4j.shutdownHookEnabled=false, -Dlog4j2.disable.jmx=true, -Dlog4j.skipJansi=true, -XX:+HeapDumpOnOutOfMemoryError, -Des.path.home=/usr/share/elasticsearch]
[2017-08-18T12:37:22,484][INFO ][o.e.p.PluginsService     ] [daa04be6c3b3] loaded module [aggs-matrix-stats]
[2017-08-18T12:37:22,484][INFO ][o.e.p.PluginsService     ] [daa04be6c3b3] loaded module [ingest-common]
[2017-08-18T12:37:22,485][INFO ][o.e.p.PluginsService     ] [daa04be6c3b3] loaded module [lang-expression]
[2017-08-18T12:37:22,485][INFO ][o.e.p.PluginsService     ] [daa04be6c3b3] loaded module [lang-groovy]
[2017-08-18T12:37:22,485][INFO ][o.e.p.PluginsService     ] [daa04be6c3b3] loaded module [lang-mustache]
[2017-08-18T12:37:22,485][INFO ][o.e.p.PluginsService     ] [daa04be6c3b3] loaded module [lang-painless]
[2017-08-18T12:37:22,485][INFO ][o.e.p.PluginsService     ] [daa04be6c3b3] loaded module [percolator]
[2017-08-18T12:37:22,485][INFO ][o.e.p.PluginsService     ] [daa04be6c3b3] loaded module [reindex]
[2017-08-18T12:37:22,486][INFO ][o.e.p.PluginsService     ] [daa04be6c3b3] loaded module [transport-netty3]
[2017-08-18T12:37:22,486][INFO ][o.e.p.PluginsService     ] [daa04be6c3b3] loaded module [transport-netty4]
[2017-08-18T12:37:22,487][INFO ][o.e.p.PluginsService     ] [daa04be6c3b3] loaded plugin [discovery-gce]
[2017-08-18T12:37:22,487][INFO ][o.e.p.PluginsService     ] [daa04be6c3b3] loaded plugin [readonlyrest]
[2017-08-18T12:37:25,979][INFO ][o.e.d.DiscoveryModule    ] [daa04be6c3b3] using discovery type [zen]
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
[2017-08-18T12:37:27,051][INFO ][o.e.p.r.e.IndexLevelActionFilter] Configuration reloaded - ReadonlyREST disabled
[2017-08-18T12:37:27,071][INFO ][o.e.p.r.e.IndexLevelActionFilter] Readonly REST plugin was loaded...
[2017-08-18T12:37:27,449][INFO ][o.e.n.Node               ] [daa04be6c3b3] initialized
[2017-08-18T12:37:27,450][INFO ][o.e.n.Node               ] [daa04be6c3b3] starting ...
[2017-08-18T12:37:27,937][INFO ][o.e.t.TransportService   ] [daa04be6c3b3] publish_address {10.240.0.12:9300}, bound_addresses {127.0.0.1:9300}, {[::1]:9300}
[2017-08-18T12:37:27,948][INFO ][o.e.b.BootstrapChecks    ] [daa04be6c3b3] bound or publishing to a non-loopback or non-link-local address, enforcing bootstrap checks
[2017-08-18T12:37:31,007][WARN ][o.e.d.z.ZenDiscovery     ] [daa04be6c3b3] not enough master nodes discovered during pinging (found [[Candidate{node={daa04be6c3b3}{ZrndiTk0SEyi6TSrTwI_pA}{WXcRmBmxSIOSkGl_xLLNcQ}{gke-es5-cluster-default-pool-9896038c-bp3v.c.project.internal}{10.240.0.12:9300}, clusterStateVersion=-1}]], but needed [2]), pinging again

The "not enough master nodes" message then repeats indefinitely.

A curl from one machine to another returns the standard "You Know, for Search" response on port 9200, and "curl: (56) Recv failure: Connection reset by peer" on port 9300.
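For anyone who wants to repeat the reachability check across all three nodes, here is a minimal sketch in Python. It assumes the internal IPs from the unicast host list above; the helper names port_open and check_nodes are hypothetical, not part of any Elasticsearch tooling. Note that a successful connect only shows that something accepted the TCP handshake on that port (which may be Docker's port forwarding); it does not prove Elasticsearch is listening behind it, so a "connection reset" like the one above can still follow.

```python
import socket


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def check_nodes(hosts, ports=(9200, 9300), timeout=2.0):
    """Map each (host, port) pair to whether a TCP connect succeeded."""
    return {(h, p): port_open(h, p, timeout) for h in hosts for p in ports}


# Example usage (IPs assumed from the config in this post):
# hosts = ["10.240.0.10", "10.240.0.11", "10.240.0.12"]
# for (host, port), ok in check_nodes(hosts).items():
#     print(host, port, "reachable" if ok else "unreachable")
```

Running this from each machine in turn shows whether the HTTP port (9200) and the transport port (9300) accept connections from the other nodes.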

Any help will be greatly appreciated.
