Elasticsearch Cluster formation on AWS - Elasticsearch 5.2

Hi,

I recently upgraded Elasticsearch from 1.6 to 5.2, and my ES instances stopped forming a cluster in our AWS environment.

I set the following parameters in the configuration file:
cluster.name: elasticsearch_ids
discovery.zen.hosts_provider: ec2
cloud.aws.region: ap-southeast-2
discovery.zen.ping_timeout: 30s

When the instances started, they logged the following:
[2017-02-27T00:02:44,437][INFO ][o.e.n.Node ] initialized
[2017-02-27T00:02:44,438][INFO ][o.e.n.Node ] [Eq-9SWO] starting ...
[2017-02-27T00:02:44,751][WARN ][i.n.u.i.MacAddressUtil ] Failed to find a usable hardware address from the network interfaces; using random bytes: a0:4d:7f:b6:2d:0d:db:70
[2017-02-27T00:02:44,832][INFO ][o.e.t.TransportService ] [Eq-9SWO] publish_address {127.0.0.1:9300}, bound_addresses {127.0.0.1:9300}
[2017-02-27T00:02:44,845][WARN ][o.e.b.BootstrapChecks ] [Eq-9SWO] max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]
[2017-02-27T00:02:54,501][INFO ][o.e.x.m.e.Exporters ] [Eq-9SWO] skipping exporter [default_local] as it is not ready yet
[2017-02-27T00:03:04,504][INFO ][o.e.x.m.e.Exporters ] [Eq-9SWO] skipping exporter [default_local] as it is not ready yet
[2017-02-27T00:03:14,508][INFO ][o.e.x.m.e.Exporters ] [Eq-9SWO] skipping exporter [default_local] as it is not ready yet
[2017-02-27T00:03:14,878][WARN ][o.e.n.Node ] [Eq-9SWO] timed out while waiting for initial discovery state - timeout: 30s
[2017-02-27T00:03:14,886][INFO ][o.e.h.HttpServer ] [Eq-9SWO] publish_address {172.31.14.153:9200}, bound_addresses {[::]:9200}
[2017-02-27T00:03:14,886][INFO ][o.e.n.Node ] [Eq-9SWO] started
[2017-02-27T00:03:15,530][INFO ][o.e.c.s.ClusterService ] [Eq-9SWO] new_master {Eq-9SWO}{Eq-9SWOaTW6O7qj81S6V4w}{Nw1I_hvMS4SfSWwfS2H-ew}{127.0.0.1}{127.0.0.1:9300}, reason: zen-disco-elected-as-master ([0] nodes joined)
[2017-02-27T00:03:15,618][INFO ][o.e.g.GatewayService ] [Eq-9SWO] recovered [0] indices into cluster_state

The security group and IAM role shouldn't be the problem here, because both worked with Elasticsearch 1.6. Can anyone please help?

Thank you.

5.x only binds to loopback by default, see https://www.elastic.co/guide/en/elasticsearch/reference/5.2/bootstrap-checks.html

Thanks warkolm. Indeed, I set http.host to 0.0.0.0 and transport.host to 127.0.0.1. Could you please also advise what values I should use for these two parameters? Do I have to set them to each node's own private IP address?
Thanks.

Your cluster will never form with that.

Okay, I see. So how should I set the value for transport.host? I tried removing it from the settings, but then I got some strange errors like the one below:

[2017-02-27T02:27:29,445][WARN ][o.e.t.n.Netty4Transport ] [ZoX1MdS] exception caught on transport layer [[id: 0x9547699c, L:/172.31.13.25:43242 - R:/172.31.13.77:9300]], closing connection
java.io.EOFException: tried to read: 100 bytes but only 76 remaining
at org.elasticsearch.transport.netty4.ByteBufStreamInput.ensureCanReadBytes(ByteBufStreamInput.java:75) ~[?:?]
at org.elasticsearch.common.io.stream.FilterStreamInput.ensureCanReadBytes(FilterStreamInput.java:80) ~[elasticsearch-5.2.1.jar:5.2.1]
at org.elasticsearch.common.io.stream.StreamInput.readArraySize(StreamInput.java:925) ~[elasticsearch-5.2.1.jar:5.2.1]
at org.elasticsearch.common.io.stream.StreamInput.readString(StreamInput.java:342) ~[elasticsearch-5.2.1.jar:5.2.1]
at org.elasticsearch.common.io.stream.StreamInput.readList(StreamInput.java:885) ~[elasticsearch-5.2.1.jar:5.2.1]
at org.elasticsearch.common.io.stream.StreamInput.readMapOfLists(StreamInput.java:479) ~[elasticsearch-5.2.1.jar:5.2.1]
at org.elasticsearch.common.util.concurrent.ThreadContext$ThreadContextStruct.<init>(ThreadContext.java:335) ~[elasticsearch-5.2.1.jar:5.2.1]
at org.elasticsearch.common.util.concurrent.ThreadContext$ThreadContextStruct.<init>(ThreadContext.java:322) ~[elasticsearch-5.2.1.jar:5.2.1]
at org.elasticsearch.common.util.concurrent.ThreadContext.readHeaders(ThreadContext.java:184) ~[elasticsearch-5.2.1.jar:5.2.1]
at org.elasticsearch.transport.TcpTransport.messageReceived(TcpTransport.java:1327) ~[elasticsearch-5.2.1.jar:5.2.1]
at org.elasticsearch.transport.netty4.Netty4MessageChannelHandler.channelRead(Netty4MessageChannelHandler.java:74) ~[transport-netty4-5.2.1.jar:5.2.1]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:363) [netty-transport-4.1.7.Final.jar:4.1.7.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:349) [netty-transport-4.1.7.Final.jar:4.1.7.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:341) [netty-transport-4.1.7.Final.jar:4.1.7.Final]
at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:293) [netty-codec-4.1.7.Final.jar:4.1.7.Final]
at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:280) [netty-codec-4.1.7.Final.jar:4.1.7.Final]
at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:396) [netty-codec-4.1.7.Final.jar:4.1.7.Final]
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:248) [netty-codec-4.1.7.Final.jar:4.1.7.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:363) [netty-transport-4.1.7.Final.jar:4.1.7.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:349) [netty-transport-4.1.7.Final.jar:4.1.7.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:341) [netty-transport-4.1.7.Final.jar:4.1.7.Final]
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1334) [netty-transport-4.1.7.Final.jar:4.1.7.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:349) [netty-transport-4.1.7.Final.jar:4.1.7.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:341) [netty-transport-4.1.7.Final.jar:4.1.7.Final]
at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:293) [netty-codec-4.1.7.Final.jar:4.1.7.Final]
at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:280) [netty-codec-4.1.7.Final.jar:4.1.7.Final]
at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:396) [netty-codec-4.1.7.Final.jar:4.1.7.Final]
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:248) [netty-codec-4.1.7.Final.jar:4.1.7.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:363) [netty-transport-4.1.7.Final.jar:4.1.7.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:349) [netty-transport-4.1.7.Final.jar:4.1.7.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:341) [netty-transport-4.1.7.Final.jar:4.1.7.Final]
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1334) [netty-transport-4.1.7.Final.jar:4.1.7.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:363) [netty-transport-4.1.7.Final.jar:4.1.7.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:349) [netty-transport-4.1.7.Final.jar:4.1.7.Final]
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:926) [netty-transport-4.1.7.Final.jar:4.1.7.Final]
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:129) [netty-transport-4.1.7.Final.jar:4.1.7.Final]
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:642) [netty-transport-4.1.7.Final.jar:4.1.7.Final]
at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:527) [netty-transport-4.1.7.Final.jar:4.1.7.Final]
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:481) [netty-transport-4.1.7.Final.jar:4.1.7.Final]
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:441) [netty-transport-4.1.7.Final.jar:4.1.7.Final]
at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858) [netty-common-4.1.7.Final.jar:4.1.7.Final]
at java.lang.Thread.run(Thread.java:745) [?:1.8.0_92-internal]

If you are running on AWS, you probably want to use the EC2 discovery plugin; see the docs: https://www.elastic.co/guide/en/elasticsearch/plugins/current/discovery-ec2.html
I'd say that this is the preferred method.
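
For what it's worth, a minimal elasticsearch.yml sketch of that setup might look like the following. It assumes the discovery-ec2 plugin is installed on every 5.2 node and that the nodes should talk to each other over their private VPC addresses. The first three settings are taken from the original post; the network.host line is my assumption about what is missing, and transport.host: 127.0.0.1 is assumed to be removed.

```yaml
# elasticsearch.yml (sketch, Elasticsearch 5.2 with the discovery-ec2 plugin)

# From the original post
cluster.name: elasticsearch_ids
discovery.zen.hosts_provider: ec2
cloud.aws.region: ap-southeast-2

# Assumption: bind and publish both HTTP and transport on the instance's
# private IPv4 address instead of loopback. The _ec2:privateIpv4_ value
# is resolved by the discovery-ec2 plugin.
network.host: _ec2:privateIpv4_
```

Note that once transport is bound to a non-loopback address, 5.x enforces the bootstrap checks, so the vm.max_map_count warning in the log above would become a startup error. Raising it first (for example with `sysctl -w vm.max_map_count=262144` as root) should avoid that.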

Alternatively, you can bind transport.host (that's the one used for cluster communication) to the publicly available interface and provide a seed list of the other hosts. If you do, make sure to properly protect your cluster via security groups.
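
As a rough sketch of the seed-list approach, assuming three master-eligible nodes on private addresses: the IPs below are simply the ones that appear in the logs earlier in this thread and are used for illustration only, and the minimum_master_nodes value follows from the assumed node count. With a static seed list, discovery.zen.hosts_provider: ec2 would be left out.

```yaml
# elasticsearch.yml (sketch, Elasticsearch 5.2 with a static seed list)
cluster.name: elasticsearch_ids

# Bind transport (cluster communication) to a site-local address
# such as 172.31.x.x instead of loopback.
transport.host: _site_

# HTTP binding as described earlier in the thread.
http.host: 0.0.0.0

# Illustrative seed list: replace with the real node addresses.
# Port 9300 is used when none is given.
discovery.zen.ping.unicast.hosts:
  - 172.31.14.153
  - 172.31.13.25
  - 172.31.13.77

# Assumption: 3 master-eligible nodes, so a quorum is 2.
discovery.zen.minimum_master_nodes: 2
```

The security group then needs to allow port 9300 between the nodes only, and port 9200 only from the clients that need it.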