Hello, I'm trying to configure Elasticsearch (ES) on the AWS EB service. I have two nodes, and one of them fails with the following message:
{"error":{"root_cause":[{"type":"master_not_discovered_exception","reason":"waited for [30s]"}],"type":"master_not_discovered_exception","reason":"waited for [30s]"},"status":503}
When I looked at the error logs, I saw:
[2015-11-14 09:23:39,501][WARN ][transport.netty ] [718e01a158f7] exception caught on transport layer [[id: 0x801cfa11, /10.170.122.175:44259 :> /172.17.0.5:9300]], closing connection
java.io.StreamCorruptedException: invalid internal transport message format, got (ff,f4,ff,fd)
at org.elasticsearch.transport.netty.SizeHeaderFrameDecoder.decode(SizeHeaderFrameDecoder.java:64)
at org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:425)
at org.jboss.netty.handler.codec.frame.FrameDecoder.cleanup(FrameDecoder.java:482)
.......
From the code in SizeHeaderFrameDecoder.java you can see that ES expects the first two bytes of a transport message to be "E" and "S". However, tcpdump shows me something different:
10.151.143.59.39581 > 10.170.122.175.9300: Flags [P.], cksum 0x1ff5 (incorrect -> 0xefbc), seq 1:164, ack 1, win 229, options [nop,nop,TS val 1296347 ecr 1292999], length 163
0x0000: 4500 00d7 d46e 4000 3f06 4787 0a97 8f3b E....n@.?.G....;
0x0010: 0aaa 7aaf 9a9d 2454 4f01 13b7 7d46 d95c ..z...$TO...}F.\
0x0020: 8018 00e5 1ff5 0000 0101 080a 0013 c7db ................
0x0030: 0013 bac7 4553 0000 009d 0000 0000 0000 ....ES..........
0x0040: 0183 0000 1e84 e31e 696e 7465 726e 616c ........internal
0x0050: 3a64 6973 636f 7665 7279 2f7a 656e 2f75 :discovery/zen/u
0x0060: 6e69 6361 7374 0000 0000 4100 0000 00b2 nicast....A.....
0x0070: d05e 000a 6573 2d73 7461 6769 6e67 0c30 .^..es-staging.0
0x0080: 6534 6535 6461 6461 3463 6216 5436 4f31 e4e5dada4cb.T6O1
0x0090: 6368 5932 5233 6938 3452 665f 6178 4236 chY2R3i84Rf_axB6
0x00a0: 4f51 0d32 3535 2e32 3535 2e32 3535 2e30 OQ.255.255.255.0
0x00b0: 0d32 3535 2e32 3535 2e32 3535 2e30 0001 .255.255.255.0..
0x00c0: 04ff ffff 0000 0024 5400 e389 7a00 0000 .......$T...z...
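At first glance the first two bytes are 0x4500 ("E.") and not "ES". However, the dump starts at the IPv4 header, not at the ES message: the bytes 4553 ("ES") do appear at offset 0x34, where the TCP payload begins. This packet is also from 10.151.143.59, while the exception was logged for a connection from 10.170.122.175, so it may not be the message that triggered the error.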
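One way to check whether a captured packet really carries the "ES" marker is to skip the IPv4 and TCP headers before looking at the bytes. Below is a minimal sketch of that check in Python. It assumes the hex dump starts at the IPv4 header with no link-layer header in front (which the leading 0x45 suggests), and the function names `tcp_payload` and `has_es_marker` are made up for illustration; they are not part of Elasticsearch or tcpdump.

```python
import struct


def tcp_payload(packet: bytes) -> bytes:
    """Return the TCP payload of a raw IPv4 packet (no link-layer header)."""
    version = packet[0] >> 4
    ihl = packet[0] & 0x0F              # IPv4 header length in 32-bit words
    if version != 4:
        raise ValueError("not an IPv4 packet")
    if packet[9] != 6:                  # protocol field: 6 means TCP
        raise ValueError("not a TCP segment")
    ip_header_len = ihl * 4
    total_len = struct.unpack("!H", packet[2:4])[0]
    # The TCP data offset is the high nibble of byte 12 of the TCP header.
    tcp_header_len = (packet[ip_header_len + 12] >> 4) * 4
    return packet[ip_header_len + tcp_header_len:total_len]


def has_es_marker(payload: bytes) -> bool:
    """True if the payload starts with the two bytes 'E', 'S'."""
    return payload[:2] == b"ES"


# First 0x40 bytes of the dump above (offsets 0x0000 to 0x003f).
hex_dump = (
    "4500 00d7 d46e 4000 3f06 4787 0a97 8f3b "
    "0aaa 7aaf 9a9d 2454 4f01 13b7 7d46 d95c "
    "8018 00e5 1ff5 0000 0101 080a 0013 c7db "
    "0013 bac7 4553 0000 009d 0000 0000 0000"
)
packet = bytes.fromhex(hex_dump)

print(has_es_marker(tcp_payload(packet)))          # True
# The four bytes quoted in the StreamCorruptedException message:
print(has_es_marker(bytes.fromhex("fff4fffd")))    # False
```

For this packet the IPv4 header is 20 bytes and the TCP header is 32 bytes, so the payload starts at offset 52 (0x34), which is where "ES" shows up in the dump.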
Both nodes are started from Docker containers and have identical environments; the cluster name is es-staging.
How do I fix this problem?
Thanks