Hi All,
Curly issue experienced after upgrading a cluster to Elasticsearch 5.6.4. IP addresses and hostnames have been sanitized.
2 out of 3 data nodes start OK and communicate with the three dedicated master nodes. If I disable SSL/TLS on the node where the error below occurs, that node communicates happily with the master nodes (Zen unicast discovery). However, when I enable SSL/TLS, the following error occurs and the data node doesn't join the cluster.
The same CA, private key and certificate are being used on all three nodes; 2 work fine and 1 doesn't. The iptables rules are consistent across all nodes in the cluster and all other servers are communicating fine. What I'm trying to ascertain is why this particular node is throwing the following error; it's cryptic. It's as if Elasticsearch isn't picking up the source IP of the system (guessing based on the L:0.0.0.0 reference in the log), though the target seems OK. I haven't found anything useful on the master node of interest, but I may be overlooking something.
[2017-11-22T18:55:57,602][INFO ][o.e.n.Node ] [emd01] starting ...
[2017-11-22T18:55:57,882][INFO ][o.e.t.TransportService ] [emd01] publish_address {172.17.14.155:9300}, bound_addresses {172.17.14.155:9300}
[2017-11-22T18:55:57,894][INFO ][o.e.b.BootstrapChecks ] [emd01] bound or publishing to a non-loopback or non-link-local address, enforcing bootstrap checks
[2017-11-22T18:55:58,261][WARN ][o.e.x.s.t.n.SecurityNetty4Transport] [emd01] write and flush on the network layer failed (channel: [id: 0xb33386fb, L:0.0.0.0/0.0.0.0:35440 ! R:emn01.testdomain.com/172.17.14.161:9300])
java.nio.channels.ClosedChannelException: null
at io.netty.handler.ssl.SslHandler.channelInactive(...)(Unknown Source) ~[?:?]
[2017-11-22T18:55:58,261][WARN ][o.e.x.s.t.n.SecurityNetty4Transport] [emd01] write and flush on the network layer failed (channel: [id: 0x0c2c884f, L:0.0.0.0/0.0.0.0:38290 ! R:emn01.testdomain.com/172.17.14.162:9300])
java.nio.channels.ClosedChannelException: null
at io.netty.handler.ssl.SslHandler.channelInactive(...)(Unknown Source) ~[?:?]
[2017-11-22T18:55:58,261][WARN ][o.e.x.s.t.n.SecurityNetty4Transport] [emd01] write and flush on the network layer failed (channel: [id: 0x1779c76d, L:0.0.0.0/0.0.0.0:47642 ! R:emn01.testdomain.com/172.17.14.163:9300])
java.nio.channels.ClosedChannelException: null
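To make the channel details in those warnings easier to compare across nodes, here is a rough Python sketch that pulls the local (L:) and remote (R:) endpoints out of a log line. The function name `parse_channel` and the regular expression are my own. The reading of "!" as "channel no longer active" and "-" as "active" is my understanding of how Netty prints channels, so treat it as an assumption rather than a confirmed fact.

```python
import re

# Matches the "L:<local> ! R:<remote>" part of a Netty channel description.
# Assumption: "-" between the two sides means active, "!" means inactive.
CHANNEL_RE = re.compile(
    r"L:(?P<local>\S+)\s+(?P<state>[-!])\s+R:(?P<remote>[^\]\s]+)"
)


def parse_channel(line):
    """Return the local/remote endpoints found in a log line, or None."""
    match = CHANNEL_RE.search(line)
    if match is None:
        return None
    return {
        "local": match.group("local"),
        "active": match.group("state") == "-",
        "remote": match.group("remote"),
    }
```

If that reading of "!" is right, the 0.0.0.0 local address may simply reflect a channel that was already closed when the warning was logged, rather than a source IP problem, but I haven't confirmed that.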
How would I go about troubleshooting this further? The problem appears to be SSL/TLS related (it only occurs when that functionality is enabled), but all nodes are configured identically and 2 of them work. Any assistance would be grand.
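One check I can think of is a standalone TLS handshake from the failing data node to a master's transport port, outside of Elasticsearch, to see whether the handshake itself is rejected. Below is a minimal sketch using only the Python standard library. The function name `probe_tls` and its parameters are hypothetical, the certificate files would need to be PEM copies of whatever the node is configured with, and it only exercises Python's TLS stack, not Elasticsearch's own SSL configuration.

```python
import socket
import ssl


def probe_tls(host, port, ca_file=None, cert_file=None, key_file=None, timeout=5.0):
    """Try one TLS handshake and return (ok, detail) instead of raising."""
    # Verify the server against ca_file, or the system trust store if None.
    context = ssl.create_default_context(cafile=ca_file)
    # Assumption: the node certificate may not list the name being dialled,
    # so hostname matching is skipped while chain verification stays on.
    context.check_hostname = False
    if cert_file:
        # Present a client certificate in case the remote side requires one.
        context.load_cert_chain(certfile=cert_file, keyfile=key_file)
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            with context.wrap_socket(raw) as tls:
                return True, "{} / {}".format(tls.version(), tls.cipher()[0])
    except OSError as exc:
        # ssl.SSLError and socket timeouts are both subclasses of OSError.
        return False, "{}: {}".format(type(exc).__name__, exc)
```

Running something like probe_tls("emn01.testdomain.com", 9300, ca_file=..., cert_file=..., key_file=...) on both a working data node and the failing one should at least show whether the two behave differently at the handshake stage.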
Please note, all data nodes in the cluster were updated to the latest CentOS 7 as of 22/11/2017 NZT, and all are running the same versions of Elasticsearch and Java 8.
Thanks,
Andrew