Hi,
we're running a Cluster with 3 dedicated Master on AWS EC2 Instances and changed the instance type. Every node got a clean and graceful shutdown and everything worked fine until we changed the elected master node.
=> the cluster was not accessible for a few minutes and at this time it seemed to be failing - the data-nodes were accessible the whole time but the master-nodes weren't.
After some loops the master-nodes regenerated themselves, but it wasn't possible for us to find the reason for the failure. We came the conclusion that the time interval between the changes was too narrow but we are not sure.
Here is an excerpt of the logs:
[2020-05-11T16:37:49,421][WARN ][o.e.t.TcpTransport ] [es-master-41-XXX] exception caught on transport layer [Netty4TcpChannel{localAddress=0.0.0.0/0.0.0.0:9300, remoteAddress=/X.X.X.X:39326}], closing connection
io.netty.handler.codec.DecoderException: javax.net.ssl.SSLHandshakeException: Insufficient buffer remaining for AEAD cipher fragment (2). Needs to be more than tag size (16)
at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:472) ~[netty-codec-4.1.36.Final.jar:4.1.36.Final]
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:278) ~[netty-codec-4.1.36.Final.jar:4.1.36.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374) [netty-transport-4.1.36.Final.jar:4.1.36.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360) [netty-transport-4.1.36.Final.jar:4.1.36.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:352) [netty-transport-4.1.36.Final.jar:4.1.36.Final]
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1408) [netty-transport-4.1.36.Final.jar:4.1.36.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374) [netty-transport-4.1.36.Final.jar:4.1.36.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360) [netty-transport-4.1.36.Final.jar:4.1.36.Final]
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:930) [netty-transport-4.1.36.Final.jar:4.1.36.Final]
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:163) [netty-transport-4.1.36.Final.jar:4.1.36.Final]
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:682) [netty-transport-4.1.36.Final.jar:4.1.36.Final]
at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:582) [netty-transport-4.1.36.Final.jar:4.1.36.Final]
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:536) [netty-transport-4.1.36.Final.jar:4.1.36.Final]
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:496) [netty-transport-4.1.36.Final.jar:4.1.36.Final]
at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:906) [netty-common-4.1.36.Final.jar:4.1.36.Final]
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [netty-common-4.1.36.Final.jar:4.1.36.Final]
at java.lang.Thread.run(Thread.java:835) [?:?]
Caused by: javax.net.ssl.SSLHandshakeException: Insufficient buffer remaining for AEAD cipher fragment (2). Needs to be more than tag size (16)
at sun.security.ssl.Alert.createSSLException(Alert.java:131) ~[?:?]
at sun.security.ssl.TransportContext.fatal(TransportContext.java:320) ~[?:?]
at sun.security.ssl.TransportContext.fatal(TransportContext.java:263) ~[?:?]
at sun.security.ssl.TransportContext.fatal(TransportContext.java:258) ~[?:?]
at sun.security.ssl.SSLTransport.decode(SSLTransport.java:129) ~[?:?]
at sun.security.ssl.SSLEngineImpl.decode(SSLEngineImpl.java:681) ~[?:?]
at sun.security.ssl.SSLEngineImpl.readRecord(SSLEngineImpl.java:636) ~[?:?]
at sun.security.ssl.SSLEngineImpl.unwrap(SSLEngineImpl.java:454) ~[?:?]
at sun.security.ssl.SSLEngineImpl.unwrap(SSLEngineImpl.java:433) ~[?:?]
at javax.net.ssl.SSLEngine.unwrap(SSLEngine.java:634) ~[?:?]
at io.netty.handler.ssl.SslHandler$SslEngineType$3.unwrap(SslHandler.java:295) ~[netty-handler-4.1.36.Final.jar:4.1.36.Final]
at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1332) ~[netty-handler-4.1.36.Final.jar:4.1.36.Final]
at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1227) ~[netty-handler-4.1.36.Final.jar:4.1.36.Final]
at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1274) ~[netty-handler-4.1.36.Final.jar:4.1.36.Final]
at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:502) ~[netty-codec-4.1.36.Final.jar:4.1.36.Final]
at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:441) ~[netty-codec-4.1.36.Final.jar:4.1.36.Final]
... 16 more
[2020-05-11T16:38:17,530][ERROR][o.e.x.s.a.e.NativeUsersStore] [es-master-41-XXX] security index is unavailable. short circuiting retrieval of user [...]
Do you got any suggestion why?
Best regards