Which version of ECK are you running?
Can you provide the entire Elasticsearch resource manifest? Which version of Elasticsearch are you running?
Can you provide the full logs from the master nodes?
You should probably not set MINIMUM_MASTER_NODES yourself, ECK already handles it "the right way" (which covers some corner cases).
W0903 10:56:45.000766 1 [elasticsearch-es-masters-zone-a-1] failed to connect to node {elasticsearch-es-data-all-zone-a-21}{yWnQkfY7TkqAUIGTZb2LdQ}{KzuCWQvPRK2HGcUYQVmgiw}{10.4.191.2}{10.4.191.2:9300}{ml.machine_memory=63328395264, ml.max_open_jobs=20, xpack.installed=true, ml.enabled=true, group=all} (tried [1] times) org.elasticsearch.transport.ConnectTransportException: [elasticsearch-es-data-all-zone-a-21][10.4.191.2:9300] general node connection failure
at org.elasticsearch.transport.ConnectionManager.connectToNode(ConnectionManager.java:127) ~[elasticsearch-6.7.1.jar:6.7.1]
at org.elasticsearch.transport.TransportService.connectToNode(TransportService.java:342) ~[elasticsearch-6.7.1.jar:6.7.1]
at org.elasticsearch.transport.TransportService.connectToNode(TransportService.java:329) ~[elasticsearch-6.7.1.jar:6.7.1]
at org.elasticsearch.cluster.NodeConnectionsService.validateAndConnectIfNeeded(NodeConnectionsService.java:154) [elasticsearch-6.7.1.jar:6.7.1]
at org.elasticsearch.cluster.NodeConnectionsService$1.doRun(NodeConnectionsService.java:107) [elasticsearch-6.7.1.jar:6.7.1]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:751) [elasticsearch-6.7.1.jar:6.7.1]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-6.7.1.jar:6.7.1]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
at java.lang.Thread.run(Thread.java:835) [?:?]
Caused by: java.lang.IllegalStateException: handshake failed with {elasticsearch-es-data-all-zone-a-21}{yWnQkfY7TkqAUIGTZb2LdQ}{KzuCWQvPRK2HGcUYQVmgiw}{10.4.191.2}{10.4.191.2:9300}{ml.machine_memory=63328395264, ml.max_open_jobs=20, xpack.installed=true, ml.enabled=true, group=all}
at org.elasticsearch.transport.TransportService.handshake(TransportService.java:417) ~[elasticsearch-6.7.1.jar:6.7.1]
at org.elasticsearch.transport.TransportService.lambda$connectionValidator$4(TransportService.java:348) ~[elasticsearch-6.7.1.jar:6.7.1]
at org.elasticsearch.transport.ConnectionManager.connectToNode(ConnectionManager.java:105) ~[elasticsearch-6.7.1.jar:6.7.1]
... 9 more
Caused by: org.elasticsearch.transport.RemoteTransportException: [elasticsearch-es-data-all-zone-a-21][10.4.191.2:9300][internal:transport/handshake]
Caused by: org.elasticsearch.ElasticsearchSecurityException: missing authentication token for action [internal:transport/handshake]
at org.elasticsearch.xpack.core.security.support.Exceptions.authenticationError(Exceptions.java:18) ~[?:?]
at org.elasticsearch.xpack.core.security.authc.DefaultAuthenticationFailureHandler.createAuthenticationError(DefaultAuthenticationFailureHandler.java:163) ~[?:?]
at org.elasticsearch.xpack.core.security.authc.DefaultAuthenticationFailureHandler.missingToken(DefaultAuthenticationFailureHandler.java:118) ~[?:?]
at org.elasticsearch.xpack.security.authc.AuthenticationService$AuditableTransportRequest.anonymousAccessDenied(AuthenticationService.java:650) ~[?:?]
at org.elasticsearch.xpack.security.authc.AuthenticationService$Authenticator.lambda$handleNullToken$19(AuthenticationService.java:466) ~[?:?]
at org.elasticsearch.xpack.security.authc.AuthenticationService$Authenticator.handleNullToken(AuthenticationService.java:471) ~[?:?]
at org.elasticsearch.xpack.security.authc.AuthenticationService$Authenticator.consumeToken(AuthenticationService.java:355) ~[?:?]
at org.elasticsearch.xpack.security.authc.AuthenticationService$Authenticator.lambda$extractToken$9(AuthenticationService.java:326) ~[?:?]
at org.elasticsearch.xpack.security.authc.AuthenticationService$Authenticator.extractToken(AuthenticationService.java:344) ~[?:?]
at org.elasticsearch.xpack.security.authc.AuthenticationService$Authenticator.lambda$checkForApiKey$3(AuthenticationService.java:287) ~[?:?]
at org.elasticsearch.action.ActionListener$1.onResponse(ActionListener.java:61) ~[elasticsearch-6.7.1.jar:6.7.1]
at org.elasticsearch.xpack.security.authc.ApiKeyService.authenticateWithApiKeyIfPresent(ApiKeyService.java:345) ~[?:?]
at org.elasticsearch.xpack.security.authc.AuthenticationService$Authenticator.checkForApiKey(AuthenticationService.java:268) ~[?:?]
at org.elasticsearch.xpack.security.authc.AuthenticationService$Authenticator.lambda$authenticateAsync$0(AuthenticationService.java:251) ~[?:?]
at org.elasticsearch.action.ActionListener$1.onResponse(ActionListener.java:61) ~[elasticsearch-6.7.1.jar:6.7.1]
at org.elasticsearch.xpack.security.authc.TokenService.getAndValidateToken(TokenService.java:326) ~[?:?]
at org.elasticsearch.xpack.security.authc.AuthenticationService$Authenticator.lambda$authenticateAsync$2(AuthenticationService.java:247) ~[?:?]
at org.elasticsearch.xpack.security.authc.AuthenticationService$Authenticator.lambda$lookForExistingAuthentication$6(AuthenticationService.java:305) ~[?:?]
at org.elasticsearch.xpack.security.authc.AuthenticationService$Authenticator.lookForExistingAuthentication(AuthenticationService.java:316) ~[?:?]
at org.elasticsearch.xpack.security.authc.AuthenticationService$Authenticator.authenticateAsync(AuthenticationService.java:243) ~[?:?]
at org.elasticsearch.xpack.security.authc.AuthenticationService$Authenticator.access$000(AuthenticationService.java:195) ~[?:?]
at org.elasticsearch.xpack.security.authc.AuthenticationService.authenticate(AuthenticationService.java:138) ~[?:?]
at org.elasticsearch.xpack.security.transport.ServerTransportFilter$NodeProfile.inbound(ServerTransportFilter.java:133) ~[?:?]
at org.elasticsearch.xpack.security.transport.SecurityServerTransportInterceptor$ProfileSecuredRequestHandler.messageReceived(SecurityServerTransportInterceptor.java:306) ~[?:?]
at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:66) ~[elasticsearch-6.7.1.jar:6.7.1]
at org.elasticsearch.transport.TcpTransport$RequestHandler.doRun(TcpTransport.java:1087) ~[elasticsearch-6.7.1.jar:6.7.1]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-6.7.1.jar:6.7.1]
at org.elasticsearch.common.util.concurrent.EsExecutors$DirectExecutorService.execute(EsExecutors.java:192) ~[elasticsearch-6.7.1.jar:6.7.1]
at org.elasticsearch.transport.TcpTransport.handleRequest(TcpTransport.java:1046) ~[elasticsearch-6.7.1.jar:6.7.1]
at org.elasticsearch.transport.TcpTransport.messageReceived(TcpTransport.java:932) ~[elasticsearch-6.7.1.jar:6.7.1]
at org.elasticsearch.transport.TcpTransport.inboundMessage(TcpTransport.java:763) ~[elasticsearch-6.7.1.jar:6.7.1]
at org.elasticsearch.transport.netty4.Netty4MessageChannelHandler.channelRead(Netty4MessageChannelHandler.java:53) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) ~[?:?]
at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:323) ~[?:?]
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:297) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) ~[?:?]
at io.netty.handler.logging.LoggingHandler.channelRead(LoggingHandler.java:241) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) ~[?:?]
at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1436) ~[?:?]
at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1203) ~[?:?]
at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1247) ~[?:?]
at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:502) ~[?:?]
at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:441) ~[?:?]
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:278) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) ~[?:?]
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1434) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) ~[?:?]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) ~[?:?]
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:965) ~[?:?]
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:163) ~[?:?]
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:656) ~[?:?]
at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:556) ~[?:?]
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:510) ~[?:?]
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:470) ~[?:?]
at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:909) ~[?:?]
at java.lang.Thread.run(Thread.java:835) ~[?:?]
Is it possible some of your nodes are configured with xpack security disabled? Or did you disable it then re-enable it at some point?
Can you try manually deleting the failing master Pods (they should be recreated automatically - with the right config specified in the ES spec)?
I see you are mounting your own jvm.options. Can you share its content?
I shall reinstall the elasticsearch from scratch with default config of xpack security, as it is development environment and verify it again.
When I installed again, I saw logs like
2020-09-04T14:41:27.158Z INFO transport No tls certificate found in secret {"service.version": "1.2.1-b5316231", "namespace": "es-test", "pod_name": "elasticsearch-test-es-data-zone-a-9"}
I re-installed the elasticsearch with all default settings of xpack and logs from master when I upgrade some settings.
caught exception while handling client http traffic, closing connection [id: 0xd768d6bb, L:0.0.0.0/0.0.0.0:9200 ! R:/10.4.109.1:37772] io.netty.handler.codec.DecoderException: io.netty.handler.ssl.NotSslRecordException: not an SSL/TLS record: 474554202f20485454502f312e310d0a486f73743a2031302e342e3130392e31393a393230300d0a557365722d4167656e743a2044617461646f67204167656e742f372e32322e300d0a4163636570742d456e636f64696e673a20677a69702c206465666c6174650d0a4163636570743a202a2f2a0d0a436f6e6e656374696f6e3a206b6565702d616c6976650d0a0d0a|
at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:472) ~[netty-codec-4.1.32.Final.jar:4.1.32.Final]|
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:278) ~[netty-codec-4.1.32.Final.jar:4.1.32.Final]|
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) [netty-transport-4.1.32.Final.jar:4.1.32.Final]|
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) [netty-transport-4.1.32.Final.jar:4.1.32.Final]|
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) [netty-transport-4.1.32.Final.jar:4.1.32.Final]|
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1434) [netty-transport-4.1.32.Final.jar:4.1.32.Final]|
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) [netty-transport-4.1.32.Final.jar:4.1.32.Final]|
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) [netty-transport-4.1.32.Final.jar:4.1.32.Final]|
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:965) [netty-transport-4.1.32.Final.jar:4.1.32.Final]|
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:163) [netty-transport-4.1.32.Final.jar:4.1.32.Final]|
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:656) [netty-transport-4.1.32.Final.jar:4.1.32.Final]|
at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:556) [netty-transport-4.1.32.Final.jar:4.1.32.Final]|
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:510) [netty-transport-4.1.32.Final.jar:4.1.32.Final]|
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:470) [netty-transport-4.1.32.Final.jar:4.1.32.Final]|
at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:909) [netty-common-4.1.32.Final.jar:4.1.32.Final]|
at java.lang.Thread.run(Thread.java:835) [?:?]|
Caused by: io.netty.handler.ssl.NotSslRecordException: not an SSL/TLS record: 474554202f20485454502f312e310d0a486f73743a2031302e342e3130392e31393a393230300d0a557365722d4167656e743a2044617461646f67204167656e742f372e32322e300d0a4163636570742d456e636f64696e673a20677a69702c206465666c6174650d0a4163636570743a202a2f2a0d0a436f6e6e656374696f6e3a206b6565702d616c6976650d0a0d0a|
at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1182) ~[netty-handler-4.1.32.Final.jar:4.1.32.Final]|
at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1247) ~[netty-handler-4.1.32.Final.jar:4.1.32.Final]|
at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:502) ~[netty-codec-4.1.32.Final.jar:4.1.32.Final]|
at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:441) ~[netty-codec-4.1.32.Final.jar:4.1.32.Final]|
... 15 more|
No tls certificate found in secret in the operator logs is not a problem, it's just a regular log when certificates are created for the first time.
The last log looks like something is trying to reach Elasticsearch using HTTP instead of HTTPS.
Sorry but I'm a bit confused with your setup. Especially if you modified xpack security settings.
How about you just create an Elasticsearch cluster by following the quickstart and we work from there: https://www.elastic.co/guide/en/cloud-on-k8s/current/k8s-quickstart.html? Are you able to spin it up? Is a rolling upgrade successful if you just change one config setting (eg. node.attr.foo: bar)?
Apache, Apache Lucene, Apache Hadoop, Hadoop, HDFS and the yellow elephant
logo are trademarks of the
Apache Software Foundation
in the United States and/or other countries.