Node failing to join cluster

Good afternoon-I am looking to add a second node to my cluster and I am not seeing the node join and after reviewing the logs, I am seeing a unknown_ca cert error. (attached below) My ES yml file mimics the original node that was online and has been running just fine for weeks. Node 2 (which I am looking to join to node 1) is also online and running however that error that I found in the log is what is leading me to believe that this is why it is unable to join the cluster. The certs that were created was using the cert-util steps and these same certs were copied from node1 over to node 2 which allowed to get the ES service online. It is just a matter of attempting to get it joined to the cluster.
I am on 7.1 and have provided my configuration(s) and logs below-please note that I have masked the IP for security purposes. I also am not configuring any master\data\coordinating node until I am able to get a cluster setup properly and online.

Logs from the node 02 server-

"stacktrace": ["io.netty.handler.codec.DecoderException: javax.net.ssl.SSLHandshakeException: Received fatal alert: unknown_ca",
"at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:472) ~[netty-codec-4.1.32.Final.jar:4.1.32.Final]",
"at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:278) ~[netty-codec-4.1.32.Final.jar:4.1.32.Final]",
"at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) [netty-transport-4.1.32.Final.jar:4.1.32.Final]",
"at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) [netty-transport-4.1.32.Final.jar:4.1.32.Final]",
"at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) [netty-transport-4.1.32.Final.jar:4.1.32.Final]",
"at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1434) [netty-transport-4.1.32.Final.jar:4.1.32.Final]",
"at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) [netty-transport-4.1.32.Final.jar:4.1.32.Final]",
"at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) [netty-transport-4.1.32.Final.jar:4.1.32.Final]",
"at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:965) [netty-transport-4.1.32.Final.jar:4.1.32.Final]",
"at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:163) [netty-transport-4.1.32.Final.jar:4.1.32.Final]",
"at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:656) [netty-transport-4.1.32.Final.jar:4.1.32.Final]",
"at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:556) [netty-transport-4.1.32.Final.jar:4.1.32.Final]",
"at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:510) [netty-transport-4.1.32.Final.jar:4.1.32.Final]",
"at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:470) [netty-transport-4.1.32.Final.jar:4.1.32.Final]",
"at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:909) [netty-common-4.1.32.Final.jar:4.1.32.Final]",
"at java.lang.Thread.run(Thread.java:835) [?:?]",
"Caused by: javax.net.ssl.SSLHandshakeException: Received fatal alert: unknown_ca",
"at sun.security.ssl.Alert.createSSLException(Alert.java:131) ~[?:?]",
"at sun.security.ssl.Alert.createSSLException(Alert.java:117) ~[?:?]",
"at sun.security.ssl.TransportContext.fatal(TransportContext.java:307) ~[?:?]",
"at sun.security.ssl.Alert$AlertConsumer.consume(Alert.java:285) ~[?:?]",
"at sun.security.ssl.TransportContext.dispatch(TransportContext.java:180) ~[?:?]",
"at sun.security.ssl.SSLTransport.decode(SSLTransport.java:164) ~[?:?]",
"at sun.security.ssl.SSLEngineImpl.decode(SSLEngineImpl.java:681) ~[?:?]",
"at sun.security.ssl.SSLEngineImpl.readRecord(SSLEngineImpl.java:636) ~[?:?]",
"at sun.security.ssl.SSLEngineImpl.unwrap(SSLEngineImpl.java:454) ~[?:?]",
"at sun.security.ssl.SSLEngineImpl.unwrap(SSLEngineImpl.java:433) ~[?:?]",
"at javax.net.ssl.SSLEngine.unwrap(SSLEngine.java:634) ~[?:?]",
"at io.netty.handler.ssl.SslHandler$SslEngineType$3.unwrap(SslHandler.java:295) ~[netty-handler-4.1.32.Final.jar:4.1.32.Final]",
"at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1301) ~[netty-handler-4.1.32.Final.jar:4.1.32.Final]",
"at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1203) ~[netty-handler-4.1.32.Final.jar:4.1.32.Final]",
"at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1247) ~[netty-handler-4.1.32.Final.jar:4.1.32.Final]",
"at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:502) ~[netty-codec-4.1.32.Final.jar:4.1.32.Final]",
"at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:441) ~[netty-codec-4.1.32.Final.jar:4.1.32.Final]",

Node 02 configuration-(these are the same certs from node01 that allowed me to get the ES service running and the native auth is not turned on for this node as it is for node 01)

# --------------------------------- Discovery ----------------------------------
#
# Pass an initial list of hosts to perform discovery when this node is started:
# The default list of hosts is ["127.0.0.1", "[::1]"]
#
discovery.seed_hosts: ["node.1.IP", "node.2.IP"]
#
#discovery.zen.ping.unicasts.hosts: ["node.1.IP", "node.2.IP"]
# Bootstrap the cluster using an initial set of master-eligible nodes:
#
#cluster.initial_master_nodes: ["node-1", "node-2"]
#
# For more information, consult the discovery and cluster formation module documentation.
#
# ---------------------------------- Gateway -----------------------------------
#
# Block initial recovery after a full cluster restart until N nodes are started:
#
#gateway.recover_after_nodes: 3
#
# For more information, consult the gateway module documentation.
#
# ---------------------------------- Various -----------------------------------
#
# Require explicit names when deleting indices:
#
#action.destructive_requires_name: true

# ---------------------------------- Security ----------------------------------
xpack.security.transport.ssl.enabled: true
xpack.security.transport.ssl.verification_mode: certificate
xpack.security.transport.ssl.keystore.path: /etc/elasticsearch/config/certs/elastic-stack-ca.p12
xpack.security.transport.ssl.truststore.path: /etc/elasticsearch/config/certs/elastic-stack-ca.p12

#xpack.security.authc.token.enabled: true
xpack.security.enabled: true

#xpack.security.authc.realms.native1:
#  type: native
#   order: 0

Please see our documentation . You should not use the elastic-stack-ca.p12 PKCS#12 as the keystore to set in all nodes as it contains the CA certificate and key. Instead, you should use that CA certificate and key with elasticsearch-certutil to generate a certificate and private key for each node in your cluster.

Let us know if you have additional questions after reading the documentation and the instructions I shared above.

Thanks @ikakavas!! I was able to make that change and everything now works properly. I was able to get the second node added with no issues. Appreciate the feedback.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.