Elasticsearch TLS with Wildcard SAN

Does X-Pack have issues validating wildcard subject alternative DNS names?

OS: Windows Server 2012 R2 Datacenter
Elasticsearch Version: 6.3.0 (zip)

I have a PFX certificate, myCert.pfx, enabled for both client and server authentication.
The certificate has the following SAN:
DNS Name=*.mycluster.mycompany.com

I also have the public certificate of the certificate authority that issued this certificate, ca.cer.

In elasticsearch.yml I have the following relevant xpack settings on two hosts (es1.mycluster.mycompany.com and es2.mycluster.mycompany.com):
discovery.zen.ping.unicast.hosts: [ "es1.mycluster.mycompany.com", "es2.mycluster.mycompany.com" ]
xpack.security.transport.ssl.enabled: true
xpack.security.http.ssl.enabled: true
xpack.ssl.verification_mode: full
xpack.ssl.keystore.path: certs/myCert.pfx
xpack.ssl.keystore.type: PKCS12
xpack.security.transport.ssl.certificate_authorities: [ "certs/ca.cer" ]

I'm in a Windows environment, so I have added these two IP addresses to the hosts file so that the DNS names resolve to the right addresses:
192.168.0.1 es1.mycluster.mycompany.com
192.168.0.2 es2.mycluster.mycompany.com

When I start these nodes, I get handshake errors like these:

[2018-07-13T07:44:37,237][WARN ][o.e.x.s.t.n.SecurityNetty4ServerTransport] [node-2] client did not trust this server's certificate, closing connection NettyTcpChannel{localAddress=0.0.0.0/0.0.0.0:9300, remoteAddress=/192.168.0.2:51719}
[2018-07-13T07:44:37,284][WARN ][o.e.x.s.t.n.SecurityNetty4ServerTransport] [node-2] exception caught on transport layer [NettyTcpChannel{localAddress=0.0.0.0/0.0.0.0:9300, remoteAddress=/192.168.0.2:51722}], closing connection
io.netty.handler.codec.DecoderException: javax.net.ssl.SSLException: Received close_notify during handshake
	at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:459) ~[netty-codec-4.1.16.Final.jar:4.1.16.Final]
	at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:265) ~[netty-codec-4.1.16.Final.jar:4.1.16.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
	at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1359) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:935) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
	at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:134) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:645) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
	at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:545) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:499) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:459) [netty-transport-4.1.16.Final.jar:4.1.16.Final]
	at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858) [netty-common-4.1.16.Final.jar:4.1.16.Final]
	at java.lang.Thread.run(Thread.java:844) [?:?]
Caused by: javax.net.ssl.SSLException: Received close_notify during handshake
	at sun.security.ssl.Alerts.getSSLException(Alerts.java:214) ~[?:?]
	at sun.security.ssl.SSLEngineImpl.fatal(SSLEngineImpl.java:1762) ~[?:?]
	at sun.security.ssl.SSLEngineImpl.fatal(SSLEngineImpl.java:1725) ~[?:?]
	at sun.security.ssl.SSLEngineImpl.recvAlert(SSLEngineImpl.java:1878) ~[?:?]
	at sun.security.ssl.SSLEngineImpl.processInputRecord(SSLEngineImpl.java:1140) ~[?:?]
	at sun.security.ssl.SSLEngineImpl.readRecord(SSLEngineImpl.java:1020) ~[?:?]
	at sun.security.ssl.SSLEngineImpl.readNetRecord(SSLEngineImpl.java:902) ~[?:?]
	at sun.security.ssl.SSLEngineImpl.unwrap(SSLEngineImpl.java:680) ~[?:?]
	at javax.net.ssl.SSLEngine.unwrap(SSLEngine.java:626) ~[?:?]
	at io.netty.handler.ssl.SslHandler$SslEngineType$3.unwrap(SslHandler.java:281) ~[?:?]
	at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1215) ~[?:?]
	at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1127) ~[?:?]
	at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1162) ~[?:?]
	at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:489) ~[?:?]
	at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:428) ~[?:?]
	... 15 more

If I set xpack.ssl.verification_mode to certificate, I do not get these errors and the cluster runs fine, so I know it is not a problem with the certificate chain.

Update: Looking at the other node's log file, it seems to be looking for the IP address in the SAN. Why would that be happening? According to the Elasticsearch TLS/SSL Settings, full verification means:

verifies that the provided certificate is signed by a trusted authority (CA)
and also verifies that the server’s hostname (or IP address) matches the names identified within the certificate.

It specifically says "OR", not "AND".

Can Elasticsearch or the JVM somehow distinguish between a hosts file entry and a DNS lookup?

[2018-07-13T07:44:37,206][WARN ][o.e.x.s.t.n.SecurityNetty4ServerTransport] [node-1] exception caught on transport layer [NettyTcpChannel{localAddress=0.0.0.0/0.0.0.0:51715, remoteAddress=192.168.0.1/192.168.0.1:9300}], closing connection
io.netty.handler.codec.DecoderException: javax.net.ssl.SSLHandshakeException: General SSLEngine problem
	at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:459)...
Caused by: javax.net.ssl.SSLHandshakeException: General SSLEngine problem
	at sun.security.ssl.Handshaker.checkThrown(Handshaker.java:1602) ~[?:?]
	at sun.security.ssl.SSLEngineImpl.checkTaskThrown(SSLEngineImpl.java:497) ~[?:?]
	at sun.security.ssl.SSLEngineImpl.readNetRecord(SSLEngineImpl.java:745) ~[?:?]
	at sun.security.ssl.SSLEngineImpl.unwrap(SSLEngineImpl.java:680) ~[?:?]
	at javax.net.ssl.SSLEngine.unwrap(SSLEngine.java:626) ~[?:?]
	at io.netty.handler.ssl.SslHandler$SslEngineType$3.unwrap(SslHandler.java:281) ~[?:?]
	at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1215) ~[?:?]
	at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1127) ~[?:?]
	at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1162) ~[?:?]
	at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:489) ~[?:?]
	at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:428) ~[?:?]
	... 15 more
Caused by: javax.net.ssl.SSLHandshakeException: General SSLEngine problem
	at sun.security.ssl.Alerts.getSSLException(Alerts.java:198) ~[?:?]
	at sun.security.ssl.SSLEngineImpl.fatal(SSLEngineImpl.java:1830) ~[?:?]
	at sun.security.ssl.SSLEngineImpl.fatal(SSLEngineImpl.java:1735) ~[?:?]
	at sun.security.ssl.Handshaker.fatalSE(Handshaker.java:347) ~[?:?]
	at sun.security.ssl.Handshaker.fatalSE(Handshaker.java:339) ~[?:?]
	at sun.security.ssl.ClientHandshaker.checkServerCerts(ClientHandshaker.java:1968) ~[?:?]
	at sun.security.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1777) ~[?:?]
	at sun.security.ssl.ClientHandshaker.processMessage(ClientHandshaker.java:264) ~[?:?]
	at sun.security.ssl.Handshaker.processLoop(Handshaker.java:1098) ~[?:?]
	at sun.security.ssl.Handshaker$1.run(Handshaker.java:1031) ~[?:?]
	at sun.security.ssl.Handshaker$1.run(Handshaker.java:1028) ~[?:?]
	at java.security.AccessController.doPrivileged(Native Method) ~[?:?]
	at sun.security.ssl.Handshaker$DelegatedTask.run(Handshaker.java:1540) ~[?:?]
	at io.netty.handler.ssl.SslHandler.runDelegatedTasks(SslHandler.java:1364) ~[?:?]
	at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1272) ~[?:?]
	at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1127) ~[?:?]
	at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1162) ~[?:?]
	at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:489) ~[?:?]
	at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:428) ~[?:?]
	... 15 more
Caused by: java.security.cert.CertificateException: No subject alternative names matching IP address 192.168.101.97 found
	at sun.security.util.HostnameChecker.matchIP(HostnameChecker.java:182) ~[?:?]
	at sun.security.util.HostnameChecker.match(HostnameChecker.java:98) ~[?:?]
	at sun.security.ssl.X509TrustManagerImpl.checkIdentity(X509TrustManagerImpl.java:481) ~[?:?]
	at sun.security.ssl.X509TrustManagerImpl.checkIdentity(X509TrustManagerImpl.java:456) ~[?:?]
	at sun.security.ssl.X509TrustManagerImpl.checkTrusted(X509TrustManagerImpl.java:296) ~[?:?]
	at sun.security.ssl.X509TrustManagerImpl.checkServerTrusted(X509TrustManagerImpl.java:145) ~[?:?]
	at org.elasticsearch.xpack.core.ssl.SSLService$ReloadableTrustManager.checkServerTrusted(SSLService.java:594) ~[?:?]
	at sun.security.ssl.ClientHandshaker.checkServerCerts(ClientHandshaker.java:1952) ~[?:?]
	at sun.security.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1777) ~[?:?]
	at sun.security.ssl.ClientHandshaker.processMessage(ClientHandshaker.java:264) ~[?:?]
	at sun.security.ssl.Handshaker.processLoop(Handshaker.java:1098) ~[?:?]
	at sun.security.ssl.Handshaker$1.run(Handshaker.java:1031) ~[?:?]
	at sun.security.ssl.Handshaker$1.run(Handshaker.java:1028) ~[?:?]
	at java.security.AccessController.doPrivileged(Native Method) ~[?:?]
	at sun.security.ssl.Handshaker$DelegatedTask.run(Handshaker.java:1540) ~[?:?]
	at io.netty.handler.ssl.SslHandler.runDelegatedTasks(SslHandler.java:1364) ~[?:?]
	at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1272) ~[?:?]
	at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1127) ~[?:?]
	at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1162) ~[?:?]
	at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:489) ~[?:?]
	at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:428) ~[?:?]
	... 15 more

Hey @Micah_Hunsberger
Did you find a solution for this? I am having the same issue: my data nodes are working and have formed a cluster, but the coordinating node is not able to join, and the error is the same as above.

Thanks in advance

Did you find any solution for this

No, sorry. I've had to temporarily set verification mode to certificate. If I find a solution or an answer I'll be sure to post it here.

I fixed it.

For the Transport layer, the host is always resolved to an IP address.

Full verification mode also performs hostname verification, so for it to work on the Transport layer the certificate must have an iPAddress Subject Alternative Name that matches the resolved IP address.

Thanks for the knowledge, @forloop. Do you think this could be noted more clearly in the documentation for xpack.security.transport.ssl.verification_mode? Currently it says to

See xpack.ssl.verification_mode for a description of these values

That description specifically says "or IP address", which is the wording that confused me for a long time.

I found only one place that indicates you need both the DNS name and the IP address in the certificate, and it is lumped in with the instructions for generating node certificates:

Additionally, it is recommended that the certificates contain subject alternative names (SAN) that correspond to the node’s IP address and DNS name so that hostname verification can be performed.

If you have already generated certificates outside of certutil, this could be easy to miss.
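
For anyone who does generate node certificates with elasticsearch-certutil, here is a minimal sketch of an instances file that puts both a DNS name and an IP address into each certificate's SAN. The instance names are placeholders I made up, and the hostnames and addresses are the ones from my original post. As far as I understand, the file is passed to certutil's cert mode with --in.

```yaml
# instances.yml (sketch; instance names are hypothetical placeholders)
# Each entry yields one certificate whose SAN lists both the DNS name
# and the IP address, so hostname verification can match either one.
instances:
  - name: "es1"
    dns:
      - "es1.mycluster.mycompany.com"
    ip:
      - "192.168.0.1"
  - name: "es2"
    dns:
      - "es2.mycluster.mycompany.com"
    ip:
      - "192.168.0.2"
```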

Thanks

Thank you for the feedback @Micah_Hunsberger, I've opened an issue to discuss.

@Micah_Hunsberger I may have missed this, but have you set network.host or network.publish_host to the DNS name on your nodes?

@jaymode no, they are set to the IP address of the host. But that shouldn't affect what the client thinks it is connecting to, should it?

It does change how the client connects. Internally, a node uses the value of the publish address to connect to another node, so if it is not set to a name in the certificate, the client connects using the IP address, which is not in the certificate. No reverse DNS lookup is performed on the client end of a connection to resolve the IP to a DNS name, so the publish address needs to be set to the DNS name if the certificates only have a wildcard DNS SAN.
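
As a rough illustration of the above, assuming the hostnames from the original post, the publish address on es1 could be set to the DNS name as shown here. network.publish_host is the setting for the advertised address; whether network.host should change as well depends on your setup, so treat this as a sketch rather than a recommended configuration.

```yaml
# elasticsearch.yml on es1 (sketch; hostname taken from the original post)
# Advertise the DNS name so that other nodes connect using a name the
# wildcard SAN *.mycluster.mycompany.com can match during full verification.
network.publish_host: es1.mycluster.mycompany.com
```

es2 would use es2.mycluster.mycompany.com in the same way.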

I guess I don't understand how the client would know what the server's publish address is. I assumed the client would open a TCP connection to the hostname given in discovery.zen.ping.unicast.hosts and then validate the certificate based on that hostname. Is there a step in between where the server can tell the client what its publish address is?

During pinging, the hostname given in the setting is used to connect. The ping response includes the node information of the node that was connected to; this is where the publish host comes into play. The node that receives that information then uses it to open more connections, and if the publish address is the IP, the IP is used.
