Shield.transport.ssl failure

When attempting to enable ssl on the transport, we are getting the following error in the log file (pointing to an untrusted certificate authority) and the nodes will
not communicate with one another.

[2016-03-22 10:48:17,145][WARN ][shield.transport.netty ] [node01] exception caught on transport layer [[id: 0x098be7f4, /192.168.2.103:52609 => /192.168.2.100:9300]],
closing connection javax.net.ssl.SSLException: Received fatal alert: certificate_unknown at sun.security.ssl.Alerts.getSSLException(Alerts.java:208)

I have added the 3 certificates that make up our trust chain into the keystore for shield, and I also created a truststore.jks and put the
3 certificates in that as well. Any ideas what we are missing?

Location of keystore:
/opt/elasticsearch/share/elasticsearch/config/shield/node01.jks

Location of truststore:
/opt/elasticsearch/share/elasticsearch/config/shield/truststore.jks

Shield contents of the elasticsearch.yml file:

shield.audit.enabled: true
shield.ssl.keystore.path: /opt/elasticsearch/share/elasticsearch/config/shield/node01.jks
shield.ssl.keystore.password:
shield.http.ssl: true

shield.transport.ssl: true
shield.ssl.truststore.path: /opt/elasticsearch/share/elasticsearch/config/shield/truststore.jks
shield.ssl.truststore.password:
shield.ssl.hostname_verification.resolve_name: false

We have tried this with and without the resolve_name setting and it still doesn't work. We also validated that our certificate has the IP address as a SAN.

Any ideas?

Thanks,

Bob.

Hi Bob,

This definitely seems to be an issue validating the certificate. If you don't mind could you provide the output of keytool -list -keystore file.jks for both the keystore and truststore?

Jay

Here are the contents of the keystores (obviously some data has been modified to not harm any animals or careers...)


bash-4.1$ keytool -list -keystore node01.jks

Keystore type: JKS
Keystore provider: SUN

Your keystore contains 4 entries

policy, Mar 17, 2016, trustedCertEntry,
Certificate fingerprint (SHA1): 2C:E7:THESE ARE NOT THE CERTS YOU ARE LOOKING FOR:46:5B:DB
issuing, Mar 17, 2016, trustedCertEntry,
Certificate fingerprint (SHA1): 96:A0:84:THESE ARE NOT THE CERTS YOU ARE LOOKING FOR::19:68
root, Mar 17, 2016, trustedCertEntry,
Certificate fingerprint (SHA1): F5:59:64:THESE ARE NOT THE CERTS YOU ARE LOOKING FOR:76:8A:5D
node01.mydomain.com, Mar 17, 2016, PrivateKeyEntry,
Certificate fingerprint (SHA1): 72:2B:A4:THESE ARE NOT THE CERTS YOU ARE LOOKING FOR::A3:E1

bash-4.1$ keytool -list -keystore truststore.jks

Keystore type: JKS
Keystore provider: SUN

Your keystore contains 3 entries

policy, Mar 17, 2016, trustedCertEntry,
Certificate fingerprint (SHA1): 2C:E7:THESE ARE NOT THE CERTS YOU ARE LOOKING FOR:46:5B:DB
issuing, Mar 17, 2016, trustedCertEntry,
Certificate fingerprint (SHA1): 96:A0:84:THESE ARE NOT THE CERTS YOU ARE LOOKING FOR::19:68
root, Mar 17, 2016, trustedCertEntry,
Certificate fingerprint (SHA1): F5:59:64:THESE ARE NOT THE CERTS YOU ARE LOOKING FOR:76:8A:5D

Here is a more interesting error in the elasticsearch log file... looks like the client may be presenting its simple hostname instead of a fully qualified hostname....

The subject alternate names in our certs do NOT contain just the hostname, only the fully qualified DNS name and IP address:
DNS Name=node01.com
IP Address=192.168.2.100

How can I get the nodes to present their FULLY qualified DNS names when initiating SSL connections on 9300?

ERROR:

[2016-03-23 09:20:07,231][WARN ][shield.transport.netty ] [node03] exception caught on transport layer [[id: 0xa303628c, /192.168.2.103:39301 :> node01/192.168.2.100:9300]], closing connection
javax.net.ssl.SSLHandshakeException: General SSLEngine problem
at sun.security.ssl.Handshaker.checkThrown(Handshaker.java:1431)
at sun.security.ssl.SSLEngineImpl.checkTaskThrown(SSLEngineImpl.java:535)
at sun.security.ssl.SSLEngineImpl.readNetRecord(SSLEngineImpl.java:813)
at sun.security.ssl.SSLEngineImpl.unwrap(SSLEngineImpl.java:781)
at javax.net.ssl.SSLEngine.unwrap(SSLEngine.java:624)
at org.jboss.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1218)
at org.jboss.netty.handler.ssl.SslHandler.decode(SslHandler.java:852)
at org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:425)
at org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303)
at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337)
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: javax.net.ssl.SSLHandshakeException: General SSLEngine problem
at sun.security.ssl.Alerts.getSSLException(Alerts.java:192)
at sun.security.ssl.SSLEngineImpl.fatal(SSLEngineImpl.java:1728)
at sun.security.ssl.Handshaker.fatalSE(Handshaker.java:304)
at sun.security.ssl.Handshaker.fatalSE(Handshaker.java:296)
at sun.security.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1509)
at sun.security.ssl.ClientHandshaker.processMessage(ClientHandshaker.java:216)
at sun.security.ssl.Handshaker.processLoop(Handshaker.java:979)
at sun.security.ssl.Handshaker$1.run(Handshaker.java:919)
at sun.security.ssl.Handshaker$1.run(Handshaker.java:916)
at java.security.AccessController.doPrivileged(Native Method)
at sun.security.ssl.Handshaker$DelegatedTask.run(Handshaker.java:1369)
at org.jboss.netty.handler.ssl.SslHandler.runDelegatedTasks(SslHandler.java:1392)
at org.jboss.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1255)
... 18 more
Caused by: java.security.cert.CertificateException: No subject alternative DNS name matching node01 found.
at sun.security.util.HostnameChecker.matchDNS(HostnameChecker.java:204)
at sun.security.util.HostnameChecker.match(HostnameChecker.java:95)
at sun.security.ssl.X509TrustManagerImpl.checkIdentity(X509TrustManagerImpl.java:455)
at sun.security.ssl.X509TrustManagerImpl.checkIdentity(X509TrustManagerImpl.java:436)
at sun.security.ssl.X509TrustManagerImpl.checkTrusted(X509TrustManagerImpl.java:252)
at sun.security.ssl.X509TrustManagerImpl.checkServerTrusted(X509TrustManagerImpl.java:136)
at sun.security.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1496)
... 26 more

Thanks for the information. The stores look good.

This indicates that node01 is what is being used to connect to the node and it is not available as a SAN entry or the CN. Can you share the network configuration of your nodes? Do you have a SAN for IP addresses and DNS entries in the cert?
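If it helps, the SAN entries can also be read straight from a keystore with plain JDK APIs. This is only a sketch: the class name SanDump and the command-line arguments are my own invention, not anything shipped with Shield.

```java
import java.io.FileInputStream;
import java.io.InputStream;
import java.security.KeyStore;
import java.security.cert.Certificate;
import java.security.cert.X509Certificate;
import java.util.Collection;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class SanDump {
    // Maps each alias to the raw SAN entries of its certificate.
    // Each entry is a two-element list [type, value]; type 2 is a DNS name, type 7 an IP address.
    static Map<String, Collection<List<?>>> sansByAlias(KeyStore ks) throws Exception {
        Map<String, Collection<List<?>>> out = new LinkedHashMap<>();
        for (String alias : Collections.list(ks.aliases())) {
            Certificate cert = ks.getCertificate(alias);
            if (cert instanceof X509Certificate) {
                Collection<List<?>> sans = ((X509Certificate) cert).getSubjectAlternativeNames();
                // getSubjectAlternativeNames() returns null when the extension is absent
                out.put(alias, sans == null ? Collections.<List<?>>emptyList() : sans);
            }
        }
        return out;
    }

    public static void main(String[] args) throws Exception {
        // args[0] = keystore path, args[1] = keystore password (both supplied by you)
        KeyStore ks = KeyStore.getInstance("JKS");
        try (InputStream in = new FileInputStream(args[0])) {
            ks.load(in, args[1].toCharArray());
        }
        sansByAlias(ks).forEach((alias, sans) -> System.out.println(alias + " -> " + sans));
    }
}
```

Compile it with javac SanDump.java and run java SanDump node01.jks <password>; the PrivateKeyEntry alias is the one whose SAN list matters for the handshake.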

The subject in our certs:
CN=node01.mydomain.com

Subject Alternative Name:
DNS Name=node01.mydomain.com
IP Address=192.168.2.100

Is there a way to get the nodes to present their FULLY qualified DNS names when initiating SSL connections on 9300?

Your network configuration for elasticsearch will come into play for that. That's why I asked for the network configuration from the elasticsearch.yml. I believe if you set network.host or network.publish_host elasticsearch will present the name defined there (you can specify the fully qualified name).

Sorry -

network.host: node01.mydomain.com

Just tried the network.publish_host: node01.mydomain.com and no difference, same error.

I may have to request new certificates, but I don't want to do that without some confidence it will fix the problem.

Let's see if we can avoid that.

Do you use IP addresses in any aspect of configuration? Maybe for discovery (unicast ping hosts)? Is the short name (node01) registered in DNS or a local hosts file?

discovery.zen.ping.unicast.hosts: ["node01", "node02", "node03"]

DNS has the fully qualified host name and I don't have access to modify the hosts file on the servers. I have spoken with our Certificate team and they are OK with me requesting new certs if that solves the problem.

Can you change those to fully qualified names? That should solve the issue.
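To illustrate why the short name trips hostname verification: the JVM compares the name used to open the connection against the DNS entries in the certificate's SAN list. Below is a deliberately simplified sketch of that comparison (exact, case-insensitive match only; the real JDK checker also handles wildcards). The class and method names are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;

class SanMatch {
    // Returns true if the name used to connect equals one of the SAN DNS names.
    // Simplified: no wildcard handling, no IP address entries.
    static boolean matchesAny(String connectName, List<String> sanDnsNames) {
        for (String san : sanDnsNames) {
            if (san.equalsIgnoreCase(connectName)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> sans = Arrays.asList("node01.mydomain.com");
        System.out.println(matchesAny("node01", sans));              // false
        System.out.println(matchesAny("node01.mydomain.com", sans)); // true
    }
}
```

With "node01" in the unicast host list, the connecting node verifies against "node01", which is not in the SAN list, so the handshake is rejected even though the certificate chain itself is trusted.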

The only nodes we have in the ping list are our master or master eligible nodes. Will this solve communication between the non-master nodes as well?

I think using the FQDNs wherever you specify network configuration or addresses that the cluster uses to communicate should solve the issue (if it is the same problem).

OK - we have it solved...

  1. We added a Subject Alternative Name of just the hostname NOT fully qualified
  2. We added Client Authentication and Server Authentication to the Enhanced Key Usage in the cert
  3. We limited the Cipher algorithms since our JVM doesn't have the unlimited ciphers (yet)
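For anyone checking item 2 on their own certs, here is a rough sketch of the test, assuming the reason both usages are needed is that each node acts as both client and server on the transport layer. The class name EkuCheck is hypothetical; the OIDs are the standard id-kp-serverAuth and id-kp-clientAuth values, and the input list is what X509Certificate.getExtendedKeyUsage() returns.

```java
import java.util.Arrays;
import java.util.List;

class EkuCheck {
    static final String SERVER_AUTH = "1.3.6.1.5.5.7.3.1"; // id-kp-serverAuth
    static final String CLIENT_AUTH = "1.3.6.1.5.5.7.3.2"; // id-kp-clientAuth

    // ekuOids is the result of X509Certificate.getExtendedKeyUsage();
    // null means the certificate carries no Extended Key Usage extension.
    static boolean allowsBothRoles(List<String> ekuOids) {
        if (ekuOids == null) {
            return true; // no EKU extension, so it imposes no restriction
        }
        return ekuOids.contains(SERVER_AUTH) && ekuOids.contains(CLIENT_AUTH);
    }

    public static void main(String[] args) {
        System.out.println(allowsBothRoles(Arrays.asList(SERVER_AUTH)));              // false
        System.out.println(allowsBothRoles(Arrays.asList(SERVER_AUTH, CLIENT_AUTH))); // true
    }
}
```

This sketch ignores the anyExtendedKeyUsage OID and any other policy your CA may apply.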

Thanks for your help.

Bob.

I'm glad to hear you have it solved. One followup for you:

Were you using the default ciphers and saw a log message about this? Are you using OpenJDK from a Linux distribution?

If so, it most likely is that your JVM does not enable one of the authentication providers by default. Enabling unlimited ciphers will not resolve this.

Yes, we were using the default ciphers. Yes, we use OpenJDK. Here is the error we were seeing:
[2016-03-25 10:49:36,210][ERROR][shield.ssl ] [node01] unsupported ciphers [[TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA]] were requested but cannot be used in this JVM. If you are trying to use ciphers
with a key length greater than 128 bits on an Oracle JVM, you will need to install the unlimited strength
JCE policy files. Additionally, please ensure the PKCS11 provider is enabled for your JVM.
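A quick way to see which of these suites the running JVM actually offers is to ask the default SSLContext. This is a minimal sketch, assuming it is run with the same JVM and security configuration as the node; the class name CipherCheck is made up.

```java
import java.util.Arrays;
import javax.net.ssl.SSLContext;

class CipherCheck {
    // True if the JVM's default SSLContext lists the suite as supported.
    static boolean isSupported(String suite) throws Exception {
        String[] supported = SSLContext.getDefault()
                .getSupportedSSLParameters()
                .getCipherSuites();
        return Arrays.asList(supported).contains(suite);
    }

    public static void main(String[] args) throws Exception {
        String[] suites = {
            "TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA",
            "TLS_RSA_WITH_AES_128_CBC_SHA256",
            "TLS_RSA_WITH_AES_128_CBC_SHA"
        };
        for (String suite : suites) {
            System.out.println(suite + " supported: " + isSupported(suite));
        }
    }
}
```

The result depends entirely on the JVM build and its enabled providers, so run it on the affected host rather than on a workstation.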

shield.ssl.ciphers: [ "TLS_RSA_WITH_AES_128_CBC_SHA256", "TLS_RSA_WITH_AES_128_CBC_SHA" ]

This fixes the unsupported Cipher above - but we do not want this long term. We have to update our JDK to support this cipher. Do you have a good article on OpenJDK to fix this?

Thanks,

Bob.

If you are lucky it may be as simple as uncommenting the appropriate line in ${java.home}/lib/security/java.security for the SunPKCS11 provider, which may look something like:

#security.provider.10=sun.security.pkcs11.SunPKCS11 ${java.home}/lib/security/nss.cfg
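To confirm whether the change took effect, you can list the security providers the JVM has registered. This is a sketch only: the class name ProviderList is hypothetical, and the "SunPKCS11" prefix check assumes the provider keeps its usual naming (for example with a suffix taken from the nss.cfg name).

```java
import java.security.Provider;
import java.security.Security;
import java.util.ArrayList;
import java.util.List;

class ProviderList {
    // Names of the registered security providers, in preference order.
    static List<String> names() {
        List<String> out = new ArrayList<>();
        for (Provider p : Security.getProviders()) {
            out.add(p.getName());
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> names = names();
        boolean pkcs11 = false;
        for (int i = 0; i < names.size(); i++) {
            // Positions are 1-based, like security.provider.N in java.security
            System.out.println((i + 1) + ": " + names.get(i));
            if (names.get(i).startsWith("SunPKCS11")) {
                pkcs11 = true;
            }
        }
        System.out.println("SunPKCS11 provider registered: " + pkcs11);
    }
}
```

Run it with the same JVM that starts Elasticsearch, before and after editing java.security, and compare the two listings.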