Client SSL Handshake Issues

I am running an Elasticsearch 6.1.1 cluster with 9 nodes on Ubuntu 16.04.3, running on JVM 9. One node is a dedicated master, two more are master-eligible data nodes, and the rest are data nodes.

I have set up SSL by following the Elastic documentation, and added the additional settings to my elasticsearch.yml file that turned out to be necessary for transport communication to work.

I generated my own CA and all of the per-server certificates using certutil.

The transport layer works successfully, and the cluster status reported in the log file (/var/log/elasticsearch/) is green. Everything is communicating successfully.

However, when I attempt to perform any client operations (such as curl https://ES-MASTER-01:9300/), I receive the following error:

curl: (35) gnutls_handshake() failed: Certificate is bad

If I try to hit port 9200 (curl https://ES-MASTER-01:9200), I receive another error:

curl: (51) SSL: certificate subject name (ES-MASTER-01) does not match target host name 'ES-MASTER-01'

This message is even more confusing given that the two names match exactly, at least in this output.

I know that Elasticsearch requires certificates that allow both serverAuth and clientAuth, which certutil presumably handles.
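For reference, one way to check this on a generated certificate (using the paths from my setup below):

openssl x509 -text -noout -in /etc/elasticsearch/x-pack/ES-MASTER-01.crt | grep -A1 'Extended Key Usage'
# If an Extended Key Usage section is present, it should list both TLS Web Server
# Authentication and TLS Web Client Authentication; if the section is absent
# entirely, the certificate is unrestricted.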

Verifying the handshake with an OpenSSL command also shows a failure:

openssl s_client -connect ES-MASTER-01:9300 | openssl x509 -text -noout

depth=1 CN = MyOrganizationES Global CA
verify return:1
depth=0 CN = ES-MASTER-01
verify return:1
140281558685336:error:14094412:SSL routines:ssl3_read_bytes:sslv3 alert bad certificate:s3_pkt.c:1487:SSL alert number 42
140281558685336:error:140790E5:SSL routines:ssl23_write:ssl handshake failure:s23_lib.c:177:
Certificate: <removed due to character limits>

I have added the CA to each server's certificate store as well as the JVM's (method outlined below).

The required keys are within the Elastic config directory as specified by the documentation (/etc/elasticsearch/x-pack).

My elasticsearch.yml config on all nodes looks similar to the following:

xpack.ssl.key: /etc/elasticsearch/x-pack/ES-MASTER-01.key
xpack.ssl.key_passphrase: <passphrase>
xpack.ssl.certificate: /etc/elasticsearch/x-pack/ES-MASTER-01.crt
xpack.ssl.certificate_authorities: [ "/etc/elasticsearch/x-pack/ca.crt" ]
xpack.security.transport.ssl.enabled: true
xpack.security.transport.ssl.verification_mode: certificate
xpack.security.http.ssl.enabled: true
xpack.security.audit.enabled: true
xpack.security.enabled: true

Below is the method I used to generate the CA and server keys, along with other relevant operations.

# Generate the certificates. This was performed on ES-MASTER-01.
cd /usr/share/elasticsearch
# Generate the CA.
bin/x-pack/certutil ca --ca-dn 'CN=MyOrganizationES Global CA' --pass <pass> --days 3650 --keysize 4096 --out MyOrganizationES_CA.zip --pem

# Copy the ca.key and ca.crt to /usr/share/elasticsearch/ca/

# Generate the instance key pairs. I used a valid YAML file with name, DNS, and IP.
bin/x-pack/certutil cert --ca-cert ca/ca.crt --ca-key ca/ca.key --ca-pass <pass> --days 3650 --in MyOrganizationES_Cluster.yml --keysize 4096 --out MyOrganizationES_Keys.zip --pass <pass> --pem

# Copy the certificate files from the .zip to the corresponding server, along with ca.crt, into:
/etc/elasticsearch/x-pack

# Update the machine's certificate store (all nodes).
# Copy the ca.crt file to this location.
/usr/local/share/ca-certificates
update-ca-certificates --fresh

# Update Java's cacerts store (all nodes).
keytool -import -trustcacerts -cacerts -storepass changeit -noprompt -alias MyOrganizationES -file /usr/local/share/ca-certificates/ca.crt
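A quick way to confirm the import took (the -cacerts shortcut is available on Java 9's keytool, which these nodes run):

keytool -list -cacerts -storepass changeit -alias MyOrganizationES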

At this point I believe something about OpenSSL is rejecting the self-signed certificates, but I'm not sure how to diagnose it further.

I am also having an issue where Kibana refuses to start with a basic SSL configuration, but I believe that might be a symptom of this larger problem.


If you really mean that you're trying to access the "transport" port over HTTP, that's never going to work. It's a custom binary protocol; it doesn't understand HTTP.
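All HTTP(S) traffic goes to 9200 by default. For example, assuming your CA file is available on the client machine, a request shaped like this should at least complete the TLS handshake against the HTTP port:

curl --cacert /path/to/ca.crt https://ES-MASTER-01:9200/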

Can you run the cert file through

openssl x509 -text -noout -in ES-MASTER-01.crt

and post the result?
It sounds like the cert has incorrect SAN fields, which shouldn't happen, but it depends on what's in your MyOrganizationES_Cluster.yml.

Hi Tim,

Thank you for the info regarding 9300. I misunderstood its function and thought it was analogous to 80/443 (9200 for HTTP, 9300 for HTTPS).

There was actually a mistake in my YAML file: I failed to explicitly include the short host name in the "dns" section and only listed the FQDN.

Incorrect:

instances:
  - name: "ES-MASTER-01"
    ip:
      - "10.20.20.20"
    dns:
      - "ES-MASTER-01.domain.com"
  - name: "ES-DATA-01"
    ip:
      - "10.20.20.21"
    dns:
      - "ES-DATA-01.domain.com"
  ...

Correct:

instances:
  - name: "ES-MASTER-01"
    ip:
      - "10.20.20.20"
    dns:
      - "ES-MASTER-01"
      - "ES-MASTER-01.domain.com"
  - name: "ES-DATA-01"
    ip:
      - "10.20.20.21"
    dns:
      - "ES-DATA-01"
      - "ES-DATA-01.domain.com"
  ...
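For anyone verifying their own certs, the SAN entries are visible with openssl (run against the regenerated cert):

openssl x509 -text -noout -in ES-MASTER-01.crt | grep -A1 'Subject Alternative Name'
# Should now include: DNS:ES-MASTER-01, DNS:ES-MASTER-01.domain.com, IP Address:10.20.20.20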

This resolved the handshake issues. Unfortunately, when I curl the cluster I receive an authentication error, even when I explicitly pass the elastic user's password. That might be by design once X-Pack security is enabled, in which case it's a non-issue.

$ curl https://ES-MASTER-01:9200/_cluster/health?pretty -u elastic:<pass>
 
{
  "error" : {
    "root_cause" : [
      {
        "type" : "security_exception",
        "reason" : "failed to authenticate user [elastic]",
        "header" : {
          "WWW-Authenticate" : "Basic realm=\"security\" charset=\"UTF-8\""
        }
      }
    ],
    "type" : "security_exception",
    "reason" : "failed to authenticate user [elastic]",
    "header" : {
      "WWW-Authenticate" : "Basic realm=\"security\" charset=\"UTF-8\""
    }
  },
  "status" : 401
}

I'm also still having issues with Kibana failing to start at all with SSL enabled, and unfortunately it does not log anything to /var/log/kibana/kibana.log. These are the settings I have in kibana.yml:

server.host: "10.20.20.20"
server.name: "MyOrganizationKibana"
elasticsearch.url: "https://10.20.20.20:9200"
elasticsearch.username: "kibana"
elasticsearch.password: "<pass>"
 
server.ssl.enabled: true
server.ssl.certificate: /etc/elasticsearch/x-pack/ES-MASTER-01.crt
server.ssl.key: /etc/elasticsearch/x-pack/ES-MASTER-01.key
server.ssl.key_passphrase: <pass>
 
elasticsearch.ssl.certificate: /etc/elasticsearch/x-pack/ES-MASTER-01.crt
elasticsearch.ssl.key: /etc/elasticsearch/x-pack/ES-MASTER-01.key
elasticsearch.ssl.key_passphrase: <pass>
elasticsearch.ssl.certificateAuthorities: [ "/etc/elasticsearch/x-pack/ca.crt" ]
 
logging.dest: /var/log/kibana/kibana.log
xpack.security.encryptionKey: "<key>"
xpack.security.sessionTimeout: 600000

If I disable SSL within Kibana, it does start normally, and the authentication page loads with an error, which I assume is because it's failing to authenticate with the cluster:

Login is currently disabled. Administrators should consult the Kibana logs for more details.

Kibana stays initialized and writes to the log file. After initializing its plugins, it goes from [green] to [yellow] to [red]. Based on what I understand of the logs, it's not able to get a proper handshake, which makes sense given that SSL isn't specified.

{"type":"log","@timestamp":"2018-01-12T15:52:50Z","tags":["error","elasticsearch","admin"],"pid":2677,"message":"Request error, retrying\nHEAD https://10.20.20.20:9200/ => unable to verify the first certificate"}

After it fully fails its initialization, it repeats this error message endlessly, with no other log entries aside from these:

{"type":"ops","@timestamp":"2018-01-12T15:53:04Z","tags":[],"pid":2677,"os":{"load":[0.18017578125,0.12890625,0.21923828125],"mem":{"total":67559706624,"free":47786881024},"uptime":57987},"proc":{"uptime":24.188,"mem":{"rss":346910720,"heapTotal":303140864,"heapUsed":263092720,"external":6084687},"delay":0.41111599653959274},"load":{"requests":{"5601":{"total":0,"disconnects":0,"statusCodes":{}}},"concurrents":{"5601":6},"responseTimes":{"5601":{"avg":null,"max":0}},"sockets":{"http":{"total":0},"https":{"total":0}}},"message":"memory: 250.9MB uptime: 0:00:24 load: [0.18 0.13 0.22] delay: 0.411"}
{"type":"log","@timestamp":"2018-01-12T15:53:07Z","tags":["warning","elasticsearch","admin"],"pid":2677,"message":"Unable to revive connection: https://10.20.20.20:9200/"}
{"type":"log","@timestamp":"2018-01-12T15:53:07Z","tags":["warning","elasticsearch","admin"],"pid":2677,"message":"No living connections"}
{"type":"log","@timestamp":"2018-01-12T15:53:09Z","tags":["warning","elasticsearch","admin"],"pid":2677,"message":"Unable to revive connection: https://10.20.20.20:9200/"}
{"type":"log","@timestamp":"2018-01-12T15:53:09Z","tags":["warning","elasticsearch","admin"],"pid":2677,"message":"No living connections"}
{"type":"ops","@timestamp":"2018-01-12T15:53:09Z","tags":[],"pid":2677,"os":{"load":[0.16552734375,0.12646484375,0.2177734375],"mem":{"total":67559706624,"free":47786180608},"uptime":57992},"proc":{"uptime":29.188,"mem":{"rss":347451392,"heapTotal":303140864,"heapUsed":263746056,"external":6168207},"delay":0.3634580001235008},"load":{"requests":{"5601":{"total":0,"disconnects":0,"statusCodes":{}}},"concurrents":{"5601":6},"responseTimes":{"5601":{"avg":null,"max":0}},"sockets":{"http":{"total":0},"https":{"total":0}}},"message":"memory: 251.5MB uptime: 0:00:29 load: [0.17 0.13 0.22] delay: 0.363"}
{"type":"log","@timestamp":"2018-01-12T15:53:12Z","tags":["warning","elasticsearch","admin"],"pid":2677,"message":"Unable to revive connection: https://10.20.20.20:9200/"}
{"type":"log","@timestamp":"2018-01-12T15:53:12Z","tags":["warning","elasticsearch","admin"],"pid":2677,"message":"No living connections"}
{"type":"log","@timestamp":"2018-01-12T15:53:14Z","tags":["warning","elasticsearch","admin"],"pid":2677,"message":"Unable to revive connection: https://10.20.20.20:9200/"}
{"type":"log","@timestamp":"2018-01-12T15:53:14Z","tags":["warning","elasticsearch","admin"],"pid":2677,"message":"No living connections"}

The Elasticsearch logs (/var/log/elasticsearch/MyOrganizationES.log) show no events that correspond to these requests either, unfortunately.

Quick update: the curl issue was my own doing. I thought I had already generated our passwords, but I now recall that when we moved to 6.1.1, I started from scratch and built a fresh cluster on newly provisioned VMs, and I somehow skipped password generation. That was a brain fart on my part. I ran bin/x-pack/setup-passwords interactive and am now able to curl the cluster by passing -u elastic:<pass> without any issue.
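For anyone else who lands here, the invocation was (paths as in the cert-generation steps above):

cd /usr/share/elasticsearch
bin/x-pack/setup-passwords interactive

# Then verify with the newly set password:
curl https://ES-MASTER-01:9200/_cluster/health?pretty -u elastic:<pass>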

I am still troubleshooting the Kibana issue but have not made any headway.

I was able to get Kibana to load with SSL enabled after I copied the certificates to /etc/kibana/ and gave the kibana user explicit ownership of them (chown). Unfortunately, I am still receiving the looping "No living connections" error.
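Roughly what I ran, with file names from my earlier layout:

# Copy the node cert/key and the CA into Kibana's config directory.
cp /etc/elasticsearch/x-pack/ES-MASTER-01.crt /etc/elasticsearch/x-pack/ES-MASTER-01.key /etc/elasticsearch/x-pack/ca.crt /etc/kibana/
# Make them readable by the kibana service user.
chown kibana:kibana /etc/kibana/ES-MASTER-01.crt /etc/kibana/ES-MASTER-01.key /etc/kibana/ca.crt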

"type":"log","@timestamp":"2018-01-15T17:59:43Z","tags":["license","warning","xpack"],"pid":2669,"message":"License information from the X-Pack plugin could not be obtained from Elasticsearch for the [data] cluster. Error: No Living connections"}
{"type":"ops","@timestamp":"2018-01-15T17:59:44Z","tags":[],"pid":2669,"os":{"load":[0.017578125,0.11376953125,0.20849609375],"mem":{"total":67559706624,"free":46817746944},"uptime":1630},"proc":{"uptime":216.778,"mem":{"rss":164679680,"heapTotal":125018112,"heapUsed":107124456,"external":1686457},"delay":0.18711599987000227},"load":{"requests":{},"concurrents":{"5601":0},"responseTimes":{},"sockets":{"http":{"total":0},"https":{"total":0}}},"message":"memory: 102.2MB uptime: 0:03:37 load: [0.02 0.11 0.21] delay: 0.187"}
{"type":"log","@timestamp":"2018-01-15T17:59:46Z","tags":["warning","elasticsearch","admin"],"pid":2669,"message":"Unable to revive connection: https://10.20.20.20:9200/"}
{"type":"log","@timestamp":"2018-01-15T17:59:46Z","tags":["warning","elasticsearch","admin"],"pid":2669,"message":"No living connections"}

If I curl the cluster using the kibana user, I am able to authenticate properly:

$ curl https://10.20.20.20:9200/_cluster/health?pretty -u kibana:<pass>

{
  "cluster_name" : "MyOrganizationES",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 9,
  "number_of_data_nodes" : 8,
  "active_primary_shards" : 2348,
  "active_shards" : 2392,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 100.0
}

The updated kibana.yml settings are as follows:

server.host: "ES-MASTER-01"
server.name: "MyOrganizationKibana"

elasticsearch.url: "https://10.20.20.20:9200"
elasticsearch.username: "kibana"
elasticsearch.password: "<pass>"

server.ssl.enabled: true
server.ssl.certificate: /etc/kibana/ES-MASTER-01.crt
server.ssl.key: /etc/kibana/ES-MASTER-01.key
server.ssl.keyPassphrase: <pass>
elasticsearch.ssl.certificateAuthorities: [ "/etc/kibana/ES-MASTER-01.crt" ]

logging.dest: /var/log/kibana/kibana.log
logging.verbose: true

xpack.security.enabled: true
xpack.security.encryptionKey: "<pass>"
xpack.security.sessionTimeout: 600000

Any assistance would be greatly appreciated. I'm at a loss here.

In case anyone ever comes across this on Google: I did find a solution, and it was a mistake on my part. For elasticsearch.ssl.certificateAuthorities in kibana.yml, I was mistakenly referencing the node's certificate instead of the actual CA cert file.

# Correct:
elasticsearch.ssl.certificateAuthorities: [ "/etc/kibana/ca.crt" ]
# Incorrect:
elasticsearch.ssl.certificateAuthorities: [ "/etc/kibana/ES-MASTER-01.crt" ]
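A quick sanity check that would have caught this, assuming both files are in place:

# The node cert verifies against the real CA...
openssl verify -CAfile /etc/kibana/ca.crt /etc/kibana/ES-MASTER-01.crt
# ...but not against itself, since it is not self-signed.
openssl verify -CAfile /etc/kibana/ES-MASTER-01.crt /etc/kibana/ES-MASTER-01.crt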

I'm kicking myself over that one, but both Elasticsearch and Kibana are working successfully now with SSL.
