Dear all,
Today I'm once again dealing with certificates!
Each time I touch them, there is always an issue!
I started by replacing one certificate on an ingest node (ingest-prod01.domain.com). When I restart the elasticsearch service, the node no longer joins the cluster, and the ES logs contain tons of very long warning messages.
Here are the ones from the ingest node:
[2023-10-25T18:01:56,924][WARN ][o.e.x.c.s.t.n.SecurityNetty4Transport] [ingest-prod01.domain.com] client did not trust this server's certificate, closing connection Netty4TcpChannel{localAddress=/10.32.14.108:9300, remoteAddress=/10.32.14.106:48256, profile=default}
[2023-10-25T18:01:56,933][INFO ][o.e.c.c.JoinHelper ] [ingest-prod01.domain.com] failed to join {elastic-master-prod03.domain.com}{sZLZdRYkRCy-5iQyXQUlRw}{_B1WgHkZR9GkFkZj8-3l1Q}{elastic-master-prod03.domain.com}{10.32.14.106}{10.32.14.106:9300}{m}{xpack.installed=true} with JoinRequest{sourceNode={ingest-prod01.domain.com}{tzX0ezvGTkyRFOJfa6KnWQ}{5hV3gce9TvahNpf5TgsJMQ}{ingest-prod01.domain.com}{10.32.14.108}{10.32.14.108:9300}{ir}{xpack.installed=true}, minimumTerm=77, optionalJoin=Optional[Join{term=77, lastAcceptedTerm=0, lastAcceptedVersion=0, sourceNode={ingest-prod01.domain.com}{tzX0ezvGTkyRFOJfa6KnWQ}{5hV3gce9TvahNpf5TgsJMQ}{ingest-prod01.domain.com}{10.32.14.108}{10.32.14.108:9300}{ir}{xpack.installed=true}, targetNode={elastic-master-prod03.domain.com}{sZLZdRYkRCy-5iQyXQUlRw}{_B1WgHkZR9GkFkZj8-3l1Q}{elastic-master-prod03.domain.com}{10.32.14.106}{10.32.14.106:9300}{m}{xpack.installed=true}}]}
The IP 10.32.14.108 is my ingest node, whose certificate I have just updated.
The IP 10.32.14.106 is the current active master of my cluster.
On the active master server, I get these messages:
[2023-10-25T18:31:12,695][WARN ][o.e.c.s.DiagnosticTrustManager] [elastic-master-prod03.domain.com] failed to establish trust with server at [10.32.14.108]; the server provided a certificate with subject name [CN=ingest-prod01.domain.com,OU=STI,O=Company,L=City,ST=State,C=CA], fingerprint [4eda725bd07a95889f125b26ba47c8f359c79e5a], keyUsage [digitalSignature, keyEncipherment, dataEncipherment] and extendedKeyUsage [clientAuth, serverAuth]; the certificate is valid between [2023-10-24T18:05:19Z] and [2025-10-23T18:05:19Z] (current time is [2023-10-25T22:31:12.695239702Z], certificate dates are valid); the session uses cipher suite [TLS_AES_256_GCM_SHA384] and protocol [TLSv1.3]; the certificate's subject alternative names cannot be parsed; the certificate is issued by [CN=PKIS01-CA,DC=domain,DC=ca] but the server did not provide a copy of the issuing certificate in the certificate chain; the issuing certificate with fingerprint [6007900a5e078378e4b2443b7a4d11c35d690e8f] is trusted in this ssl context ([xpack.security.transport.ssl (with trust configuration: PEM-trust{/etc/elasticsearch/certs/ROOT-CA-Base64.crt,/etc/elasticsearch/certs/SUB-CA-01-Base64.crt,/etc/elasticsearch/certs/SUB-CA-01-2022-Base64.crt})])
[2023-10-25T18:31:12,697][WARN ][o.e.t.TcpTransport ] [elastic-master-prod03.domain.com] exception caught on transport layer [Netty4TcpChannel{localAddress=/10.32.14.106:52394, remoteAddress=10.32.14.108/10.32.14.108:9300, profile=default}], closing connection
Note: I changed some information in those messages for security reasons.
Once again, the ingest node tries to join the master node and fails.
From these logs, two things stand out:
- "the certificate's subject alternative names cannot be parsed"
- "the certificate is issued by [CN=PKIS01-CA,DC=domain,DC=ca] but the server did not provide a copy of the issuing certificate in the certificate chain"
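To check the second point, here is a rough sketch of my own (not something from the Elasticsearch docs) that counts how many certificates a PEM file contains. It assumes the node certificate is stored as a PEM file, and the path is passed on the command line because I am not naming the real file here. A count of 1 would only tell me that the file holds the node certificate with no issuing certificate bundled after it; it does not prove that this is the cause.

```python
import re
import sys

# One PEM certificate block, from its BEGIN line to its END line.
PEM_CERT_RE = re.compile(
    r"-----BEGIN CERTIFICATE-----.+?-----END CERTIFICATE-----",
    re.DOTALL,
)


def count_pem_certificates(pem_text):
    """Return the number of certificate blocks found in PEM text."""
    return len(PEM_CERT_RE.findall(pem_text))


def count_in_file(path):
    """Read a PEM file (path supplied by the caller) and count its certificates."""
    with open(path, encoding="ascii", errors="replace") as handle:
        return count_pem_certificates(handle.read())


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (hypothetical file name): python count_certs.py node-cert.pem
    print(count_in_file(sys.argv[1]))
```

Private keys and other PEM blocks are ignored, since only CERTIFICATE blocks are matched.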
When I run openssl on the new certificate file, I can easily see the alternative names:
X509v3 Subject Alternative Name:
DNS:ingest-prod01.domain.com, DNS:localhost, IP Address:127.0.0.1, IP Address:10.32.14.108
The old certificate, which hasn't expired yet, has the same values there. If I put that certificate back, the ingest node starts and joins the ELK cluster as expected.
I'm also mixed up: who is not trusting whom here? The ingest node or the master node?
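One way I tried to untangle the direction is to read the Netty4TcpChannel details in the two warnings. This is only a sketch under one assumption: the side using port 9300 (the transport port in these logs) is the listener that presents its certificate, and the side with the high port is the caller. The function name is my own and it only handles IPv4 addresses.

```python
import re

# Assumption: every node listens for transport traffic on port 9300,
# as in the log lines; the other port is the caller's ephemeral port.
TRANSPORT_PORT = 9300

# Matches "/10.32.14.108:9300" as well as "10.32.14.108/10.32.14.108:9300".
CHANNEL_RE = re.compile(
    r"localAddress=\S*?/([\d.]+):(\d+), remoteAddress=\S*?/([\d.]+):(\d+)"
)


def connection_roles(log_line):
    """Return {'client': ip, 'server': ip}, or None if it cannot be decided."""
    match = CHANNEL_RE.search(log_line)
    if match is None:
        return None
    local_ip, local_port, remote_ip, remote_port = match.groups()
    local_is_listener = int(local_port) == TRANSPORT_PORT
    remote_is_listener = int(remote_port) == TRANSPORT_PORT
    if local_is_listener == remote_is_listener:
        return None  # both or neither on the transport port: ambiguous
    if local_is_listener:
        return {"client": remote_ip, "server": local_ip}
    return {"client": local_ip, "server": remote_ip}


# Channel details copied from the ingest node warning and the master warning.
INGEST_WARNING = (
    "Netty4TcpChannel{localAddress=/10.32.14.108:9300, "
    "remoteAddress=/10.32.14.106:48256, profile=default}"
)
MASTER_WARNING = (
    "Netty4TcpChannel{localAddress=/10.32.14.106:52394, "
    "remoteAddress=10.32.14.108/10.32.14.108:9300, profile=default}"
)

if __name__ == "__main__":
    for name, line in (("ingest", INGEST_WARNING), ("master", MASTER_WARNING)):
        print(name, connection_roles(line))
```

If I read it right, in both warnings 10.32.14.106 (the master) is the side opening the connection and 10.32.14.108 (the ingest node) is the side presenting its certificate, but that rests on the port assumption above.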
I'm sorry to say it, but certificates will always be a nightmare for me!
Does anyone have an idea what may be wrong?
Regards,
Yanick