Updating xpack certificates to a new CA in an active Elasticsearch cluster without downtime

Hi,

We have an active Elasticsearch cluster (large) with TLS/SSL enabled for inter-node communication (xpack.security.transport.ssl). We are planning to replace the existing node certificates with new ones issued by a different CA.

We want to make sure this process is safe and minimally disruptive. Our questions are:

  1. Can nodes with certificates from a new issuer join and operate in the cluster alongside nodes with the old certificates during the rotation?

  2. What is the recommended procedure for rolling out new certificates in a production cluster without causing downtime or cluster instability?

  3. Are there common pitfalls or best practices for this type of certificate rotation?

Any guidance, references, or examples from people who have performed a similar upgrade in production would be greatly appreciated.

Thanks in advance!

Interesting question. I always suggest just doing it first on a test cluster with the same versions, using VMs say, to establish and validate the upgrade process.

It’d help to know the Elasticsearch version, and to see the full elasticsearch.yml - some oddities that might be significant could be hiding in plain sight there.

But in general this should be doable without downtime; it will require rolling restarts (likely 2x). The key is to establish trust for the new CA (in addition to the existing CA) everywhere first, and only then rotate to the new certificates. At least that's what I'd do. If you have done version upgrades before, it should be similar in terms of "disruption".

I would recommend that you check this documentation: Update TLS certificates | Elastic Docs

Changing the CA is a little more complicated, I would also recommend that you allocate a maintenance window where you may have some downtime.

And as mentioned, execute this process on a test cluster to validate all steps.

Any change carries risk, but it should be possible to do this without downtime. It will take several steps tho:

  1. Add the new CA to the transport TLS trust store on every node (so both CAs are trusted)
  2. Replace the transport certificate on every node
  3. Remove the old CA from the transport TLS trust store on every node.

You need to have an intermediate state when nodes trust certificates issued by both CAs so that you can update each node’s certificates one by one.
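For illustration (not from this thread), a rough sketch of how the combined trust store for step 1 might be built with keytool; file names, aliases, and the password are placeholders:

```
# Import the OLD CA chain into a new PKCS#12 truststore (created on first import)...
keytool -importcert -trustcacerts -noprompt -alias old-root-ca \
  -file old-root-ca.pem -keystore transport-truststore.p12 \
  -storetype PKCS12 -storepass "$TRUSTSTORE_PASS"
keytool -importcert -trustcacerts -noprompt -alias old-intermediate-ca \
  -file old-intermediate-ca.pem -keystore transport-truststore.p12 \
  -storetype PKCS12 -storepass "$TRUSTSTORE_PASS"

# ...then the NEW CA chain, so certificates from either issuer are trusted
# during the intermediate state of the rotation.
keytool -importcert -trustcacerts -noprompt -alias new-root-ca \
  -file new-root-ca.pem -keystore transport-truststore.p12 \
  -storetype PKCS12 -storepass "$TRUSTSTORE_PASS"
keytool -importcert -trustcacerts -noprompt -alias new-intermediate-ca \
  -file new-intermediate-ca.pem -keystore transport-truststore.p12 \
  -storetype PKCS12 -storepass "$TRUSTSTORE_PASS"
```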

If you don’t do it right, it will fail fairly noisily on at least one node, so keep a close eye on every node’s logs. Such a failure should only affect one node and, if so, you should be able to revert the last step and fix whatever went wrong.

Thank you @RainTown, @leandrojmp and @DavidTurner for looking into this and sharing the suggestions.

We will try the suggestion shared by @DavidTurner and share whether it works without any downtime.

With our current config we validated this scenario in our test environment. We observed that when Elasticsearch nodes are updated with new certificates issued by a different CA, they fail to join the existing cluster that is still operating on the older certificates. This causes the updated nodes to operate independently instead of forming a unified cluster.

For reference, our current Elasticsearch configuration related to xpack security is:

xpack.security.http.ssl.enabled: true
xpack.security.http.ssl.keystore.path: <path_to_http_cert.pfx>
xpack.security.http.ssl.verification_mode: certificate

xpack.security.transport.ssl.enabled: true
xpack.security.transport.ssl.keystore.path: <path_to_transport_cert.pfx>
xpack.security.transport.ssl.verification_mode: certificate

Without seeing the exact error it’s impossible to say for sure, but I expect this is because you didn’t reconfigure the nodes to trust both CAs first.

@DavidTurner I'm following the truststore approach from this thread to enable safe certificate rotation. However, I've encountered an issue during testing that I need clarification on.

What I Did

Test deployment on 1 node (out of an 18-node cluster):

  1. Created truststore containing both old and new CA certificates (complete chains: intermediate + root for both CAs)

  2. Updated config:
    xpack.security.transport.ssl.truststore.path: certs/transport-truststore.p12

  3. Stored the truststore password in the Elasticsearch keystore (full setting name: xpack.security.transport.ssl.truststore.secure_password; the command is sketched after this list)

  4. Restarted the node (Elasticsearch is up and running)
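For reference, a hedged sketch of steps 2–4 on a package-installed node (the paths below are the defaults for a package install and may differ in your environment):

```
# elasticsearch.yml is a static settings file, so the new truststore path needs a restart.
echo 'xpack.security.transport.ssl.truststore.path: certs/transport-truststore.p12' \
  | sudo tee -a /etc/elasticsearch/elasticsearch.yml

# Store the truststore password under the matching secure setting (prompts for the value).
sudo /usr/share/elasticsearch/bin/elasticsearch-keystore add \
  xpack.security.transport.ssl.truststore.secure_password

# Restart the node.
sudo systemctl restart elasticsearch
```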

Problem

Node cannot rejoin cluster. The other 17 nodes (without truststore) reject its SSL handshake:

NodeDisconnectedException: failure when opening connection back
TransportException: handshake failed because connection reset

Rolling deployment creates the same problem we're trying to solve:

  • Node 1: Add truststore → Can't join cluster (other nodes reject it)

  • Node 2: Add truststore → Can't join cluster

  • Result: Cluster fragmentation, same as original issue

Question

The suggested approach says "Add the new CA to the transport TLS trust store on every node" - but during a rolling deployment of this change:

  • Can nodes with a truststore communicate with nodes without one?

  • Or do all nodes need the truststore deployed nearly simultaneously to avoid a cluster split?

My concern: If rolling deployment causes nodes to leave the cluster during the transition (as we observed), this creates the same downtime issue we're trying to prevent.

Is there a configuration step I'm missing, or does this approach inherently require a very rapid deployment window across all nodes?

If you're using transport TLS then you must have a trust store on all nodes already?

I would expect there to be a good deal more information in the logs than this.

Thank you @DavidTurner! You're correct - let me clarify:

Current State (All Nodes):

Implicit trust via the keystore-embedded CA chain:

xpack.security.transport.ssl.enabled: true
xpack.security.transport.ssl.keystore.path: certs/XpackCert.pfx
xpack.security.transport.ssl.verification_mode: certificate
# No explicit truststore.path

The keystore contains the full G2 certificate chain (node cert + intermediate CA + root CA), which ES uses for both identity and trust validation.
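(One hedged way to list what the .pfx actually bundles; the password variable is a placeholder:)

```
# Print the subject/issuer of every certificate packed into the keystore.
openssl pkcs12 -in certs/XpackCert.pfx -nokeys -passin pass:"$PFX_PASS" \
  | grep -E '^(subject|issuer)='
```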

What We're Adding:

Explicit truststore with both old (G1) and new (G2) CAs:

xpack.security.transport.ssl.truststore.path: certs/transport-truststore.p12

# Password in ES keystore: truststore.secure_password

Truststore contains G1 + G2 intermediate and root CAs (4 total).
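(A hedged way to confirm the truststore content; the password is a placeholder:)

```
# Should show four trustedCertEntry rows: G1 root, G1 intermediate, G2 root, G2 intermediate.
keytool -list -keystore certs/transport-truststore.p12 -storetype PKCS12 \
  -storepass "$TRUSTSTORE_PASS"
```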

Error Logs:

Here's the complete SSL handshake failure from the node with explicit truststore:

```
[2026-04-28T08:13:38,756][WARN ][o.e.c.c.JoinHelper       ] [QUERY00000C] last failed join attempt was 815ms ago, failed to join {MASTER000010}{22jbFJRARh-WHsIYlAkgZw}{fuJVWqdFRL2AE4G4cZ08Ng}{MASTER000010}{172.23.0.4}{172.23.0.4:9300}{m}{8.14.1} with JoinRequest{sourceNode={QUERY00000C}{VY9NBuh-R0KB4IxlB8L9AA}{TiR5KssJQdCHwshcHAEF9A}{QUERY00000C}{172.23.0.12}{172.23.0.12:9300}{8.14.1}}

org.elasticsearch.transport.RemoteTransportException: [MASTER000010][172.23.0.4:9300][internal:cluster/coordination/join]
Caused by: org.elasticsearch.transport.NodeDisconnectedException: [QUERY00000C][172.23.0.12:9300][internal:cluster/coordination/join] failure when opening connection back from [{MASTER000010}] to [{QUERY00000C}]
Caused by: org.elasticsearch.transport.ConnectTransportException: [QUERY00000C][172.23.0.12:9300] general node connection failure
    at org.elasticsearch.transport.TcpTransport$ChannelsConnectedListener.lambda$onResponse$2(TcpTransport.java:1132)
    [... full stack trace continues ...]
    at org.elasticsearch.xpack.core.security.transport.netty4.SecurityNetty4Transport$ClientSslHandlerInitializer.lambda$connect$0(SecurityNetty4Transport.java:350)
Caused by: org.elasticsearch.transport.TransportException: handshake failed because connection reset
    at org.elasticsearch.transport.TransportHandshaker.lambda$sendHandshake$0(TransportHandshaker.java:154)
```

The SSL handshake completes on the outbound connection (QUERY00000C → MASTER000010) but fails when MASTER000010 tries to open the connection back to QUERY00000C.

If you need more logs, let me know.

Question:

Is the change from implicit trust (keystore embedded CAs) to explicit trust (separate truststore file) incompatible with nodes still using implicit trust?

Or is there additional configuration needed to make this transition work in a rolling fashion?

It's unclear from your message but what I think you're saying is that MASTER000010 is configured like this:

xpack.security.transport.ssl.enabled: true
xpack.security.transport.ssl.keystore.path: certs/XpackCert.pfx
xpack.security.transport.ssl.verification_mode: certificate

and QUERY00000C is configured like this:

xpack.security.transport.ssl.enabled: true
xpack.security.transport.ssl.keystore.path: certs/XpackCert.pfx
xpack.security.transport.ssl.verification_mode: certificate
xpack.security.transport.ssl.truststore.path: certs/transport-truststore.p12

Then the message means that QUERY00000C can connect to MASTER000010 but MASTER000010 cannot connect to QUERY00000C. Is that right?

If so, I think I would expect to see log messages on MASTER000010 indicating more details about why.

@DavidTurner Yes, exactly right on the configuration. Here are the detailed logs from both nodes:

MASTER000010 configuration (implicit trust):

xpack.security.transport.ssl.enabled: true
xpack.security.transport.ssl.keystore.path: certs/XpackCert.pfx
xpack.security.transport.ssl.verification_mode: certificate

QUERY00000C configuration (explicit trust):

xpack.security.transport.ssl.enabled: true
xpack.security.transport.ssl.keystore.path: certs/XpackCert.pfx
xpack.security.transport.ssl.verification_mode: certificate
xpack.security.transport.ssl.truststore.path: certs/transport-truststore.p12

From MASTER000010 logs:

[2026-04-28T11:52:15,977][WARN ][o.e.x.c.s.t.n.SecurityNetty4Transport] [MASTER000010] client did not trust this server's certificate, closing connection Netty4TcpChannel{localAddress=/172.23.0.4:52740, remoteAddress=172.23.0.12/172.23.0.12:9300, profile=default}
[2026-04-28T11:52:15,983][WARN ][o.e.c.c.Coordinator      ] [MASTER000010] received join request from [{QUERY00000C}] but could not connect back to the joining node
org.elasticsearch.transport.ConnectTransportException: [QUERY00000C][172.23.0.12:9300] general node connection failure

From QUERY00000C logs:

[2026-04-28T11:50:33,042][INFO ][o.e.c.c.JoinHelper       ] [QUERY00000C] failed to join {MASTER000010} with JoinRequest{sourceNode={QUERY00000C}...}
org.elasticsearch.transport.RemoteTransportException: [MASTER000010][172.23.0.4:9300][internal:cluster/coordination/join]
Caused by: org.elasticsearch.transport.NodeDisconnectedException: [QUERY00000C][172.23.0.12:9300] failure when opening connection back from [{MASTER000010}] to [{QUERY00000C}]
Caused by: org.elasticsearch.transport.ConnectTransportException: [QUERY00000C][172.23.0.12:9300] general node connection failure

Analysis:

The error "client did not trust this server's certificate" from MASTER000010 indicates that QUERY00000C is rejecting MASTER000010's certificate when MASTER000010 tries to connect back.

The truststore on QUERY00000C contains both certificate authority chains (4 certificates total: 2 intermediate CAs + 2 root CAs covering both old and new issuers). However, it appears QUERY00000C cannot validate MASTER000010's certificate.

Question: Is there a known incompatibility when mixing implicit trust (keystore-embedded CAs) and explicit trust (separate truststore file) within the same cluster during a rolling deployment? Or should both configurations work together seamlessly?

_ssl/certificates response from the query node:

[
  {
    "path" : "certs/XpackCert.pfx",
    "format" : "PKCS12",
    "alias" : "node-cert-alias",
    "subject_dn" : "CN=Intermediate CA G2, O=CA Provider, C=US",
    "serial_number" : "1a2b3c4d5e6f7890abcdef1234567890",
    "has_private_key" : false,
    "expiry" : "2029-06-03T20:03:02.000Z",
    "issuer" : "CN=Root CA G2, O=CA Provider, C=US"
  },
  {
    "path" : "certs/XpackCert.pfx",
    "format" : "PKCS12",
    "alias" : "node-cert-alias",
    "subject_dn" : "CN=DigiCert Global Root G2, OU=www.digicert.com, O=DigiCert Inc, C=US",
    "serial_number" : "33af1e6a711a9a0bb2864b11d09fae5",
    "has_private_key" : false,
    "expiry" : "2038-01-15T12:00:00.000Z",
    "issuer" : "CN=DigiCert Global Root G2, OU=www.digicert.com, O=DigiCert Inc, C=US"
  },
  {
    "path" : "certs/XpackCert.pfx",
    "format" : "PKCS12",
    "alias" : "node-cert-alias",
    "subject_dn" : "CN=node.example.com",
    "serial_number" : "9f8e7d6c5b4a3210fedcba0987654321",
    "has_private_key" : true,
    "expiry" : "2026-07-19T13:35:57.000Z",
    "issuer" : "CN=Intermediate CA G2, O=CA Provider, C=US"
  },
  {
    "path" : "certs/XpackCert.pfx",
    "format" : "PKCS12",
    "alias" : "node-cert-alias",
    "subject_dn" : "CN=Root CA G2, O=CA Provider, C=US",
    "serial_number" : "2b3c4d5e6f7a8b9c0d1e2f3a4b5c6d7e",
    "has_private_key" : false,
    "expiry" : "2029-06-19T23:59:59.000Z",
    "issuer" : "CN=DigiCert Global Root G2, OU=www.digicert.com, O=DigiCert Inc, C=US"
  },
  {
    "path" : "certs/transport-truststore.p12",
    "format" : "PKCS12",
    "alias" : "g1-intermediate-ca",
    "subject_dn" : "CN=Intermediate CA G1, O=CA Provider, C=US",
    "serial_number" : "3c4d5e6f7a8b9c0d1e2f3a4b5c6d7e8f",
    "has_private_key" : false,
    "expiry" : "2028-05-25T23:49:33.000Z",
    "issuer" : "CN=Root CA G1, O=CA Provider, C=US"
  },
  {
    "path" : "certs/transport-truststore.p12",
    "format" : "PKCS12",
    "alias" : "g1-root-ca",
    "subject_dn" : "CN=Root CA G1, O=CA Provider, C=US",
    "serial_number" : "4d5e6f7a8b9c0d1e2f3a4b5c6d7e8f90",
    "has_private_key" : false,
    "expiry" : "2042-07-18T23:00:23.000Z",
    "issuer" : "CN=Root CA G1, O=CA Provider, C=US"
  },
  {
    "path" : "certs/transport-truststore.p12",
    "format" : "PKCS12",
    "alias" : "g2-intermediate-ca",
    "subject_dn" : "CN=Intermediate CA G2, O=CA Provider, C=US",
    "serial_number" : "5e6f7a8b9c0d1e2f3a4b5c6d7e8f9012",
    "has_private_key" : false,
    "expiry" : "2029-06-03T20:03:02.000Z",
    "issuer" : "CN=Root CA G2, O=CA Provider, C=US"
  },
  {
    "path" : "certs/transport-truststore.p12",
    "format" : "PKCS12",
    "alias" : "g2-root-ca",
    "subject_dn" : "CN=Root CA G2, O=CA Provider, C=US",
    "serial_number" : "6f7a8b9c0d1e2f3a4b5c6d7e8f901234",
    "has_private_key" : false,
    "expiry" : "2029-06-19T23:59:59.000Z",
    "issuer" : "CN=DigiCert Global Root G2, OU=www.digicert.com, O=DigiCert Inc, C=US"
  }
]

Master node response:

[
  {
    "path" : "certs/XpackCert.pfx",
    "format" : "PKCS12",
    "alias" : "node-cert-alias",
    "subject_dn" : "CN=Intermediate CA G2, O=CA Provider, C=US",
    "serial_number" : "1a2b3c4d5e6f7890abcdef1234567890",
    "has_private_key" : false,
    "expiry" : "2029-06-03T20:03:02.000Z",
    "issuer" : "CN=Root CA G2, O=CA Provider, C=US"
  },
  {
    "path" : "certs/XpackCert.pfx",
    "format" : "PKCS12",
    "alias" : "node-cert-alias",
    "subject_dn" : "CN=DigiCert Global Root G2, OU=www.digicert.com, O=DigiCert Inc, C=US",
    "serial_number" : "33af1e6a711a9a0bb2864b11d09fae5",
    "has_private_key" : false,
    "expiry" : "2038-01-15T12:00:00.000Z",
    "issuer" : "CN=DigiCert Global Root G2, OU=www.digicert.com, O=DigiCert Inc, C=US"
  },
  {
    "path" : "certs/XpackCert.pfx",
    "format" : "PKCS12",
    "alias" : "node-cert-alias",
    "subject_dn" : "CN=master.example.com",
    "serial_number" : "abc123def456789fedcba9876543210",
    "has_private_key" : true,
    "expiry" : "2026-07-19T13:35:57.000Z",
    "issuer" : "CN=Intermediate CA G2, O=CA Provider, C=US"
  },
  {
    "path" : "certs/XpackCert.pfx",
    "format" : "PKCS12",
    "alias" : "node-cert-alias",
    "subject_dn" : "CN=Root CA G2, O=CA Provider, C=US",
    "serial_number" : "def789abc123456fedcba1234567890",
    "has_private_key" : false,
    "expiry" : "2029-06-19T23:59:59.000Z",
    "issuer" : "CN=DigiCert Global Root G2, OU=www.digicert.com, O=DigiCert Inc, C=US"
  }
]
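(For anyone reproducing this, the listings above come from the certificates API; a hedged example of pulling them from a node, assuming HTTPS on port 9200 and basic auth:)

```
# -k skips HTTP-layer certificate verification, just to keep the example short.
curl -k -u elastic "https://localhost:9200/_ssl/certificates?pretty"
```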

Hi @DavidTurner

I would like to clarify a critical operational constraint that may not be obvious:

Cluster Size: Our production cluster has 1000+ nodes (testing is performed on an 18-node subset).

The Restart Problem:

Adding explicit truststore configuration requires:

  1. Updating elasticsearch.yml on each node

  2. Restarting the Elasticsearch service (config changes require restart)

  3. Node leaves cluster during restart

  4. Node rejoins cluster after restart

Why This Creates Downtime:

If we deploy truststore in a rolling fashion:

  • Nodes without truststore (implicit trust): 990 nodes online

  • Nodes with truststore (explicit trust): 10 nodes restarting/rejoining

When the 10 restarted nodes try to rejoin, they fail SSL handshake with the 990 nodes (as we've observed in testing). They remain isolated until ALL 1000+ nodes have truststore deployed.

Rolling deployment: Requires nodes with different trust configurations to coexist → SSL handshake fails → nodes can't communicate

Big deployment: Restart all 1000+ nodes simultaneously → guaranteed cluster-wide outage

This is why we're seeking confirmation: Is there ANY way to deploy truststore without requiring all nodes to restart simultaneously?

The original motivation was zero-downtime certificate rotation, but if truststore deployment itself requires downtime, it defeats the purpose.

I don't know of any such incompatibility. I think there's something wrong with your config preventing this from working, although I'm unclear what exactly or how to troubleshoot it further without access to all your cryptographic materials. I'm particularly suspicious about this concept of "implicit trust" you seem to have been using, as this isn't something I'd have expected to work in the first place.

Perhaps the first thing you should do is forget about the new CA and just get all the nodes running using a specified truststore.
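Not part of David's reply, but one hedged way to do exactly that: build the truststore from the CAs already packed into the existing keystore, so the trust material cannot drift (paths, aliases, and passwords are placeholders):

```
# Dump the CA certificates (no private key) currently bundled in the keystore.
openssl pkcs12 -in certs/XpackCert.pfx -nokeys -cacerts \
  -passin pass:"$PFX_PASS" -out current-cas.pem

# Split current-cas.pem into one clean PEM file per CA (e.g. intermediate.pem, root.pem),
# then import each into the new truststore under its own alias.
keytool -importcert -trustcacerts -noprompt -alias current-intermediate \
  -file intermediate.pem -keystore certs/transport-truststore.p12 \
  -storetype PKCS12 -storepass "$TRUSTSTORE_PASS"
keytool -importcert -trustcacerts -noprompt -alias current-root \
  -file root.pem -keystore certs/transport-truststore.p12 \
  -storetype PKCS12 -storepass "$TRUSTSTORE_PASS"
```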

Yes, I understand this. I don't understand why you started out by restarting multiple nodes at once; you can test the process with a single node and back out any changes if you find problems.

Hi @DavidTurner ,

Thanks for getting back to me. Let me clarify a couple of points to clear up the confusion regarding the setup and how the test was performed.

1. Clarifying "Implicit Trust"

When I used the term "implicit trust," I was referring to Elasticsearch's default fallback behavior. When xpack.security.transport.ssl.truststore.path is not explicitly defined in elasticsearch.yml, the node automatically relies on the CAs contained within the xpack.security.transport.ssl.keystore.path file to act as the truststore. So, the rest of the nodes in the cluster are just running on this standard keystore fallback configuration.

2. Testing on a Single Node

I entirely agree with testing on a single node, and that is exactly what we are doing on our test cluster. The logs and behavior I provided previously are the result of configuring a separate truststore file on exactly one node (QUERY00000C / 172.23.0.12).

The Current State of the Single-Node Test: For this single query node (QUERY00000C), I added the new transport-truststore.p12 file configuration. To ensure backward compatibility with the rest of the cluster, I added the current issuer (the CA currently trusted and used by the rest of the nodes running on the keystore fallback) into this new, separate truststore.

The rest of the cluster was left entirely untouched.

Despite having the current issuer in its new truststore file, QUERY00000C is still rejecting the certificate presented by MASTER000010 during the SSL handshake.

Given that the query node has the current CA in its dedicated truststore, shouldn't it be able to validate the master node's certificate? Are there any known problems or stricter validation checks when a node using a separate truststore file handshakes with a node relying purely on the keystore?

No, nothing known. The only thing I can think of is that the truststore you've created doesn't match up with the existing CA or the node's certificates somehow.

@DavidTurner - We've completed thorough validation and identified the root cause.

Important Context:

We are NOT attempting certificate migration; all nodes currently use certificates from the same issuer. We're simply testing whether we can move from keystore-only to an explicit truststore with the current certificates.

Test Configuration:

  • MASTER node: keystore-only - current production configuration

  • QUERY node: keystore + truststore - testing trust with same root CAs

Truststore Verification:

We verified the truststore contains the exact same root CAs as the keystore by comparing SHA256 fingerprints between both files - they match perfectly (verified byte-for-byte).

MASTER node keystore:

- Enterprise Root CA G2: <SHA256 verified>

- Public Root CA (DigiCert Global Root G2): <SHA256 verified>

QUERY node truststore:

- enterprise-root-g2: <matches keystore> [Matches]

- digicert-root-g2: <matches keystore> [Matches]

The truststore content is verified correct.
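(For reference, one hedged way to run that kind of comparison; the store passwords are placeholders:)

```
# SHA-256 fingerprints of everything in the node keystore...
keytool -list -keystore certs/XpackCert.pfx -storetype PKCS12 \
  -storepass "$PFX_PASS" | grep 'SHA-256'

# ...and of the entries in the new truststore; the root CA hashes should match pairwise.
keytool -list -keystore certs/transport-truststore.p12 -storetype PKCS12 \
  -storepass "$TRUSTSTORE_PASS" | grep 'SHA-256'
```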

Root Cause - Extended Key Usage:

QUERY node logs:

[2026-05-04T12:13:55,225][WARN ][o.e.c.s.DiagnosticTrustManager] [QUERY00000C] failed to establish trust with client at [<unknown host>]; the client provided a certificate with subject name [CN=node.example.com,O=Enterprise Corp,L=City,ST=State,C=US], fingerprint [<redacted>], keyUsage [digitalSignature, keyEncipherment] and extendedKeyUsage [serverAuth]; the certificate is valid between [2026-04-10T14:35:57Z] and [2026-07-19T13:35:57Z] (current time is [2026-05-04T12:13:55.225233900Z], certificate dates are valid); the session uses cipher suite [TLS_AES_256_GCM_SHA384] and protocol [TLSv1.3]; the certificate is issued by [CN=Enterprise TLS CA G2,O=CA Provider,C=US]; the certificate is signed by (subject [CN=Enterprise TLS CA G2,O=CA Provider,C=US] fingerprint [<redacted>]) signed by (subject [CN=Enterprise Root CA G2,O=CA Provider,C=US] fingerprint [<redacted>] {trusted issuer}) signed by (subject [CN=DigiCert Global Root G2,OU=www.digicert.com,O=DigiCert Inc,C=US] fingerprint [<redacted>] {trusted issuer}) which is self-issued; the [CN=DigiCert Global Root G2,OU=www.digicert.com,O=DigiCert Inc,C=US] certificate is trusted in this ssl context ([xpack.security.transport.ssl (with trust configuration: StoreTrustConfig{path=certs/transport-truststore.p12, password=<non-empty>, type=PKCS12, algorithm=PKIX})])
sun.security.validator.ValidatorException: Extended key usage does not permit use for TLS client authentication

MASTER node logs:

[2026-05-04T11:59:47,540][WARN ][o.e.x.c.s.t.n.SecurityNetty4Transport] [MASTER000010] client did not trust this server's certificate, closing connection Netty4TcpChannel{localAddress=/10.0.0.4:49558, remoteAddress=10.0.0.12/10.0.0.12:9300, profile=default}
[2026-05-04T11:59:47,543][WARN ][o.e.x.c.s.t.n.SecurityNetty4Transport] [MASTER000010] client did not trust this server's certificate, closing connection Netty4TcpChannel{localAddress=/10.0.0.4:49560, remoteAddress=10.0.0.12/10.0.0.12:9300, profile=default}
[2026-05-04T11:59:47,545][WARN ][o.e.c.c.Coordinator      ] [MASTER000010] received join request from [{QUERY00000C}{<node-id>}{<ephemeral-id>}{QUERY00000C}{10.0.0.12}{10.0.0.12:9300}{8.14.1}{<attributes>}] but could not connect back to the joining node
org.elasticsearch.transport.ConnectTransportException: [QUERY00000C][10.0.0.12:9300] general node connection failure
    at org.elasticsearch.transport.TcpTransport$ChannelsConnectedListener.lambda$onResponse$2(TcpTransport.java:1132) ~[elasticsearch-8.14.1.jar:?]
    at org.elasticsearch.action.ActionListenerImplementations.safeAcceptException(ActionListenerImplementations.java:62) ~[elasticsearch-8.14.1.jar:?]

The key message here is "client did not trust this server's certificate".

Analysis:

  • All node certificates have extendedKeyUsage: [serverAuth] but lack clientAuth (see the check sketched after this list)
  • Elasticsearch transport layer uses mutual TLS requiring both server and client authentication
  • With keystore-only configuration: these certificates work fine (lenient validation)
  • With truststore: strict Extended Key Usage validation fails in both directions
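For anyone hitting the same wall, a hedged way to check the EKU on the node's own certificate inside the PKCS#12 keystore (path and password are placeholders):

```
# Extract the leaf certificate (the one with the private key) and show its EKU.
openssl pkcs12 -in certs/XpackCert.pfx -clcerts -nokeys -passin pass:"$PFX_PASS" \
  | openssl x509 -noout -text \
  | grep -A1 'Extended Key Usage'
# A transport certificate used for mTLS needs both
# "TLS Web Server Authentication" and "TLS Web Client Authentication"
# (or no EKU extension at all).
```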

Question:
Does an explicit truststore configuration enforce stricter Extended Key Usage validation than keystore-only mode? If so, does this mean all node certificates must be reissued with both serverAuth and clientAuth before deploying an explicit truststore in production?

This creates a challenge: we need the truststore for zero-downtime certificate rotation, but deploying the truststore requires reissuing all certificates first (which itself would require downtime).

Is there a configuration option to relax Extended Key Usage validation when using an explicit truststore, or is certificate reissuance the only path forward?

@DavidTurner Just checking in to see if you've had a chance to look at the post above. Thanks again for your help so far!

I've spent some time looking into this.

I believe the latest issue you're seeing is that the transport protocol requires mTLS, but the certificates you're using have Extended Key Usage only for serverAuth. This means the JVM rejects the certificate in the way you're seeing.

To make this ultimately work, your new certificates will need to be issued with Extended Key Usage including clientAuth and serverAuth (or no EKU at all).

I don't think there's any way you can switch to an explicit truststore using your existing certs, but it may be possible to:

  • Update every node’s current transport keystore so its trust material includes both old and new CA chains (a rough sketch of this step follows after this list).
  • Reissue and rotate each node’s transport certificate one node at a time, making sure the new certs are valid for transport mTLS (clientAuth + serverAuth, or no EKU restriction).
  • Once the whole cluster is running on the new cert profile, move to an explicit truststore.path (not strictly necessary, but would simplify this process in the future)
  • Remove the old CA from trust
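
Not part of David's reply, but a rough sketch of that first bullet, assuming your JDK's keytool can add trusted-certificate entries to the existing PKCS#12 keystore (aliases, paths, and passwords are placeholders; verify with keytool -list afterwards):

```
# Add the new CA chain as trusted entries alongside the node's existing key entry.
keytool -importcert -trustcacerts -noprompt -alias new-root-ca \
  -file new-root-ca.pem -keystore certs/XpackCert.pfx \
  -storetype PKCS12 -storepass "$PFX_PASS"
keytool -importcert -trustcacerts -noprompt -alias new-intermediate-ca \
  -file new-intermediate-ca.pem -keystore certs/XpackCert.pfx \
  -storetype PKCS12 -storepass "$PFX_PASS"

# Elasticsearch monitors the files referenced by its TLS settings and should reload
# them, but checking GET _ssl/certificates afterwards is a cheap sanity check.
```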