Updating node certificate on existing cluster

Hello there,

How do I update the node certificates on an existing cluster?
We are on-premise, running version 7.8.0.

The node certificates are about to expire. The trust chain (CAs) remains the same; only new node certificates are being issued.

Where do I find the official documentation (links) for doing this?

Since I want the services to remain uninterrupted, I assume that I need to perform a rolling restart rather than a full cluster restart. My mental block is how this is going to work. As the certificates are different, how would the updated node know to re-join the cluster? Do I have to start with the master nodes before the data nodes? What are the steps?

I would like to gather as much information as possible before I do this. Can someone point me to the right resources? Thanks!

The fact that the CA stays the same makes the problem a lot simpler. Planning for CA changes can require a few steps, but just updating node certs is pretty easy.

Since I want the services to remain uninterrupted, I assume that I need to perform a rolling restart rather than a full cluster restart.

Not necessarily. Elasticsearch monitors the SSL resources for updates so you can just copy the new certificate and key files (or keystore) into place and the node will pick it up.

As the certificates are different, how would the updated node know to re-join the cluster?

Nothing special should be necessary here.
If you restart the node, then it joins a cluster by:

  1. connecting to the seed hosts
  2. establishing a trusted SSL connection
  3. verifying that those hosts are part of a cluster with the same name (cluster.name in elasticsearch.yml)

Steps 1 & 3 don't depend on the certificates at all, so nothing changes there.
Step 2 will work without any changes, because you haven't switched CAs.
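
If you want to confirm before touching a node that the new certificate really does chain to the CA you already trust, something like the following can help. It is only a sketch: chains_to_ca is a made-up helper name, it assumes PEM files and a certificate issued directly by that CA (an intermediate CA would need openssl's -untrusted option as well), and it relies only on the standard openssl verify command.

```shell
# Hypothetical helper: succeeds only if the node certificate verifies
# against the given CA certificate.
chains_to_ca() {
  # $1 = CA certificate (PEM), $2 = new node certificate (PEM)
  # "openssl verify" prints "<file>: OK" on success, so match on that
  # rather than relying on the exit code alone.
  openssl verify -CAfile "$1" "$2" 2>/dev/null | grep -q ': OK$'
}
```

For example, chains_to_ca ca.crt new.server.crt && echo "same CA" (both file names are placeholders for your own paths).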

If you update the certificates in place, then any existing connections to the cluster will be fine. Certificate verification only takes place when a connection is first established, so there's no need to force the node to re-join.

What are the steps?

First decide whether you want to do an in-place update or a rolling restart. Both can work. The rolling restart is a little bit safer (for reasons I will explain below) but has all the complications of restarting nodes (disabling allocation, clients being disconnected, etc). An in-place update avoids the restart issues, but you need to monitor the nodes for some time after the change to make sure everything worked correctly.

There are 2 reasons why in-place updates have slightly more risk:

  1. If you use PEM files, then your certificate and key are in different files so you need to update them simultaneously or the node may experience a temporary period where it cannot establish new connections.
  2. Updating the certificate & key does not automatically force existing connections to be refreshed, so if you do something wrong the node may look like it's working correctly, but that is because it still has existing connections. It's possible to make a mistake that leaves the node in a state where it cannot establish new connections with other nodes (and therefore cannot recover from a network outage or a node restart).
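
One mistake of the kind described above is pairing a certificate with the wrong key. A rough pre-flight check, assuming PEM files and an unencrypted private key (an encrypted key would make openssl prompt for its passphrase), is to compare the public key embedded in each file. cert_matches_key is a hypothetical helper name; the openssl subcommands are standard.

```shell
# Hypothetical helper: succeeds only if the certificate and the private key
# carry the same public key. Returns 2 if either file cannot be read.
cert_matches_key() {
  # $1 = certificate (PEM), $2 = private key (PEM, unencrypted)
  cmk_cert_pub=$(openssl x509 -in "$1" -noout -pubkey 2>/dev/null) || return 2
  cmk_key_pub=$(openssl pkey -in "$2" -pubout 2>/dev/null) || return 2
  [ -n "$cmk_cert_pub" ] && [ "$cmk_cert_pub" = "$cmk_key_pub" ]
}
```

Running cert_matches_key new.server.crt new.server.key before swapping the files in catches a mismatched pair while the node is still using its old, working certificate.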

Warning: These steps assume that the CA isn't changing. I know that's true for you, but it might not be true for everyone who reads this post, and I don't want them to get into trouble.

Via a rolling restart

  • Follow all the steps of a rolling restart.
  • While each node is stopped (the Perform any needed changes step), switch the node's SSL certificate and key. You can do this by either:
    1. Changing elasticsearch.yml to point to new file locations, or
    2. Changing the contents of the existing SSL resources to contain your new certificate and key.

It really is that simple. If you are using PEM then you need to change both the key and the certificate files (the ones referenced by the ssl.key and ssl.certificate settings). If you are using PKCS#12 or JKS then you just have one file to change, but it is recommended that you replace the existing entry rather than adding a second entry.
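
For the "disabling allocation" part of the rolling restart, the documented approach is to set cluster.routing.allocation.enable through the cluster settings API before stopping a node and to reset it afterwards. A minimal sketch with curl follows; ES_URL, ES_USER and CA_FILE are placeholders I have made up for your cluster address, a user allowed to change cluster settings, and the CA that signed the HTTP certificate, and set_allocation is a hypothetical helper name.

```shell
# Assumed placeholders, override them for your environment.
ES_URL="${ES_URL:-https://localhost:9200}"
CA_FILE="${CA_FILE:-ca.crt}"

# Hypothetical helper around the cluster settings API.
set_allocation() {
  # $1 is a JSON value: '"primaries"' to restrict shard allocation,
  # 'null' to reset the setting to its default.
  # With only a username given, curl prompts for the password.
  curl --silent --show-error --fail --cacert "$CA_FILE" --user "$ES_USER" \
    -X PUT "$ES_URL/_cluster/settings" \
    -H 'Content-Type: application/json' \
    -d "{\"persistent\":{\"cluster.routing.allocation.enable\":$1}}"
}
```

Per node, the sequence would be: set_allocation '"primaries"', stop the node, switch the certificate and key, start the node, wait for it to rejoin, then set_allocation 'null' and wait for the cluster to recover before moving on.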

Via in-place certificate updates

  • For each node, update the contents of the existing SSL resources to contain the new certificate & key. For example, if you are using PEM, and you have:
    xpack.security.transport.ssl.certificate: server.crt
    xpack.security.transport.ssl.key: server.key
    
    then you would prepare two new files, new.server.crt and new.server.key and then switch them out in a single command line:
    chmod u+w server.crt server.key && \
      mv new.server.crt server.crt && \
      mv new.server.key server.key
    
  • Then monitor the logs to make sure that the node reports that it reloaded the SSL context, and watch for any errors.
  • You can also use the _ssl/certificates API on each node to verify that it has loaded new certificates.
  • Once every node has been updated, you may wish to restart a single node to verify that it successfully establishes a connection with every other node in the cluster. This will increase your confidence that the change was completed successfully.
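
As a rough illustration of the _ssl/certificates check, the sketch below queries one node and keeps only the fields that identify which certificate is loaded. The node URL, ES_USER and CA_FILE are placeholders I have assumed, and list_node_certs is a hypothetical helper name; the field names are the ones the API returns in its JSON response.

```shell
# Assumed placeholder: CA that signed the node's HTTP certificate.
CA_FILE="${CA_FILE:-ca.crt}"

# Hypothetical helper: show path, subject, serial number and expiry of
# every certificate the given node currently has loaded.
list_node_certs() {
  # $1 = base URL of one node, e.g. https://node1.example.com:9200
  # "?pretty" puts each JSON field on its own line so grep can filter them.
  curl --silent --show-error --fail --cacert "$CA_FILE" --user "$ES_USER" \
    "$1/_ssl/certificates?pretty" |
    grep -E '"(path|subject_dn|serial_number|expiry)"'
}
```

Comparing the serial_number and expiry values against the new certificate (for instance from openssl x509 -noout -serial -enddate) on each node shows whether the reload actually happened there.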

Thanks Tim,
I appreciate the help; you have given me the information I need to proceed.
Thanks again!
