How to change or replace the SSL Certificate used by Fleet Server

Hello,

I'm looking for the steps needed to change or replace the SSL certificate of an existing Fleet Server, and I could not find anything in the documentation.

All the documentation I found is about how to use an SSL certificate while installing the Fleet Server, which is not my case: the Fleet Server is already running and the certificate needs to be changed or replaced.

There was a similar question about it, but it got no answers from Elastic.

I opened a ticket with support, but the first answer I got pointed me to the documentation on how to install Fleet Server, which does not solve the issue. I'm still waiting on further answers from support.

Being able to easily change or replace the certificate used by Fleet Server is a hard requirement for using it in production. I'm not sure why the documentation for this is not easily available, or whether it exists at all.

I ended up just replacing the contents of the certificate and key files with the new certificate and key, then restarting the agent service. That's the only way I found to do this.
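One way to confirm that the restarted service is actually serving the new certificate is to compare fingerprints. The following is a minimal Python sketch using only the standard library. It assumes the leaf certificate is the first block in the PEM file (the usual ordering for a chain file); the function names are made up for illustration.

```python
import hashlib
import re
import ssl

# Matches one PEM-encoded certificate block.
PEM_CERT = re.compile(
    r"-----BEGIN CERTIFICATE-----.+?-----END CERTIFICATE-----", re.DOTALL
)


def leaf_fingerprint(pem_text: str) -> str:
    """Return the SHA-256 fingerprint of the first certificate in pem_text.

    Assumption: in a chain file the leaf certificate comes first.
    """
    match = PEM_CERT.search(pem_text)
    if match is None:
        raise ValueError("no certificate block found")
    der = ssl.PEM_cert_to_DER_cert(match.group(0))
    return hashlib.sha256(der).hexdigest()


def served_fingerprint(host: str, port: int) -> str:
    """Fetch the certificate presented at host:port and fingerprint it.

    Without a CA file, get_server_certificate does not validate the
    certificate, so this also works against a self-signed one.
    """
    return leaf_fingerprint(ssl.get_server_certificate((host, port)))


def is_serving(cert_path: str, host: str, port: int) -> bool:
    """True if the endpoint presents the same leaf certificate as the file."""
    with open(cert_path) as handle:
        return leaf_fingerprint(handle.read()) == served_fingerprint(host, port)
```

For example, is_serving("/path/to/new-cert.pem", "fleet-server.company.domain", 8220) should return True once the new certificate is in use. The path here is a placeholder, not the location Fleet Server actually reads from.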

Hello @stephenb,

Sorry to tag you, but I could not get any answer from Elastic on this. I have had a ticket open with support for a week, but no answer on the steps to replace the SSL certificate of a running Fleet Server.

All the links I was sent are about using a certificate while installing a Fleet Server, which is not my case. The Fleet Server is running and already has policies and agents attached. Users cannot be expected to reinstall the Fleet Server every time they need to change or replace the SSL certificate.

Replacing an SSL certificate is a common use case, so there needs to be an easy way to replace the certificate used by the Fleet Server, but I could not find anything in the documentation or through support.

Can you get some insight internally?

I pinged internally; let's see if we get an answer.

@leandrojmp
Did you get the update in the ticket?
I had an internal discussion, LMK if you want to chat offline.

Hello @stephenb,

Yeah, I got an answer. It seems that at the moment there is no public documentation on how to change the certificate.

The guidance I was given was to replace the files on the server and restart the Fleet Server, but this didn't work as expected, and there were no helpful logs or documentation.

Since we are in the early stages of implementation, we bit the bullet and re-enrolled the Fleet Server and the agents, now using a certificate signed by a known CA, which we expect won't give us any issues when we need to replace it before its expiration date early next year.

Do you have any extra information that you may share on a DM?

Hello @stephenb, Just to give a better explanation on this.

We had a fleet server, let's say that the host was named fleet-server, then we created a self-signed certificate for this fleet server which was valid for the following:

  • ip address of the server
  • the hostname fleet-server (it is on the hosts file of every VM)
  • the hostname fleet-server.company.domain (it is also on the hosts file of every VM)

In the Fleet UI settings, our default Fleet Server host was configured as https://fleet-server:8220, and we installed the agents using https://fleet-server:8220 as the Fleet Server URL, together with the --insecure parameter, since we used a self-signed certificate.

Recently we decided to deploy Elastic Agents to replace another log collector, and we decided that it would be better to use a company certificate, signed by a known CA, to avoid using the --insecure parameter.

We have a chain certificate and key for *.company.domain and would just need to replace the existing ones on the Fleet Server. We assumed that this would not impact the current agents, because they were installed with --insecure and so do not validate the certificate.

Before replacing the certificate, we changed the Fleet Server host in the Fleet settings to https://fleet-server.company.domain, which was valid with the self-signed certificate and would also be valid with the company certificate.

This worked without any issue and we could check that the agents were communicating to the fleet server through https://fleet-server.company.domain.

Since this domain would be valid with both certificates, we decided to change the local files on the Fleet Server. We stopped the Fleet Server and replaced the certificate and key files. Before, we used a self-generated CA, and in this case we would use a known CA, so there was no CA file to replace, but since our certificate was a chain certificate, we replaced the CA file with it as well.
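Before swapping files like this, it can help to check what each PEM file actually contains, for example that the chain file holds several certificates and the key file holds a private key. A small Python sketch, with a made-up helper name, that lists the block types in a PEM file:

```python
import re

# Captures the label of every "-----BEGIN <LABEL>-----" line.
PEM_BEGIN = re.compile(r"^-----BEGIN ([A-Z0-9 ]+)-----\s*$", re.MULTILINE)


def pem_block_types(pem_text: str) -> list[str]:
    """Return the labels of all PEM blocks in pem_text, in file order."""
    return PEM_BEGIN.findall(pem_text)
```

A chain file with a leaf and one intermediate would give ['CERTIFICATE', 'CERTIFICATE'], while a key file would typically give ['PRIVATE KEY'] or ['RSA PRIVATE KEY']. This only inspects the file layout; it says nothing about which files Fleet Server expects in which setting.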

I'm not sure this would have worked anyway, because it wasn't a simple certificate change: we were moving from a self-generated CA to a known CA.

But the main issue was that after we changed the files and restarted the Fleet Server, it never came back online. In the log file we could see that it was still trying to connect to https://fleet-server:8220. This configuration was not present anywhere else, as we had changed it to https://fleet-server.company.domain:8220 while still using the self-signed certificate.

The error logs were pretty clear in this case, saying that the current certificate was valid for *.company.domain and not for fleet-server.
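This mismatch can be checked offline. The following is a minimal Python sketch, not Fleet Server's actual validation code, of the common rule that a wildcard entry covers exactly one leftmost label. The function names are made up for illustration, and partial wildcards such as f*.example are deliberately not handled.

```python
from urllib.parse import urlparse


def san_matches(hostname: str, san: str) -> bool:
    """Check one hostname against one certificate name entry.

    Simplified rule: "*" is only honored as the whole leftmost label
    and stands for exactly one label.
    """
    host_labels = hostname.lower().rstrip(".").split(".")
    san_labels = san.lower().rstrip(".").split(".")
    if len(host_labels) != len(san_labels):
        return False
    if san_labels[0] == "*":
        return host_labels[1:] == san_labels[1:]
    return host_labels == san_labels


def url_covered(url: str, sans: list[str]) -> bool:
    """True if the host part of url matches any entry in sans."""
    host = urlparse(url).hostname
    return host is not None and any(san_matches(host, san) for san in sans)
```

With sans = ["*.company.domain"], url_covered("https://fleet-server.company.domain:8220", sans) is True, while url_covered("https://fleet-server:8220", sans) is False, which is the situation the logs reported.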

It seems that this setting in the Fleet UI is not applied to the Fleet Server itself. I think this can be replicated with the following steps:

  • create a self-signed certificate valid for host and host.some.domain
  • install fleet server using only the host for the URL and also use only https://host:8220 on the fleet UI for the default host of fleet server.
  • change the default host of fleet server to https://host.some.domain:8220.
  • replace the certificate with one that is only valid for host.some.domain and check whether the Fleet Server still works.

In the end we bit the bullet and decided to re-enroll everything with the correct certificate as we had just a couple of agents running.

But I don't think this would be a valid thing to do if we had hundreds or thousands of Agents.
