Certificate added via GUI only distributed to 2 of 3 nodes

Hi there,

We have created a 3-node ECE cluster. Almost everything is working fine, except the certificates.
We created the PEM as described in the documentation and uploaded it via the GUI under Platform -> Settings.

I noticed that one node still had the pre-generated Elastic certificate. When I tried logging in via the other nodes, everything worked. Only the first node still has the old cert.

When I look at the certificate chain in the browser, it shows me the right information, but my browser does not like it.

Maybe you can help.

Greetings
Malte

I have also tested it via the command line:
openssl s_client -showcerts -connect localhost:12443 < /dev/zero

this shows me:
CONNECTED(00000003)
depth=2 CN = elastic ce master
verify error:num=19:self signed certificate in certificate chain

Certificate chain
0 s:/CN=elastic ce cloudui cloudui
...

On the other 2 nodes the right certificate is shown.
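
To compare what each node actually serves, a small sketch along these lines may help. The host names `node1`, `node2`, `node3` are placeholders, and the helper names `leaf_fingerprint` and `odd_ones_out` are made up for illustration; only standard `openssl` and `awk` options are used.

```shell
#!/usr/bin/env bash

# Print the SHA-256 fingerprint of the leaf certificate served at host:port.
# Prints nothing if the connection or parsing fails.
leaf_fingerprint() {
  local host="$1" port="${2:-12443}"
  openssl s_client -connect "${host}:${port}" </dev/null 2>/dev/null \
    | openssl x509 -noout -fingerprint -sha256 2>/dev/null \
    | sed 's/^.*=//'
}

# Read lines of "<host> <fingerprint>" on stdin and print every host whose
# fingerprint differs from the most common one.
odd_ones_out() {
  awk '{ host[NR] = $1; fp[NR] = $2; count[$2]++ }
       END {
         max = 0
         for (f in count) if (count[f] > max) { max = count[f]; best = f }
         for (i = 1; i <= NR; i++) if (fp[i] != best) print host[i]
       }'
}

# Usage (placeholder host names):
# for h in node1 node2 node3; do
#   printf '%s %s\n' "$h" "$(leaf_fingerprint "$h" 12443)"
# done | odd_ones_out
```

Passing a different port as the second argument to `leaf_fingerprint` checks another listener the same way.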

That is really strange; I have never seen this happen before.

Did you also upload a proxy cert, and is that working on all hosts (e.g. port 9243)?

Just to double check - are all 3 hosts definitely part of the same platform? Or did it somehow get created as 2 separate ECE platforms, one consisting of node1, and the other consisting of nodes 2 and 3?

I guess I would start with docker rm -f frc-cloud-uis-cloud-ui to restart the service that manages that cert (maybe do a docker logs on that container first to see if there are any interesting errors). If that doesn't work, then also try restarting frc-client-forwarders-client-forwarder, and then restart cloud-ui again.
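
As a rough sketch of that sequence, assuming it is run on the affected host: the container names are the ones mentioned above, while the `run` and `recreate_container` helpers and the `DRY_RUN` switch are made up for illustration. By default it only prints the commands it would run.

```shell
#!/usr/bin/env bash

# Container names as mentioned in this thread.
CLOUD_UI="frc-cloud-uis-cloud-ui"
CLIENT_FORWARDER="frc-client-forwarders-client-forwarder"

# Print the command when DRY_RUN=1 (the default here), otherwise execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# Show the recent logs of a container, then force-remove it.
recreate_container() {
  local name="$1"
  run docker logs --tail 200 "$name"
  run docker rm -f "$name"
}

# Usage: start with cloud-ui. Only if the old cert is still served, continue
# with the client forwarder and then cloud-ui once more.
# DRY_RUN=0 recreate_container "$CLOUD_UI"
# DRY_RUN=0 recreate_container "$CLIENT_FORWARDER"
# DRY_RUN=0 recreate_container "$CLOUD_UI"
```

Looking at the logs before the removal matters because they go away together with the container.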

Alex

After the docker rm I got this in the docker logs:
[!] error queue: ssl_rsa.c:701: error:140DC002:SSL routines:SSL_CTX_use_certificate_chain_file:system lib
[!] error queue: bss_file.c:404: error:20074002:BIO routines:FILE_CTRL:system lib
[!] SSL_CTX_use_certificate_chain_file: bss_file.c:402: error:02001002:system library:fopen:No such file or directory
[!] Service [server-0]: Failed to initialize TLS context

After removing the Docker container and rebooting, everything is working.

But if you have an idea what could have been the cause of it, let me know.

Thanks

That error happens if the certs specified in the config file of the process that terminates the SSL connection don't exist (currently stunnel, moving to haproxy at some later point in the future).
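
To check for that condition by hand, something like the sketch below could be used. It assumes you have located the stunnel config file yourself; its path inside ECE is not stated in this thread, so it is passed in as an argument. `cert`, `key` and `CAfile` are standard stunnel options, and the function name `missing_stunnel_files` is made up.

```shell
#!/usr/bin/env bash

# Print every file referenced by a cert, key or CAfile option in the given
# stunnel config that does not exist on disk. Comment lines (starting with
# ';' or '#') do not match the pattern and are skipped.
missing_stunnel_files() {
  local conf="$1"
  sed -n -E 's/^[[:space:]]*(cert|key|CAfile)[[:space:]]*=[[:space:]]*(.*[^[:space:]])[[:space:]]*$/\2/p' "$conf" \
    | while IFS= read -r path; do
        [ -e "$path" ] || printf '%s\n' "$path"
      done
}

# Usage (the config path is a placeholder):
# missing_stunnel_files /path/to/stunnel.conf
```

Empty output means every referenced file was present at the time of the check, which by itself does not rule out a file having been missing when the process started.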

I guess there's some weird race condition where stunnel is starting up before we've written the certs to disk. I cannot think of a reason why this wouldn't have happened the previous 60 million times a cert has been uploaded across the ECE userbase though :frowning:

I'll open an issue internally to track this

Alex
