Docs for manually enrolling cluster nodes without enrollment tokens?

I'm trying to set up an Elasticsearch cluster using auto-renewing certs with certmonger from our enterprise CA. I'm running into a problem that has been discussed in many threads, where the answer always seems to be some variation of, "Just set up the cluster manually, without using enrollment tokens."

Is there a walkthrough doc anywhere for setting up a cluster manually, without using enrollment tokens?

Error I'm getting when trying to generate an enrollment token:

# /usr/share/elasticsearch/bin/elasticsearch-create-enrollment-token -s node
Unable to create enrollment token for scope [node]
ERROR: Unable to create an enrollment token. Elasticsearch node HTTP layer SSL configuration Keystore doesn't contain any PrivateKey entries where the associated certificate is a CA certificate

Forum threads and a bug entry that I've read which all seem to say "if you use an enterprise CA, you'll have to set up your cluster manually" (though most of them are about Kibana, which I haven't gotten to yet):

Creating an enrollment token in Elastic fails due to PrivateKey error - Elastic Stack / Elasticsearch - Discuss the Elastic Stack

Import CA Cert as PrivateKeyEntry to HTTP Keystore - Solve Unable to create enrollment token Error - Elastic Stack / Elasticsearch - Discuss the Elastic Stack

Generating enrolment token for Kibana should not require the CA key · Issue #89017 · elastic/elasticsearch · GitHub

Create enrollment for kibana impossible with certificate Lets Encript - Elastic Stack / Elasticsearch - Discuss the Elastic Stack

Configure TLS production environment with own CA - Elastic Stack / Elasticsearch - Discuss the Elastic Stack

I have also read the doc about generating CSRs, but it doesn't seem to have any mechanism for auto-renewing the certs, so it doesn't solve the problem I'm trying to solve.

In theory all I need is a walkthrough of how to set up a cluster without using enrollment tokens, so it would be great if someone could point me to one. Thanks!

Hi @andrew.klaassen Welcome to the community.

I would set these 2 settings in the elasticsearch.yml false false

Then follow the 7.17 documentation for adding additional nodes as a good starting point.

At the highest level
You need to bind the transport to the network.

Use the same cluster name.

And set the discovery settings.

Ohh and get all your certs right :slight_smile:

Thanks for the guidance, that was helpful.

For anyone looking at this for future reference, this is what I ended up doing (on an RH8 machine which was already joined to our Active Directory domain):

Installed elasticsearch packages with yum, but did not start it. (Starting it before setting up elasticsearch.yml resulted in a single-node cluster that I couldn't convince to become a multi-node cluster, except by deleting the contents of /var/lib/elasticsearch.)

Copied our Windows enterprise CA public certs to /usr/share/pki/ca-trust-source/anchors and ran update-ca-trust so that getcert would trust our cert authority.

Installed certmonger, cepces-certmonger, and cepces. Set "server" in /etc/cepces/cepces.conf file to our subordinate cert authority DNS name.

On our Windows sub CA, copied the existing Windows "Computer" cert template to an "ELK" cert template. Changed to these settings:

  • Compability -> Compatibility Settings-> Certification Authority: Windows Server 2016, Certificate recipient: Windows 10 / Windows Server 2016
  • Subject Name -> Supply in the request

Requested a cert on the Linux ELK server:

# getcert request -c cepces -T ELK -I ELKCertificate -k /etc/elasticsearch/certs/private.key -f /etc/elasticsearch/certs/public.crt --key-owner=elasticsearch --cert-owner=elasticsearch --ip-address=<ip address> --dns=<short hostname> --dns=<fully qualified hostname>

Saw that there was no cert, looked at /var/log/messages, ran the suggested SELinux commands, ran "getcert resubmit -I ELKCertificate", looked at /var/log/messages for SELinux errors, ran suggested SELinux commands, lather-rinse-repeat until SELinux errors stopped and a cert appeared.

Copied our Windows enterprise sub CA public cert to /etc/elasticsearch/certs.

Opened the firewall for ports 5601, 9200, and 9300 with firewall-cmd.

Changed elasticsearch.yml to look something like this: ak-test-elk ak-test-elk-01 /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
discovery.seed_hosts: ["ak-test-elk-01", "ak-test-elk-02", "ak-test-elk-03"]
cluster.initial_master_nodes: ["ak-test-elk-01", "ak-test-elk-02", "ak-test-elk-03"] true false
  enabled: true
  key: certs/private.key
  certificate: certs/public.crt
  enabled: true
  verification_mode: certificate
  key: certs/private.key
  certificate: certs/public.crt
  certificate_authorities: certs/subCA.cer false

Started the elasticsearch service on each node, saw that it was finally okay and not spitting out endless errors, then removed "cluster.initial_master_nodes" from elasticsearch.yml.

And now I seem to have a cluster that's communicating using our internal certs.

1 Like