Unable to install multi-node cluster - Skipping security auto configuration because this node is configured to bootstrap or to join a multi-node cluster, which is not supported., with exit code 80

I am trying to perform the initial installation of a two node Elasticsearch cluster. This is on Ubuntu 22.04 with Elasticsearch version 8.13 installed via RPM.

This is the contents of the file /etc/elasticsearch/elasticsearch.yml on my master node:

cluster.name: lab
cluster.initial_master_nodes: ["eflow01.mydomain.com"]
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
http.host: 0.0.0.0
http.port: 9200
xpack.security.enabled: true
xpack.security.enrollment.enabled: true
xpack.security.http.ssl:
  enabled: true
  keystore.path: certs/http.p12
xpack.security.transport.ssl:
  enabled: true
  verification_mode: certificate
  keystore.path: certs/transport.p12
  truststore.path: certs/transport.p12

The service starts fine on this first node:

$ sudo systemctl status elasticsearch
● elasticsearch.service - Elasticsearch
     Loaded: loaded (/lib/systemd/system/elasticsearch.service; disabled; vendor preset: enabled)
     Active: active (running) since Thu 2024-04-25 15:52:24 UTC; 3min 24s ago
       Docs: https://www.elastic.co
   Main PID: 614816 (java)
      Tasks: 110 (limit: 19137)
     Memory: 8.4G
        CPU: 1min 11.150s
     CGroup: /system.slice/elasticsearch.service
             ├─614816 /usr/share/elasticsearch/jdk/bin/java -Xms4m -Xmx64m -XX:+UseSerialGC -Dcli.name=server -Dcli.script=/usr/share/elasticsearch/bin/elasticsearch -Dcli.libs=lib/tools/server-cli -Des.path.home=/usr/share/elasticsearch -Des.path.conf=/etc/elasticsearch -Des.distribution>
             ├─614876 /usr/share/elasticsearch/jdk/bin/java -Des.networkaddress.cache.ttl=60 -Des.networkaddress.cache.negative.ttl=10 -Djava.security.manager=allow -XX:+AlwaysPreTouch -Xss1m -Djava.awt.headless=true -Dfile.encoding=UTF-8 -Djna.nosys=true -XX:-OmitStackTraceInFastThrow -D>
             └─614904 /usr/share/elasticsearch/modules/x-pack-ml/platform/linux-x86_64/bin/controller

Apr 25 15:52:08 eflow01.mydomain.com systemd[1]: Starting Elasticsearch...
Apr 25 15:52:11 eflow01.mydomain.com systemd-entrypoint[614816]: Apr 25, 2024 3:52:11 PM sun.util.locale.provider.LocaleProviderAdapter <clinit>
Apr 25 15:52:11 eflow01.mydomain.com systemd-entrypoint[614816]: WARNING: COMPAT locale provider will be removed in a future release
Apr 25 15:52:24 eflow01.mydomain.com systemd[1]: Started Elasticsearch.

On the second node, I tried using the default configuration file provided by the RPM but that didn't work so this is the version I have present there:

cluster.name: lab
cluster.initial_master_nodes: ["eflow01.mydomain.com"]
discovery.seed_hosts: eflow01.mydomain.com
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
http.host: 0.0.0.0
http.port: 9200

On the second node I attempt to start the service for the first time using this command, providing the enrollment token from the first node:

$ sudo /usr/share/elasticsearch/bin/elasticsearch --enrollment-token [ REDACTED ]

ERROR: Skipping security auto configuration because this node is configured to bootstrap or to join a multi-node cluster, which is not supported., with exit code 80

Thanks for any help you can provide!

I guess my main question is: what should the config file look like on member (non-master) nodes? I tried the default provided by the Ubuntu RPM and it didn't work, and modifying it in various ways hasn't worked either.

If I use the factory-supplied config file and then run the command bin/elasticsearch-reconfigure-node, I get an error about it not being able to locate the master node.

Could someone please provide a sample config file to use on member nodes?

I think you need to delete cluster.initial_master_nodes. From these docs:

IMPORTANT: After the cluster forms successfully for the first time, remove the cluster.initial_master_nodes setting from each node’s configuration. Do not use this setting when restarting a cluster or adding a new node to an existing cluster.

Do I need to customize the config file at all on the secondary node, or does the command elasticsearch-reconfigure-node do that for me? Does the enrollment token contain the cluster name, the master node name, and so on?

Can you please provide a sample initial configuration file for a member node?

The documentation is very confusing to me, and feels a bit disjointed. My use case is a fresh install of a cluster with two nodes. My assumption is that the first node will be the acting initial master - the first node of a given cluster, and the second one will join as a member.

I have the first node (acting master) set up and working fine, as far as I can tell. However, I cannot get the second node (member node) to join the cluster created by the first.

I am following the installation documentation here:

Where I generate an enrollment token on my acting master node, and then supply that to the command elasticsearch-reconfigure-node on my member node. However, that node is not actually joining the cluster, per the log file.

I also see this documentation for adding and removing nodes in your cluster here:

Which has completely different steps.

I am still not able to get this working. If I use the auto-bootstrapping method, do I need the enrollment token at all? It appears if I modify the config file at all on the member node, to enable boot-strapping, then I cannot use the command elasticsearch-reconfigure-node.

I am very confused, can someone please provide a working sample of a master-eligible node configuration file and a member node configuration file and the high-level steps?

You should not be trying to use an enrollment token if you have already configured the node manually. You should just try to start it up.

The enrollment process simply configures the node for you.

Since you configured cluster.initial_master_nodes in the elasticsearch.yml file, you cannot use the enrollment token. You just need to start the node normally, either directly or with systemctl.
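A quick way to tell which of the two paths applies to a node is to look for manual discovery or bootstrap settings in its config before reaching for the token. The sketch below is only an illustration: `find_enrollment_blockers` is a made-up helper name, and it checks just the two settings that appear in this thread, on the assumption that these are what the auto-configuration step objects to. Other settings may also cause it to be skipped.

```shell
# Hypothetical helper (not part of Elasticsearch): print any uncommented
# cluster.initial_master_nodes / discovery.seed_hosts lines in a config file.
# Exit status 0 means at least one such line was found.
find_enrollment_blockers() {
  grep -nE '^[[:space:]]*(cluster\.initial_master_nodes|discovery\.seed_hosts)[[:space:]]*:' "$1"
}

# Example (config path as used in this thread):
# find_enrollment_blockers /etc/elasticsearch/elasticsearch.yml \
#   && echo "manual cluster settings found: start with systemctl, skip the token"
```

If the function prints nothing, the file has neither setting uncommented, which is the state the enrollment-token flow appears to expect.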

Also, keep in mind that you should make this second node a data/ingest-only node. There is no resilience in a two-node cluster, so you can have just one node configured as master.

Try changing the elasticsearch.yml on this second node to the following:

cluster.name: lab
cluster.initial_master_nodes: ["eflow01.mydomain.com"]
discovery.seed_hosts: eflow01.mydomain.com
node.roles: [ data, ingest ] 
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
http.host: 0.0.0.0
http.port: 9200
xpack.security.enabled: true
xpack.security.http.ssl:
  enabled: true
  keystore.path: certs/http.p12
xpack.security.transport.ssl:
  enabled: true
  verification_mode: certificate
  keystore.path: certs/transport.p12
  truststore.path: certs/transport.p12

Make sure all the certificates are present on the second node as well.

And start it with systemctl start elasticsearch.

Thanks for the replies all. My immediate question is, how can I validate the 2nd node has joined the cluster properly?

I would like to use the enrollment token method. Here are the steps I am performing:

  1. Install Elasticsearch on the 2nd node, NOT modifying the elasticsearch.yml configuration file and NOT starting the software.
  2. Using the enrollment token from my master node, I run the command elasticsearch-reconfigure-node --enrollment-token. This completes successfully.
  3. Start the software on the 2nd node using systemctl. This completes successfully.
  4. To test that the 2nd node has joined the cluster correctly, I run the command curl --cacert /etc/elasticsearch/certs/http_ca.crt -u elastic:PASSWORD https://localhost:9200. I am using the password the 2nd node provided me when I installed the software.

I am not sure why, but I cannot post the full error message I am getting. Basically, when I run that curl command it says "unable to authenticate user [elastic] for REST request".

I then try to re-generate the password on the 2nd node and get this error:

$ sudo /usr/share/elasticsearch/bin/elasticsearch-reset-password -u elastic -s
ERROR: Failed to determine the health of the cluster. Unexpected http status [503], with exit code 65

I can re-generate the password on the master node and run that curl command successfully there. Using that new password on the 2nd node, against its own localhost endpoint, produces the same error.

Ok I found this command, which I can run on the master node:

$ sudo curl -X GET --cacert /etc/elasticsearch/certs/http_ca.crt -u elastic:PASSWORD "https://localhost:9200/_cluster/health?pretty"
{
  "cluster_name" : "lab",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 1,
  "number_of_data_nodes" : 1,
  "active_primary_shards" : 1,
  "active_shards" : 1,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 100.0
}

So it appears that my 2nd node is not actually joining the cluster, even though I provided the proper enrollment token successfully, and restarted the software.
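For repeated checks, the health output can be reduced to the one number that matters here. This is a rough sketch, not an official tool: `health_node_count` is a hypothetical helper that pulls `number_of_nodes` out of the `_cluster/health` JSON with sed, so that it needs nothing beyond a standard shell.

```shell
# Hypothetical helper: read _cluster/health JSON on stdin and print the
# value of "number_of_nodes" (the leading quote keeps it from matching
# "number_of_data_nodes").
health_node_count() {
  sed -n 's/.*"number_of_nodes"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Example, run on the master node (PASSWORD is a placeholder):
# curl -s --cacert /etc/elasticsearch/certs/http_ca.crt -u elastic:PASSWORD \
#   "https://localhost:9200/_cluster/health" | health_node_count
```

Once the second node has joined, this should report 2 rather than 1. The `_cat/nodes?v` endpoint is another option when you want to see which nodes are present, not just how many.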

I see these entries in my log file /var/log/elasticsearch/elasticsearch.log:

[ REDACTED A BUNCH OF STUFF BECAUSE SITE WONT LET ME POST FOR SOME REASON ] failed: remote cluster name [lab] does not match local cluster name [elasticsearch]

Which takes me all the way back to my prior question: what should the config file look like on the 2nd node? Apparently the cluster name is not something the enrollment token configures?
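The log line is consistent with the 2nd node still running under the default cluster name, "elasticsearch", which is what a node uses when cluster.name is not set. A small sketch for comparing the two nodes follows. `effective_cluster_name` is a hypothetical helper, and it only understands the flat `cluster.name: value` form used in this thread (no nested YAML, no trailing comments).

```shell
# Hypothetical helper: print the cluster.name set in a config file, or the
# default ("elasticsearch") when the setting is absent or commented out.
effective_cluster_name() {
  name=$(sed -n 's/^[[:space:]]*cluster\.name[[:space:]]*:[[:space:]]*//p' "$1" | head -n 1)
  # Strip trailing whitespace and optional surrounding double quotes.
  name=$(printf '%s' "$name" | sed 's/[[:space:]]*$//; s/^"\(.*\)"$/\1/')
  printf '%s\n' "${name:-elasticsearch}"
}

# Example: run on each node and compare the results.
# effective_cluster_name /etc/elasticsearch/elasticsearch.yml
```

If the master prints `lab` and the 2nd node prints `elasticsearch`, the mismatch in the log is explained and the 2nd node's config needs the same cluster.name as the master.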

Also just in case, here is the config file I am using on my master node:

node.name: eflow01.mydomain.com
node.roles: [ master, data ]
cluster.name: lab
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
http.host: 0.0.0.0
http.port: 9200
transport.host: 0.0.0.0
transport.port: 9300
xpack.security.enabled: true
xpack.security.enrollment.enabled: true
xpack.security.http.ssl:
  enabled: true
  keystore.path: certs/http.p12
xpack.security.transport.ssl:
  enabled: true
  verification_mode: certificate
  keystore.path: certs/transport.p12
  truststore.path: certs/transport.p12
cluster.initial_master_nodes: eflow01.mydomain.com  ## remove this line after initial start

It looks like you have not set discovery.seed_hosts, which is how Elasticsearch identifies which nodes to initially connect to.
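Seed hosts are contacted on the transport port (9300 in the config above), so it can also help to confirm that the port is reachable from the node that is trying to join. The following is a minimal sketch under a few assumptions: `transport_reachable` is a made-up name, and it relies on bash's /dev/tcp redirection and the coreutils `timeout` command being available.

```shell
# Hypothetical helper: succeed if a TCP connection to host:port can be
# opened within 3 seconds. This only tests connectivity, not TLS or
# whether the remote end is actually Elasticsearch.
transport_reachable() {
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}

# Example, run on the 2nd node (host and port as used in this thread):
# transport_reachable eflow01.mydomain.com 9300 && echo reachable || echo "not reachable"
```

A "not reachable" result points at a firewall, DNS, or bind-address problem rather than at the Elasticsearch security configuration.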