Problems Joining a Cluster

Wow,
so not using cluster.initial_master_nodes and using discovery.seed_hosts worked!
I did remove all the files in my ES data folder prior to starting just in case there was any residue from previous attempts. ES recreated all the files and this is the output of my cluster status now:

{
  "cluster_name" : "clustername",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 3,
  "number_of_data_nodes" : 3,
  "active_primary_shards" : 1,
  "active_shards" : 2,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 100.0
}
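For anyone who wants to script this check, here is a minimal sketch in Python. It only inspects fields that appear in the health output above. The function name `health_problems` and the `expected_nodes` parameter are made up for illustration, and the JSON is pasted in as a string rather than fetched from the cluster.

```python
import json

# Cluster health output copied from the post above.
HEALTH = """
{
  "cluster_name" : "clustername",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 3,
  "number_of_data_nodes" : 3,
  "active_primary_shards" : 1,
  "active_shards" : 2,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 100.0
}
"""

def health_problems(health, expected_nodes):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if health["timed_out"]:
        problems.append("health request timed out")
    if health["status"] != "green":
        problems.append(f"status is {health['status']}, not green")
    if health["number_of_nodes"] != expected_nodes:
        problems.append(
            f"expected {expected_nodes} nodes, found {health['number_of_nodes']}"
        )
    if health["unassigned_shards"] > 0:
        problems.append(f"{health['unassigned_shards']} unassigned shards")
    return problems

health = json.loads(HEALTH)
print(health_problems(health, expected_nodes=3))
```

Calling it with `expected_nodes=6` against the same output would report the node-count mismatch, which may be handy while the remaining nodes are being added.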

I restarted host1 and host4 took over as master! woohoo!

ip        heap.percent ram.percent cpu load_1m load_5m load_15m node.role   master name
10.2.1.46            0          27   0    0.00    0.02     0.00 cdfhilmrstw -      host1
10.2.1.49            4          28   0    0.00    0.00     0.00 cdfhilmrstw *      host4
10.2.1.47            2          28   0    0.00    0.01     0.00 cdfhilmrstw -      host2

So now, I'm onto adding 3 more nodes. I saw someone used the yml setting:
node.roles: [master, data]
Is that valid to use to note which nodes can take over master?
I also tried to set a cluster wide setting for replicas between the nodes and it failed. I may have to open a new topic for that.

Hurrah!

I'd suggest you leave node.roles as the default (i.e. don't mention it in elasticsearch.yml) which means that all nodes have all roles. You only need set it if you have a pressing need for different nodes to have different roles, and if you've only got 6 nodes then they probably all want to do everything.

David,
Great suggestion. I split my 6 nodes between 2 cabinets, so my logic was to have multiple masters in the other cabinet too.
A couple more questions for you.
I noticed only 1 node says master. How do I get host2 and host4 to also be master nodes at the same time? Or can only 1 be master at a time?
When I add the other 3 nodes, should discovery.seed_hosts list only the original 3 that I wanted for master, or all hosts like this:

discovery.seed_hosts: ["host1", "host2", "host4", "host3", "host5", "host6"]

Thanks,
Bryan

The m in the node.role alphabet soup (cdfhilmrstw) means the nodes are all master-eligible. Only one of them is elected as master at once, but if it fails then another master-eligible one will take over.

discovery.seed_hosts should contain the addresses of all the master-eligible nodes in the cluster, so yeah mention all 6. It'll mostly work with fewer, at least as long as one or more of those three original nodes is still available.
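As a rough illustration of the two points above, the sketch below (Python) reads the node table posted earlier in this thread, picks out the master-eligible nodes (role string contains m) and the currently elected master (the * in the master column), and prints a matching discovery.seed_hosts line. The helper names are hypothetical, and the whitespace-splitting parser assumes node names contain no spaces.

```python
# Node table copied from earlier in this thread.
CAT_NODES = """\
ip        heap.percent ram.percent cpu load_1m load_5m load_15m node.role   master name
10.2.1.46            0          27   0    0.00    0.02     0.00 cdfhilmrstw -      host1
10.2.1.49            4          28   0    0.00    0.00     0.00 cdfhilmrstw *      host4
10.2.1.47            2          28   0    0.00    0.01     0.00 cdfhilmrstw -      host2
"""

def parse_cat_nodes(text):
    """Return one dict per node, keyed by the column names in the header row."""
    lines = [line for line in text.splitlines() if line.strip()]
    header = lines[0].split()
    return [dict(zip(header, line.split())) for line in lines[1:]]

def master_eligible(nodes):
    """Names of nodes whose role string contains 'm' (master-eligible)."""
    return [n["name"] for n in nodes if "m" in n["node.role"]]

def elected_master(nodes):
    """Name of the node marked '*' in the master column, or None if there is none."""
    for n in nodes:
        if n["master"] == "*":
            return n["name"]
    return None

def seed_hosts_line(hosts):
    """Render a discovery.seed_hosts line for elasticsearch.yml."""
    quoted = ", ".join(f'"{h}"' for h in hosts)
    return f"discovery.seed_hosts: [{quoted}]"

nodes = parse_cat_nodes(CAT_NODES)
print(master_eligible(nodes))   # all three nodes carry the m role
print(elected_master(nodes))    # only one of them is the elected master
print(seed_hosts_line(master_eligible(nodes)))
```

Once the other three nodes have joined, running the same thing over the six-row table would list all six hosts in the generated line.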

got it.
I added the fourth node, following these steps and copying the cert file over from the CA, and the node joined the cluster on the first try! I'm stoked.

RPM packages are expected to do all their installation configuration at install time, so the Elasticsearch installer configures security during package installation.

The RPM install guide has a section on how to reconfigure the node to join an existing cluster.

Hi TimV,
Thanks for checking this post out and responding. I followed the steps to a T multiple times on multiples and it just plain doesn't work. The enrollment-token method makes a bunch of assumptions that the document never covers. For example, none of the 4 steps says to modify elasticsearch.yml to include the node name and cluster name, which are required to join a cluster.
As I have documented above in this post, every time I try to use the enrollment-token method in the link you posted I get the error that "...it appears that the node is not starting up for the first time...." These are brand-new, freshly installed systems; there is no legacy install of ES anywhere. Given your position in engineering, I would humbly request that the documentation be updated with different steps and the other prerequisites that need to be completed prior to running ES for the first time with the enrollment token (such as file permissions and contents of the yml file).
If you want to do a webex or screen share, I would be happy to show you it live.
Thanks,
Bryan

I meant multiple times on multiple servers.

As far as I can tell, you didn't follow the steps in that document.

You wrote

$ bin/elasticsearch --enrollment-token

But that doc says to run

/usr/share/elasticsearch/bin/elasticsearch-reconfigure-node --enrollment-token <enrollment-token>

Those are different commands.

Hi TimV,
So your main guide, "Install Elasticsearch from archive on Linux or MacOS" (Elasticsearch Guide [8.12]), explicitly says halfway down, under "Enroll nodes in an existing cluster", to do this:
From the installation directory of your new node, start Elasticsearch and pass the enrollment token with the --enrollment-token parameter.

bin/elasticsearch --enrollment-token <enrollment-token>

Then there is this page:

that says to use:

bin/elasticsearch --enrollment-token <enrollment-token>

So that's two current documents that say to use the elasticsearch binary.

However, I had seen the documentation with that option as well, before I ever posted to this forum. So I started from scratch on a new server and ran the elasticsearch-reconfigure-node binary, and it doesn't work. I spent a week trying to get that binary to work on a new node that had never been started. I would be happy to show you on a brand new system live.
Thanks.

But you aren't installing from an archive. You downloaded the RPM, so you need to follow the instructions on the RPM page. You can't jump into a page about installing the .tar.gz package, skip the first few steps and expect the subsequent steps to work.

I fully accept that it can be hard to determine which docs are relevant to your scenario, and many of the doc pages assume that you're starting from a particular point that might not be universal. The "Start the Elastic Stack with security enabled automatically" page provides the wrong information for people who install from RPM or deb packages.

My goal here was to provide the background information for why the RPM install populates the certificates before first use, and how to overcome the error you were reporting:

ERROR: Skipping security auto configuration because it appears that the node is not starting up for the first time. The node might already be part of a cluster and this auto setup utility is designed to configure Security for new clusters only.

reconfigure-node is the step you need to apply if you use the RPM. If you ran into a problem with that then we can try to help, but we can't guess what steps you tried or what errors you encountered.
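To make the distinction concrete, here is a small Python sketch that maps the install type to the enrollment command quoted in this thread. The `"rpm"` and `"archive"` keys and the `enrollment_command` helper are labels invented for this example; the sketch only builds the command line and does not run anything.

```python
# Enrollment command per install type, as quoted in this thread.
# The dictionary keys are labels chosen for this sketch, not Elasticsearch terms.
ENROLL_COMMANDS = {
    # RPM installs: security is configured at package install time, so the
    # node has to be reconfigured to join an existing cluster.
    "rpm": [
        "/usr/share/elasticsearch/bin/elasticsearch-reconfigure-node",
        "--enrollment-token",
    ],
    # Archive (.tar.gz) installs: the token is passed on first start,
    # from the installation directory.
    "archive": ["bin/elasticsearch", "--enrollment-token"],
}

def enrollment_command(install_type, token):
    """Return the argv list for enrolling a new node, given how it was installed."""
    try:
        base = ENROLL_COMMANDS[install_type]
    except KeyError:
        raise ValueError(f"unknown install type: {install_type!r}") from None
    return base + [token]

print(" ".join(enrollment_command("rpm", "<enrollment-token>")))
print(" ".join(enrollment_command("archive", "<enrollment-token>")))
```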

TIL, thanks Tim

... and, worse, it doesn't mention that it only applies to archive installations. It just says this in its prerequisites, linking to a page that offers an RPM for download:

Download and unpack the elasticsearch package distribution for your environment.

Thanks David and TimV,
I appreciate you both trying to sort through this. This topic chain has definitely made the difference getting our production ES cluster online. As David mentions, it would help if the other documentation references to joining a cluster indicated both options, for RPM and non-RPM joins.
It would also be helpful when running the reconfigure-node binary to indicate that the elasticsearch.yml still needs to be modified with cluster name, network host, etc. The documentation gives the impression that it doesn't need to be modified. Maybe adding a note after the line "This node will be reconfigured to join an existing cluster, using the enrollment token that you provided. This operation will overwrite the existing configuration...."
Thanks team!
Bryan
