Help forming a cluster

Hey, I am losing my mind over this. I have 3 VMs on the same ESXi host in the same subnet. All have firewalld disabled and SELinux disabled. I can telnet on port 9300 from every host to every other host, in all directions. The cluster just will not form.

Here is the config for the hosts:

> path.data: /var/lib/elasticsearch
> path.logs: /var/log/elasticsearch
> cluster.name: "wazuh"
> node.name: "st-wazuh-es01"
> #network.bind_host: 172.16.40.90
> network.host: 172.16.40.90
> network.publish_host: 172.16.40.90
> node.master: true
> node.data: true
> discovery.zen.ping.unicast.hosts: ["172.16.40.90", "172.16.40.91", "172.16.40.92"]
> discovery.zen.minimum_master_nodes: 2
> cluster.initial_master_nodes:
>  - 172.16.40.90
>  - 172.16.40.91
>  - 172.16.40.92
> 
> path.data: /var/lib/elasticsearch
> path.logs: /var/log/elasticsearch
> cluster.name: "wazuh"
> node.name: "st-wazuh-es02"
> network.bind_host: 172.16.40.91
> network.host: 172.16.40.91
> network.publish_host: 172.16.40.91
> node.master: true
> node.data: true
> discovery.zen.ping.unicast.hosts: ["172.16.40.90", "172.16.40.91", "172.16.40.92"]
> discovery.zen.minimum_master_nodes: 2
> cluster.initial_master_nodes:
>  - 172.16.40.90
>  - 172.16.40.91
>  - 172.16.40.92
> 
> 
> path.data: /var/lib/elasticsearch
> path.logs: /var/log/elasticsearch
> cluster.name: "wazuh"
> node.name: "st-wazuh-es03"
> network.bind_host: 172.16.40.92
> network.host: 172.16.40.92
> network.publish_host: 172.16.40.92
> node.master: true
> node.data: true
> discovery.zen.ping.unicast.hosts: ["172.16.40.90", "172.16.40.91", "172.16.40.92"]
> discovery.zen.minimum_master_nodes: 2
> cluster.initial_master_nodes:
>  - 172.16.40.90
>  - 172.16.40.91
>  - 172.16.40.92

When I cat /var/log/elasticsearch/elasticsearch.log, nothing shows for today even though I am bouncing the service.

Here is the output from every host:

> curl -XGET '172.16.40.92:9200/_cluster/health?pretty'
> {
>   "cluster_name" : "wazuh",
>   "status" : "yellow",
>   "timed_out" : false,
>   "number_of_nodes" : 1,
>   "number_of_data_nodes" : 1,
>   "active_primary_shards" : 50,
>   "active_shards" : 50,
>   "relocating_shards" : 0,
>   "initializing_shards" : 0,
>   "unassigned_shards" : 1,
>   "delayed_unassigned_shards" : 0,
>   "number_of_pending_tasks" : 0,
>   "number_of_in_flight_fetch" : 0,
>   "task_max_waiting_in_queue_millis" : 0,
>   "active_shards_percent_as_number" : 98.0392156862745
> }
> 
> curl -XGET '172.16.40.90:9200/_cluster/health?pretty'
> {
>   "cluster_name" : "wazuh",
>   "status" : "yellow",
>   "timed_out" : false,
>   "number_of_nodes" : 1,
>   "number_of_data_nodes" : 1,
>   "active_primary_shards" : 58,
>   "active_shards" : 58,
>   "relocating_shards" : 0,
>   "initializing_shards" : 0,
>   "unassigned_shards" : 1,
>   "delayed_unassigned_shards" : 0,
>   "number_of_pending_tasks" : 0,
>   "number_of_in_flight_fetch" : 0,
>   "task_max_waiting_in_queue_millis" : 0,
>   "active_shards_percent_as_number" : 98.30508474576271
> }
> 
> curl -XGET '172.16.40.91:9200/_cluster/health?pretty'
> {
>   "cluster_name" : "wazuh",
>   "status" : "yellow",
>   "timed_out" : false,
>   "number_of_nodes" : 1,
>   "number_of_data_nodes" : 1,
>   "active_primary_shards" : 50,
>   "active_shards" : 50,
>   "relocating_shards" : 0,
>   "initializing_shards" : 0,
>   "unassigned_shards" : 1,
>   "delayed_unassigned_shards" : 0,
>   "number_of_pending_tasks" : 0,
>   "number_of_in_flight_fetch" : 0,
>   "task_max_waiting_in_queue_millis" : 0,
>   "active_shards_percent_as_number" : 98.0392156862745
> }

And the versions:

> curl -XGET 'http://172.16.40.90:9200'
> {
>   "name" : "st-wazuh-es01",
>   "cluster_name" : "wazuh",
>   "cluster_uuid" : "x9P_mXJ2Slu-aKBsYszGmA",
>   "version" : {
>     "number" : "7.6.2",
>     "build_flavor" : "default",
>     "build_type" : "rpm",
>     "build_hash" : "ef48eb35cf30adf4db14086e8aabd07ef6fb113f",
>     "build_date" : "2020-03-26T06:34:37.794943Z",
>     "build_snapshot" : false,
>     "lucene_version" : "8.4.0",
>     "minimum_wire_compatibility_version" : "6.8.0",
>     "minimum_index_compatibility_version" : "6.0.0-beta1"
>   },
>   "tagline" : "You Know, for Search"
> }
> 
> curl -XGET 'http://172.16.40.91:9200'
> {
>   "name" : "st-wazuh-es02",
>   "cluster_name" : "wazuh",
>   "cluster_uuid" : "x9P_mXJ2Slu-aKBsYszGmA",
>   "version" : {
>     "number" : "7.6.2",
>     "build_flavor" : "default",
>     "build_type" : "rpm",
>     "build_hash" : "ef48eb35cf30adf4db14086e8aabd07ef6fb113f",
>     "build_date" : "2020-03-26T06:34:37.794943Z",
>     "build_snapshot" : false,
>     "lucene_version" : "8.4.0",
>     "minimum_wire_compatibility_version" : "6.8.0",
>     "minimum_index_compatibility_version" : "6.0.0-beta1"
>   },
>   "tagline" : "You Know, for Search"
> }
> 
> curl -XGET 'http://172.16.40.92:9200'
> {
>   "name" : "st-wazuh-es03",
>   "cluster_name" : "wazuh",
>   "cluster_uuid" : "x9P_mXJ2Slu-aKBsYszGmA",
>   "version" : {
>     "number" : "7.6.2",
>     "build_flavor" : "default",
>     "build_type" : "rpm",
>     "build_hash" : "ef48eb35cf30adf4db14086e8aabd07ef6fb113f",
>     "build_date" : "2020-03-26T06:34:37.794943Z",
>     "build_snapshot" : false,
>     "lucene_version" : "8.4.0",
>     "minimum_wire_compatibility_version" : "6.8.0",
>     "minimum_index_compatibility_version" : "6.0.0-beta1"
>   },
>   "tagline" : "You Know, for Search"
> }

The one strange thing is that I cannot run curl against localhost - I have to use the IP address.

It looks like you have formed three distinct one-node clusters, although they all have the same cluster UUID, which indicates that you copied the data directory. Don't do that; each node should start with an empty data directory.

I think the simplest fix is to wipe all their data directories and start again.
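On each node that would be something along these lines (assuming a systemd-managed RPM install with the default paths, and that there's nothing in these indices you need to keep):

systemctl stop elasticsearch
rm -rf /var/lib/elasticsearch/*    # wipe the copied data directory
systemctl start elasticsearch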

Also cluster.initial_master_nodes should be set to the node names, not their IP addresses.

Some other comments on config:

network.bind_host and network.publish_host are redundant here; you should remove them and only set network.host.

discovery.zen.ping.unicast.hosts is deprecated; you should set discovery.seed_hosts instead.

discovery.zen.minimum_master_nodes is deprecated and does nothing; you should remove this line.
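Putting those together, a minimal sketch of what each node's elasticsearch.yml could look like (es01 shown here - adjust node.name and network.host on the other two nodes):

path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
cluster.name: "wazuh"
node.name: "st-wazuh-es01"
network.host: 172.16.40.90
node.master: true
node.data: true
discovery.seed_hosts: ["172.16.40.90", "172.16.40.91", "172.16.40.92"]
cluster.initial_master_nodes: ["st-wazuh-es01", "st-wazuh-es02", "st-wazuh-es03"]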

One more thing :grin:

> When I cat /var/log/elasticsearch/elasticsearch.log, nothing shows for today even though I am bouncing the service.

Yes, the log file is named after the cluster, so I think you want to look at /var/log/elasticsearch/wazuh.log.
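For example, to follow it while you bounce the service:

tail -f /var/log/elasticsearch/wazuh.log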

Thanks David,
Are the seed hosts supposed to use IPs or hostnames?

Yes, that's right:

> Each address can be either an IP address or a hostname which resolves to one or more IP addresses via DNS.
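So either form should work, for example (the hostname variant only works if those names actually resolve in DNS - otherwise stick with the IPs):

discovery.seed_hosts: ["172.16.40.90", "172.16.40.91", "172.16.40.92"]
# or, if DNS resolves these names:
discovery.seed_hosts: ["st-wazuh-es01", "st-wazuh-es02", "st-wazuh-es03"]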

OK, thank you so much!!
I ran rm -rf * in the /var/lib/elasticsearch/node folder, and #2 and #3 are now in a cluster, but #1 is not yet - it's still on its own little island.
I also updated the config:

path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
cluster.name: "wazuh"
node.name: "st-wazuh-es02"
network.host: 0.0.0.0
node.master: true
node.data: true
discovery.seed_hosts:
 - 172.16.40.90
 - 172.16.40.91
 - 172.16.40.92
cluster.initial_master_nodes:
 - st-wazuh-es01
 - st-wazuh-es02
 - st-wazuh-es03

I will say that #1 and the rest have different cluster UUIDs.

Also found this thread from you :slight_smile:

I did initially only have host #1 - then I cloned the VM twice.

Deleting the data directory for #1 fixed the issue - hoping I don't have any more issues down the road due to the cloning.
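For anyone hitting the same thing, a quick way to confirm all three nodes actually joined is to ask any one of them, e.g.:

curl -XGET '172.16.40.90:9200/_cat/nodes?v'
curl -XGET '172.16.40.90:9200/_cluster/health?pretty'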
