Unable to set up Elasticsearch in cluster mode

Hello there,

I am trying to deploy Elasticsearch 7.10.2 in HA with three nodes, all of which have the master, ingest, and data roles.

To do so, I am using the following configuration on each node:
NODE 1:

cluster.name: demo
node.name: node1
node.data: true
node.master: true
node.ingest: true
node.max_local_storage_nodes: 3
transport.tcp.port: 9300
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
bootstrap.memory_lock: false
network.host: 172.21.0.24
http.port: 9200
discovery.seed_hosts: ["172.21.0.24", "172.21.0.25","172.21.0.23"]
cluster.initial_master_nodes: ["172.21.0.24","172.21.0.25","172.21.0.23"]
discovery.zen.minimum_master_nodes: 2

NODE 2:

cluster.name: demo
node.name: node2
node.data: true
node.master: true
node.ingest: true
node.max_local_storage_nodes: 3
transport.tcp.port: 9300
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
bootstrap.memory_lock: false
network.host: 172.21.0.23
http.port: 9200
discovery.seed_hosts: ["172.21.0.24", "172.21.0.25","172.21.0.23"]
cluster.initial_master_nodes: ["172.21.0.24","172.21.0.25","172.21.0.23"]
discovery.zen.minimum_master_nodes: 2

NODE 3:

cluster.name: demo
node.name: node3
node.data: true
node.master: true
node.ingest: true
node.max_local_storage_nodes: 3
transport.tcp.port: 9300
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
bootstrap.memory_lock: false
network.host: 172.21.0.25
http.port: 9200
discovery.seed_hosts: ["172.21.0.24", "172.21.0.25","172.21.0.23"]
cluster.initial_master_nodes: ["172.21.0.24","172.21.0.25","172.21.0.23"]
discovery.zen.minimum_master_nodes: 2

When I curl each node to see which nodes belong to the cluster, every node reports an incomplete list:

[root@bastion elastic-playbooks]# ansible elastic -m shell -a "curl http://{{ inventory_hostname }}:9200/_cat/nodes? --silent"
172.21.0.23 | CHANGED | rc=0 >>
172.21.0.23 16 9 0 0.00 0.01 0.05 cdhilmrstw * node2
172.21.0.24 | CHANGED | rc=0 >>
172.21.0.25 41 9 0 0.01 0.09 0.08 cdhilmrstw - node3
172.21.0.24 39 9 0 0.01 0.06 0.06 cdhilmrstw * node1
172.21.0.25 | CHANGED | rc=0 >>
172.21.0.23 17 9 0 0.00 0.01 0.05 cdhilmrstw - node2
172.21.0.25 44 9 0 0.01 0.08 0.08 cdhilmrstw * node3

How can this be happening? Am I missing some important option?

Another thing that puzzles me is that the cluster_uuid is different on each of the three nodes:

[root@bastion elastic-playbooks]# ansible elastic -m shell -a "curl http://{{ inventory_hostname }}:9200 --silent"

172.21.0.25 | CHANGED | rc=0 >>
{
  "name" : "node3",
  "cluster_name" : "demo",
  "cluster_uuid" : "BbwIsWoQTxGrhGoiSdPgLQ",
  "version" : {
    "number" : "7.10.2",
    "build_flavor" : "default",
    "build_type" : "rpm",
    "build_hash" : "747e1cc71def077253878a59143c1f785afa92b9",
    "build_date" : "2021-01-13T00:42:12.435326Z",
    "build_snapshot" : false,
    "lucene_version" : "8.7.0",
    "minimum_wire_compatibility_version" : "6.8.0",
    "minimum_index_compatibility_version" : "6.0.0-beta1"
  },
  "tagline" : "You Know, for Search"
}
172.21.0.23 | CHANGED | rc=0 >>
{
  "name" : "node2",
  "cluster_name" : "demo",
  "cluster_uuid" : "TJWYop8pR8mvlga4dVmf2g",
  "version" : {
    "number" : "7.10.2",
    "build_flavor" : "default",
    "build_type" : "rpm",
    "build_hash" : "747e1cc71def077253878a59143c1f785afa92b9",
    "build_date" : "2021-01-13T00:42:12.435326Z",
    "build_snapshot" : false,
    "lucene_version" : "8.7.0",
    "minimum_wire_compatibility_version" : "6.8.0",
    "minimum_index_compatibility_version" : "6.0.0-beta1"
  },
  "tagline" : "You Know, for Search"
}
172.21.0.24 | CHANGED | rc=0 >>
{
  "name" : "node1",
  "cluster_name" : "demo",
  "cluster_uuid" : "txrgSqiYRp-Oh-EtrZnWkw",
  "version" : {
    "number" : "7.10.2",
    "build_flavor" : "default",
    "build_type" : "rpm",
    "build_hash" : "747e1cc71def077253878a59143c1f785afa92b9",
    "build_date" : "2021-01-13T00:42:12.435326Z",
    "build_snapshot" : false,
    "lucene_version" : "8.7.0",
    "minimum_wire_compatibility_version" : "6.8.0",
    "minimum_index_compatibility_version" : "6.0.0-beta1"
  },
  "tagline" : "You Know, for Search"
}
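
Differing cluster_uuid values usually mean that each node has bootstrapped its own cluster instead of joining the others. Below is a minimal shell sketch for spotting that condition. The function name count_cluster_uuids is made up for illustration, and it assumes the JSON from each node's root endpoint is piped in on stdin.

```shell
# Count the distinct cluster_uuid values found in root-endpoint JSON on stdin.
# 1 means all nodes report the same cluster; anything higher means the nodes
# have formed separate clusters.
count_cluster_uuids() {
  grep -o '"cluster_uuid" *: *"[^"]*"' | sort -u | wc -l | tr -d ' '
}

# Hypothetical usage against the three hosts from this thread:
#   for h in 172.21.0.23 172.21.0.24 172.21.0.25; do
#     curl --silent "http://$h:9200"
#   done | count_cluster_uuids
```

Fed the three responses shown above, it would print 3; a correctly formed three-node cluster would print 1.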

Thanks in advance,

Try this. The node names used in the cluster.initial_master_nodes list must exactly match each node's node.name:

cluster.initial_master_nodes: ["node1", "node2", "node3"]

Hello Yassine,

First of all, thank you for your quick response!

I modified the configuration; it now looks like this:

[root@bastion elastic-playbooks]# ansible elastic -m shell -a 'cat /etc/elasticsearch/elasticsearch.yml' 
172.21.0.25 | CHANGED | rc=0 >>
cluster.name: demo
node.name: es3
node.data: true
node.master: true
node.ingest: true
#node.max_local_storage_nodes: 3
transport.tcp.port: 9300
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
bootstrap.memory_lock: false
network.host: 172.21.0.25
http.port: 9200
discovery.seed_hosts: ["172.21.0.23", "172.21.0.24","172.21.0.25"]
cluster.initial_master_nodes: ["es1", "es2","es3"]
discovery.zen.minimum_master_nodes: 2
172.21.0.24 | CHANGED | rc=0 >>
cluster.name: demo
node.name: es2
node.data: true
node.master: true
node.ingest: true
#node.max_local_storage_nodes: 3
transport.tcp.port: 9300
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
bootstrap.memory_lock: false
network.host: 172.21.0.24
http.port: 9200
discovery.seed_hosts: ["172.21.0.23", "172.21.0.24","172.21.0.25"]
cluster.initial_master_nodes: ["es1", "es2","es3"]
discovery.zen.minimum_master_nodes: 2
172.21.0.23 | CHANGED | rc=0 >>
cluster.name: demo
node.name: es1
node.data: true
node.master: true
node.ingest: true
#node.max_local_storage_nodes: 3
transport.tcp.port: 9300
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
bootstrap.memory_lock: false
network.host: 172.21.0.23
http.port: 9200
discovery.seed_hosts: ["172.21.0.23", "172.21.0.24","172.21.0.25"]
cluster.initial_master_nodes: ["es1", "es2","es3"]
discovery.zen.minimum_master_nodes: 2

The problem is the following:

[root@bastion elastic-playbooks]# ansible elastic -m shell -a "curl http://{{ inventory_hostname }}:9200/_cat/nodes? --silent"
172.21.0.24 | CHANGED | rc=0 >>
172.21.0.23  7 9 0 0.06 0.03 0.05 cdhilmrstw - es1
172.21.0.24 59 9 0 0.00 0.03 0.05 cdhilmrstw * es2
172.21.0.25  8 9 0 0.00 0.01 0.05 cdhilmrstw - es3
172.21.0.25 | CHANGED | rc=0 >>
{"error":{"root_cause":[{"type":"master_not_discovered_exception","reason":null}],"type":"master_not_discovered_exception","reason":null},"status":503}
172.21.0.23 | CHANGED | rc=0 >>
{"error":{"root_cause":[{"type":"master_not_discovered_exception","reason":null}],"type":"master_not_discovered_exception","reason":null},"status":503}

How is it possible that the other nodes, es1 and es3, cannot discover the master? Port 9300 is reachable: I already checked, and there is no firewall.

transport.port or transport.tcp.port?

Hello,

I tried the transport.port option, and the result is the following:

[root@bastion elastic-playbooks]# ansible elastic -m shell -a "curl http://{{ inventory_hostname }}:9200/_cat/nodes? --silent"
172.21.0.24 | CHANGED | rc=0 >>
172.21.0.24 27 9 1 0.06 0.07 0.06 cdhilmrstw * es2
172.21.0.25 | CHANGED | rc=0 >>
172.21.0.25 26 9 1 0.14 0.09 0.07 cdhilmrstw * es3
172.21.0.23 | CHANGED | rc=0 >>
172.21.0.23 25 9 1 0.05 0.06 0.06 cdhilmrstw * es1

Setting both options gave the same result.

Do I have to restart the Elasticsearch nodes one by one?

Thanks for your help,

Hello, I got it working by bootstrapping the nodes one by one.

This article pointed me in that direction:

I removed the discovery.zen.minimum_master_nodes option, as it seems to be deprecated.

I also had to remove the /var/lib/elasticsearch/nodes directory on each node, one node at a time.
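
The cleanup step above can be sketched as a small shell function. This is only an illustration under assumptions: the function name reset_node and the DRY_RUN switch are hypothetical, the service is assumed to be managed by systemd under the name elasticsearch (as with the RPM package), and stopping the node before deleting the directory is an assumption rather than something stated in this thread. Removing the nodes directory deletes all data held by that node, so it only makes sense on a cluster with nothing worth keeping.

```shell
# Data path taken from path.data in the configuration above.
DATA_DIR="${DATA_DIR:-/var/lib/elasticsearch}"

# Wipe the stale cluster state on ONE node so it can join a fresh cluster.
# DRY_RUN defaults to 1, so the commands are printed instead of executed.
reset_node() {
  if [ "${DRY_RUN:-1}" = "0" ]; then run=""; else run="echo"; fi
  $run systemctl stop elasticsearch    # assumed: stop the node first
  $run rm -rf "${DATA_DIR}/nodes"      # destructive: removes this node's data
  $run systemctl start elasticsearch   # the node starts with an empty state
}
```

Calling reset_node with no arguments prints the three commands for review. Running it as root with DRY_RUN=0 on one node at a time would execute them.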

Hope this helps anyone running into this issue.

Thanks for your help @ylasri
