Cluster nodes don't recognize each other (ES 7.7.1)

Hi

I want to build a 5-node ES cluster, but the nodes don't recognize each other. Could you help me configure this cluster correctly?

It looks as if a node doesn't recognize itself as a master-eligible node (node-3 is looking for a master-eligible node named node-3):

 [2021-01-12T14:55:48,544][WARN ][o.e.c.c.ClusterFormationFailureHelper] [**node-3**] master not discovered yet, this node has not previously joined a bootstrapped (v7+) cluster, and this node must discover master-eligible nodes [node-1, node-2, **node-3**] to bootstrap a cluster

Is that because discovery is searching on a different port (the node is on 9200, but it searches on 9300)?

discovery will continue using [172.20.57.65:9300, 127.0.0.1:9300] from hosts providers

Why does it search for nodes on 9300 instead of 9200?

Elasticsearch node config:

    cluster.name: MYAPP_ES_7
    node.name: node-1   # node-2 -3 -4 -5 for other nodes
    path.data: /elasticsearch/data/myapp_es7
    path.logs: /SD5/people/myapp/elasticsearch/elasticsearch-7.7.1/logs
    bootstrap.memory_lock: false
    network.host: xxxxxxxxxxxxx
    http.port: 9200    # 9201 9202 9203 9204 for other nodes
    discovery.seed_hosts: ["xxxxxxxxxxxxx"]
    cluster.initial_master_nodes: ["node-1", "node-2", "node-3"]

Logs:

    [2021-01-12T14:55:48,544][WARN ][o.e.c.c.ClusterFormationFailureHelper] [node-3] master not discovered yet, this node has not previously joined a bootstrapped (v7+) cluster, and this node must discover master-eligible nodes [node-1, node-2, node-3] to bootstrap a cluster: have discovered [{node-3}{AYBrgaSiQ0uZoQ0s4sJ6hw}{0ayIu41_QfGsPjFdRnB8Pw}{xxxxxxxxxxxxx}{172.20.57.65:9301}{dilmrt}{ml.machine_memory=211121090560, xpack.installed=true, transform.node=true, ml.max_open_jobs=20}]; discovery will continue using [172.20.57.65:9300, 127.0.0.1:9300] from hosts providers and [{node-3}{AYBrgaSiQ0uZoQ0s4sJ6hw}{0ayIu41_QfGsPjFdRnB8Pw}{xxxxxxxxxxxxx}{172.20.57.65:9301}{dilmrt}{ml.machine_memory=211121090560, xpack.installed=true, transform.node=true, ml.max_open_jobs=20}] from last-known cluster state; node term 0, last-accepted version 0 in term 0
    [2021-01-12T14:55:48,566][WARN ][o.e.c.c.ClusterFormationFailureHelper] [node-2] master not discovered yet, this node has not previously joined a bootstrapped (v7+) cluster, and this node must discover master-eligible nodes [node-1, node-2, node-3] to bootstrap a cluster: have discovered [{node-2}{dq9VCNiERJ-GjAaehzWoGQ}{HZ9kot15RBmubCOfG-B92w}{xxxxxxxxxxxxx}{172.20.57.65:9302}{dilmrt}{ml.machine_memory=211121090560, xpack.installed=true, transform.node=true, ml.max_open_jobs=20}]; discovery will continue using [172.20.57.65:9300, 127.0.0.1:9300] from hosts providers and [{node-2}{dq9VCNiERJ-GjAaehzWoGQ}{HZ9kot15RBmubCOfG-B92w}{xxxxxxxxxxxxx}{172.20.57.65:9302}{dilmrt}{ml.machine_memory=211121090560, xpack.installed=true, transform.node=true, ml.max_open_jobs=20}] from last-known cluster state; node term 0, last-accepted version 0 in term 0
    [2021-01-12T14:55:48,731][WARN ][o.e.c.c.ClusterFormationFailureHelper] [node-1] master not discovered yet, this node has not previously joined a bootstrapped (v7+) cluster, and this node must discover master-eligible nodes [node-1, node-2, node-3] to bootstrap a cluster: have discovered [{node-1}{W2m34wnPR3SiUZveBQkB_w}{740ABCRCR5KQ8c2FZVEtWg}{xxxxxxxxxxxxx}{172.20.57.65:9303}{dilmrt}{ml.machine_memory=211121090560, xpack.installed=true, transform.node=true, ml.max_open_jobs=20}]; discovery will continue using [172.20.57.65:9300, 127.0.0.1:9300] from hosts providers and [{node-1}{W2m34wnPR3SiUZveBQkB_w}{740ABCRCR5KQ8c2FZVEtWg}{xxxxxxxxxxxxx}{172.20.57.65:9303}{dilmrt}{ml.machine_memory=211121090560, xpack.installed=true, transform.node=true, ml.max_open_jobs=20}] from last-known cluster state; node term 0, last-accepted version 0 in term 0
    [2021-01-12T14:55:49,262][WARN ][o.e.c.c.ClusterFormationFailureHelper] [node-4] master not discovered yet, this node has not previously joined a bootstrapped (v7+) cluster, and this node must discover master-eligible nodes [node-1, node-2, node-3] to bootstrap a cluster: have discovered [{node-4}{nK7kqRc1SwWUkLLvtrrs-w}{Oss421FzT0qeoGSAGk7Vrw}{xxxxxxxxxxxxx}{172.20.57.65:9304}{dilmrt}{ml.machine_memory=211121090560, xpack.installed=true, transform.node=true, ml.max_open_jobs=20}]; discovery will continue using [172.20.57.65:9300, 127.0.0.1:9300] from hosts providers and [{node-4}{nK7kqRc1SwWUkLLvtrrs-w}{Oss421FzT0qeoGSAGk7Vrw}{xxxxxxxxxxxxx}{172.20.57.65:9304}{dilmrt}{ml.machine_memory=211121090560, xpack.installed=true, transform.node=true, ml.max_open_jobs=20}] from last-known cluster state; node term 0, last-accepted version 0 in term 0
    [2021-01-12T14:55:49,487][WARN ][o.e.c.c.ClusterFormationFailureHelper] [node-5] master not discovered yet, this node has not previously joined a bootstrapped (v7+) cluster, and this node must discover master-eligible nodes [node-1, node-2, node-3] to bootstrap a cluster: have discovered [{node-5}{KXvoyVvzTtmB_ZMZfv2g7A}{fQDxqin5QyOw1E6QyVrDhA}{xxxxxxxxxxxxx}{172.20.57.65:9305}{dilmrt}{ml.machine_memory=211121090560, xpack.installed=true, transform.node=true, ml.max_open_jobs=20}]; discovery will continue using [172.20.57.65:9300, 127.0.0.1:9300] from hosts providers and [{node-5}{KXvoyVvzTtmB_ZMZfv2g7A}{fQDxqin5QyOw1E6QyVrDhA}{xxxxxxxxxxxxx}{172.20.57.65:9305}{dilmrt}{ml.machine_memory=211121090560, xpack.installed=true, transform.node=true, ml.max_open_jobs=20}] from last-known cluster state; node term 0, last-accepted version 0 in term 0

Hi Guillaume,

Nodes will communicate internally using the transport protocol, which defaults to port 9300 if not specified.

I'd suggest you explicitly specify the transport.port for each node. And then specifically declare it on the cluster.initial_master_nodes and discovery.seed_hosts if it's not the default 9300. See: https://www.elastic.co/guide/en/elasticsearch/reference/current/important-settings.html#discovery-settings

Otherwise, discovery assumes the nodes are listening on port 9300 and it might not be the case. That is, if node-1 binds the transport protocol to port 9300, and node-2 started on the same host, it would take 9301, or the next 93xx available port.

If the instances run on different servers, they can actually all run on http.port: 9200 and transport.port 9300. Though it's also a good practice to explicitly declare those anyway. Otherwise, if the port is taken, the nodes would start on the next available port, and they might not be discoverable.
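To make the "next available port" behaviour concrete, here is a minimal Python sketch of binding to the first free port in a range. It is not Elasticsearch code, only an illustration: the helper name bind_first_free is hypothetical, and the 9300-9400 range is used because it is the default transport.port range.

```python
import socket


def bind_first_free(host, first_port, last_port):
    """Bind a TCP socket to the first free port in [first_port, last_port]."""
    for port in range(first_port, last_port + 1):
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            sock.bind((host, port))
        except OSError:
            # Port is already taken by another process, try the next one.
            sock.close()
            continue
        sock.listen()
        return sock, port
    raise RuntimeError("no free port in range")


if __name__ == "__main__":
    # Two "instances" on the same host end up on different ports.
    first, first_port = bind_first_free("127.0.0.1", 9300, 9400)
    second, second_port = bind_first_free("127.0.0.1", 9300, 9400)
    print(first_port, second_port)
    first.close()
    second.close()
```

The second instance cannot know in advance which port it will get, which is why a seed list that assumes 9300 may never reach it.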

E.g. if you wanted to set different ports:

Node 1

    node.name: node-1
    http.port: 9200
    transport.port: 9300
    cluster.initial_master_nodes: ["node-1:9300", "node-2:9301", "node-3:9302"]
    discovery.seed_hosts: ["node-1:9300", "node-2:9301", "node-3:9302", "node-4:9303", "node-5:9304"]

Node 2

    node.name: node-2
    http.port: 9201
    transport.port: 9301
    cluster.initial_master_nodes: ["node-1:9300", "node-2:9301", "node-3:9302"]
    discovery.seed_hosts: ["node-1:9300", "node-2:9301", "node-3:9302", "node-4:9303", "node-5:9304"]

You can have a look at the discovery and cluster formation settings and how the cluster is formed.

That's not right, cluster.initial_master_nodes should be a list of node names, not address/port pairs. The setting in the OP is correct.

@Imma Thank you for your answer. I tried changing transport.port to another port for all nodes, and also assigning a separate port to each node, but this didn't fix my issue.

@DavidTurner

Do you mean this config looks correct, which implies my problem could come from a server configuration issue?

I only mean that this line in your config is correct:

    cluster.initial_master_nodes: ["node-1", "node-2", "node-3"]

The rest of Imma's post is good advice, you should follow it.
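As a quick illustration of the difference, here is a small Python sketch that flags entries of cluster.initial_master_nodes that do not exactly match a node.name. The function name is hypothetical and the lists are copied from the configs in this thread; it is only a sanity check you could run by hand, not something Elasticsearch provides.

```python
def entries_not_matching_node_names(initial_master_nodes, node_names):
    """Return the entries that are not an exact node.name."""
    known = set(node_names)
    return [entry for entry in initial_master_nodes if entry not in known]


node_names = ["node-1", "node-2", "node-3", "node-4", "node-5"]

# Names only, as in the original post: nothing is flagged.
print(entries_not_matching_node_names(["node-1", "node-2", "node-3"], node_names))

# Name:port pairs: every entry is flagged, because none is a node name.
print(entries_not_matching_node_names(
    ["node-1:9300", "node-2:9301", "node-3:9302"], node_names))
```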

What's the value of network.host in your configuration?

If you are binding to local IPs like 192.168.0.1, 192.168.0.2, etc., you need to configure those in discovery.seed_hosts, that is, if the names node-1, node-2, etc. do not resolve to the right bound IP.

If not specified, network.host defaults to _local_ and instances will only bind to localhost, so they are not visible from other hosts. That assumes you are running the 5 instances on different hosts, of course.

We see the special value _site_ being used a lot, as it binds to any site-local addresses on the system. Though this will depend on your case.

I hope this helps.

@Imma I fixed my problem using your solution of defining one transport.port for each node and setting discovery.seed_hosts in host:transport.port format (instead of node.name:transport.port as you did, which looks like a copy/paste typo).

After those changes, I had to delete the contents of the path.data folder for the cluster to form correctly.

Node 1

    cluster.name: MYAPP_ES_7
    node.name: node-1
    network.host: xxxxxxxxxxxxx
    http.port: 9201
    transport.port: 9301
    discovery.seed_hosts: ["xxxxxxxxxxxxx:9301", "xxxxxxxxxxxxx:9302", ...]
    cluster.initial_master_nodes: ["node-1", "node-2", "node-3"]

Node 2

    cluster.name: MYAPP_ES_7
    node.name: node-2
    network.host: xxxxxxxxxxxxx
    http.port: 9202
    transport.port: 9302
    discovery.seed_hosts: ["xxxxxxxxxxxxx:9301", "xxxxxxxxxxxxx:9302", ...]
    cluster.initial_master_nodes: ["node-1", "node-2", "node-3"]
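For anyone setting up the same layout (several nodes on one host), here is a rough Python sketch that generates the per-node settings following the pattern above: one http.port and transport.port per node, host:transport.port entries in discovery.seed_hosts, and plain node names in cluster.initial_master_nodes. The function names and the host es-host.example are placeholders I made up, and the starting ports 9201/9301 are simply the ones used in this thread.

```python
import json


def node_settings(host, node_count, master_count=3,
                  first_http=9201, first_transport=9301):
    """Build one settings dict per node, all nodes sharing the same host."""
    names = [f"node-{i}" for i in range(1, node_count + 1)]
    # Seed hosts use host:transport.port, never node names.
    seed_hosts = [f"{host}:{first_transport + i}" for i in range(node_count)]
    settings = []
    for i, name in enumerate(names):
        settings.append({
            "cluster.name": "MYAPP_ES_7",
            "node.name": name,
            "network.host": host,
            "http.port": first_http + i,
            "transport.port": first_transport + i,
            "discovery.seed_hosts": seed_hosts,
            # Initial master nodes are node names only, without ports.
            "cluster.initial_master_nodes": names[:master_count],
        })
    return settings


def render(settings):
    """Render one settings dict as elasticsearch.yml-style lines."""
    lines = []
    for key, value in settings.items():
        # A JSON list is also a valid YAML flow sequence.
        text = json.dumps(value) if isinstance(value, list) else str(value)
        lines.append(f"{key}: {text}")
    return "\n".join(lines)


if __name__ == "__main__":
    for cfg in node_settings("es-host.example", 5):
        print(render(cfg))
        print()
```

Generating the files this way keeps the seed list identical on every node and avoids copy/paste mistakes in the port numbers.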

Thank you !
