Cluster nodes not detecting each other (v7.2)

I've got six CentOS 7 servers with a fresh ES 7.2 install on each. I'm trying to get them all into a single cluster, but somehow none of them detects the other nodes.

firewalld is disabled and stopped on all servers, so I don't think it's a firewall issue (still possible, but unlikely). I'm able to ping and curl the other ES instances from each server.

Here is the current state of the elasticsearch.yml file:

# ======================== Elasticsearch Configuration =========================
# removed some lines to stay under the post character limit
# ---------------------------------- Cluster -----------------------------------
#
# Use a descriptive name for your cluster:
#
cluster.name: elasticsearch-cluster
#
# ------------------------------------ Node ------------------------------------
#
# Use a descriptive name for the node:
#
node.name: node-1
#
# Add custom attributes to the node:
#
#node.attr.rack: r1
#
# ----------------------------------- Paths ------------------------------------
#
# Path to directory where to store the data (separate multiple locations by comma):
#
path.data: /var/lib/elasticsearch
#
# Path to log files:
#
path.logs: /var/log/elasticsearch
#
# ----------------------------------- Memory -----------------------------------
#
# Lock the memory on startup:
#
#bootstrap.memory_lock: true
#
# Make sure that the heap size is set to about half the memory available
# on the system and that the owner of the process is allowed to use this
# limit.
#
# Elasticsearch performs poorly when the system is swapping the memory.
#
# ---------------------------------- Network -----------------------------------
#
# Set the bind address to a specific IP (IPv4 or IPv6):
#
network.host: 10.10.10.91
#
# Set a custom port for HTTP:
#
#http.port: 9200
#transport.port: 9300
#
# For more information, consult the network module documentation.
#
#CUSTOM NETWORK CLUSTER CONFIGURATION
#https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-network.html
#This specifies which network interface(s) a node should bind to in order to listen for incoming requests.
#network.bind_host: 10.10.10.91
#The publish host is the single interface that the node advertises to other nodes in the cluster, so that those nodes can connect to it.
#network.publish_host: 10.10.10.91
#
# --------------------------------- Discovery ----------------------------------
#
# Pass an initial list of hosts to perform discovery when this node is started:
# The default list of hosts is ["127.0.0.1", "[::1]"]
#
discovery.seed_hosts: ["10.10.10.91", "10.10.10.92", "10.10.10.93", "10.10.10.94", "10.10.10.95", "10.10.10.96"]
#
# Bootstrap the cluster using an initial set of master-eligible nodes:
#
cluster.initial_master_nodes: ["node-1", "node-2", "node-3", "node-4"]
#
# For more information, consult the discovery and cluster formation module documentation.
#
# ---------------------------------- Gateway -----------------------------------
#
# Block initial recovery after a full cluster restart until N nodes are started:
#
#gateway.recover_after_nodes: 3
#
# For more information, consult the gateway module documentation.
#
# ---------------------------------- Various -----------------------------------
#
# Require explicit names when deleting indices:
#
action.destructive_requires_name: true

The only variations between the nodes' config files are node.name and the host IP.
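For illustration, node-2's file would then differ from the one above only in these two lines (values inferred from the node names and IPs in the output below):

node.name: node-2
network.host: 10.10.10.92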

And the results of some curls from the node-5 server:
[root@gzlaaf05 elasticsearch]# curl -XGET 10.10.10.91:9200/_cat/nodes
curl: (7) Failed connect to 10.10.10.91:9200; No route to host
[root@gzlaaf05 elasticsearch]# curl -XGET 10.10.10.92:9200/_cat/nodes
10.10.10.92 3 55 0 0.00 0.01 0.05 mdi * node-2
[root@gzlaaf05 elasticsearch]# curl -XGET 10.10.10.93:9200/_cat/nodes
10.10.10.93 3 55 0 0.09 0.04 0.05 mdi * node-3
[root@gzlaaf05 elasticsearch]# curl -XGET 10.10.10.94:9200/_cat/nodes
10.10.10.94 3 55 0 0.00 0.01 0.05 mdi * node-4
[root@gzlaaf05 elasticsearch]# curl -XGET 10.10.10.95:9200/_cat/nodes
10.10.10.95 3 55 0 0.00 0.01 0.05 mdi * node-5
[root@gzlaaf05 elasticsearch]# curl -XGET 10.10.10.96:9200/_cat/nodes
10.10.10.96 3 55 0 0.02 0.03 0.05 mdi * node-6

I am aware of the issue with nodes 2 to 6 being unable to curl the node-1 server (probably a variation in the firewall config). This is something I will resolve, but I should still be able to form a cluster out of nodes 2 to 6 even if the node-1 server is unreachable. Is there anything in my configuration I'm not doing right? I should be good to go with just discovery.seed_hosts, but somehow each node forms a single-node cluster and auto-elects itself as master.
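(A quick way to confirm each node has really bootstrapped its own one-node cluster, rather than merely failing to see the others, is to compare the cluster_uuid field from the root endpoint; in a formed cluster, every node reports the same value:

curl -XGET 10.10.10.92:9200/
curl -XGET 10.10.10.93:9200/

Each response includes a "cluster_uuid" field; here, every node would show a different one.)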

I can provide server logs if asked; I'm limited to 7k chars in the initial post.

Any insight is welcome!

Does this note in the docs describe the situation you're in? If so, it also describes the solution.


This is exactly the case!

I followed the steps in the note at the bottom of that page (stopped all nodes, emptied the data folders so each node would forget the one-node cluster it had already bootstrapped, and started everything again), and I now have an (almost) full cluster!
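In case it helps anyone else, on each node the reset boiled down to roughly this (assuming the default systemd service name and the path.data from the config above; this wipes all data on the node, which was fine here since the cluster held nothing yet):

systemctl stop elasticsearch
rm -rf /var/lib/elasticsearch/*   # removes the stale single-node cluster state along with everything else
systemctl start elasticsearch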

[root@gzlaaf06 ~]# curl -XGET 10.10.10.92:9200/_cat/nodes?v
ip          heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
10.10.10.96            4          55   5    0.15    0.17     0.11 mdi       -      node-6
10.10.10.95            4          55   3    0.12    0.11     0.08 mdi       -      node-5
10.10.10.92            4          55   8    0.31    0.23     0.13 mdi       *      node-2
10.10.10.93            4          55   5    0.14    0.14     0.10 mdi       -      node-3
10.10.10.94            4          55   4    0.17    0.17     0.11 mdi       -      node-4

Now I still have to figure out why my 1st server is unreachable, but I doubt that's Elasticsearch-related!
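(For the "No route to host" symptom on node-1, a couple of quick checks along these lines usually narrow it down; a sketch, assuming plain CentOS 7 tooling:

# on node-1: confirm nothing is still filtering packets
systemctl status firewalld
iptables -L -n

# from any other node: probe both the HTTP and transport ports
nc -zv 10.10.10.91 9200
nc -zv 10.10.10.91 9300

On CentOS 7, "No route to host" is often an iptables REJECT rule answering with icmp-host-prohibited rather than an actual routing problem.)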

Thanks a lot for the quick feedback!
