New master is not getting selected

I have created a cluster of 3 nodes: 2 are both data and master-eligible, and 1 is a data-only node. Recently, due to a technical glitch, one of the master-eligible nodes went down, and throughout the outage we kept getting this log message:

elastic-server.json

{"type": "server", "timestamp": "2021-10-06T00:00:04,694-07:00", "level": "WARN", "component": "r.suppressed", "cluster.name": "elasticsearch-ME-Q", "node.name": "es_node_02_domain2", "message": "path: /_monitoring/bulk, params: {system_id=kibana, system_api_version=7, interval=10000ms}", "cluster.uuid": "j0LUZ******T0Sww", "node.id": "wm3gU*****2JHZEg" , 
"stacktrace": ["org.elasticsearch.cluster.block.ClusterBlockException: blocked by: [SERVICE_UNAVAILABLE/2/no master];",
"at org.elasticsearch.cluster.block.ClusterBlocks.globalBlockedException(ClusterBlocks.java:189) ~[elasticsearch-7.9.3.jar:7.9.3]"

log:

[2021-10-06T04:18:53,189][WARN ][o.e.c.c.ClusterFormationFailureHelper] [es_node_03_domain3] master not discovered yet: have discovered [{es_node_03_domain3}{Ngfx****femrA}{GS5E******HzQhA}{domain3.local}{<ip>:9300}{d}{xpack.installed=true, transform.node=false}, {es_node_02_domain2}{wm3g*****GS2JHZEg}{RyYK*****EFYDCjw}{domain2.local}{<ip>:9300}{dm}{xpack.installed=true, transform.node=false}, {es_node_01_domain1}{FxKq6C****IQCQ}{4RO****C0x9Q}{domain1.local}{<ip>:9300}{dilmrt}{ml.machine_memory=12884295680, ml.max_open_jobs=20, xpack.installed=true, transform.node=true}]; discovery will continue using [<ip>:9300, <ip>:9300] from hosts providers and [{es_node_02_domain2}{wm****GS2JHZEg}{RyY****EFYDCjw}{domain2.local}{<ip>:9300}{dm}{xpack.installed=true, transform.node=false}, {es_node_01_domain1}{FxKq6****xIQCQ}{4ROd****C0x9Q}{domain1.mefldc.local}{<ip>:9300}{dilmrt}{ml.machine_memory=12884295680, ml.max_open_jobs=20, xpack.installed=true, transform.node=true}] from last-known cluster state; node term 5467, last-accepted version 24861 in term 5467
[2021-10-06T04:18:57,487][WARN ][r.suppressed             ] [es_node_03_domain3] path: /_monitoring/bulk, params: {system_id=kibana, system_api_version=7, interval=10000ms}
org.elasticsearch.cluster.block.ClusterBlockException: blocked by: [SERVICE_UNAVAILABLE/2/no master];
	at org.elasticsearch.cluster.block.ClusterBlocks.globalBlockedException(ClusterBlocks.java:189) ~[elasticsearch-7.9.3.jar:7.9.3]

config:

cluster.name: elasticsearch-ME-Q
#
# ------------------------------------ Node ------------------------------------
#
# Use a descriptive name for the node:
#
node.name: es_node_01_domain1
#
# Define roles for the node from [master, data, ingest, ml, remote_cluster_client, transform, voting_only]
# refer: https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-node.html
#node.roles: [ master , data]
node.master: true
node.data: true
#
# Add custom attributes to the node:
#
#node.attr.rack: r1
#
# ----------------------------------- Paths ------------------------------------
#

#Snapshot repository path
path.repo: D:\elasticsearch-backups
#
# ----------------------------------- Memory -----------------------------------
#
# Lock the memory on startup:
#
bootstrap.memory_lock: true

# ---------------------------------- Network -----------------------------------
#
# Set the bind address to a specific IP (IPv4 or IPv6):
#
network.host: domain1.local
#
# Set a custom port for HTTP:
#
http.port: 9200
#
# For more information, consult the network module documentation.
#
#node.local: true #disable network
# --------------------------------- Discovery ----------------------------------
#
#discovery.type: single-node
# Pass an initial list of hosts to perform discovery when this node is started:
# The default list of hosts is ["127.0.0.1", "[::1]"]
#
discovery.seed_hosts: ["domain1.local", "domain2.local", "domain3.local"]

cluster.initial_master_nodes: ["es_node_01_domain1", "es_node_02_domain2"]
#
# to avoid split brain: (master-eligible nodes / 2) + 1
discovery.zen.minimum_master_nodes: 2

# ---------------------------------- Gateway -----------------------------------
#
# Block initial recovery after a full cluster restart until N nodes are started:
#
gateway.recover_after_time: 5m

The exact same settings are used on all three nodes, except for the roles and the node names.

Any help on why the second node was not elected as master?

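Electing a master requires a strict majority of the master-eligible nodes to be available. With 2 master-eligible nodes the strict majority is 2, so losing either one leaves the cluster without a quorum and no master can be elected. To tolerate one master-eligible node being offline you need at least 3 master-eligible nodes in the cluster, because the strict majority of 3 is still 2.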

So you mean all the nodes will have to be set to node.roles: [ master, data ]?
Or is it enough that the third node becomes master-eligible?

I would recommend making all nodes master and data.
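
As a rough sketch of what that could look like on the third node, assuming it keeps the legacy node.master / node.data settings shown in the posted config (7.9.3) and that it currently has node.master set to false, which is only inferred from the {d} role in the log:

```yaml
# elasticsearch.yml on es_node_03_domain3 (sketch; only the role lines change)
node.name: es_node_03_domain3

# Assumption: this was false before, since the log lists the node with role {d}.
node.master: true
node.data: true

# Equivalent newer syntax available from 7.9 onwards; use one style or the other, not both:
#node.roles: [ master, data ]
```

The node has to be restarted for the role change to take effect. Note that cluster.initial_master_nodes is only used when a brand-new cluster bootstraps, so it should not need changing for an existing cluster, and discovery.zen.minimum_master_nodes is ignored by 7.x, so the quorum is managed by the cluster itself.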
