I have created a cluster of 3 nodes: 2 of them are combined master-eligible and data nodes, and 1 is a data-only node. Recently, due to a technical glitch, one of the master-eligible nodes went down, and throughout the outage we kept getting the log messages below.
elastic-server.json (from es_node_02_domain2):
{"type": "server", "timestamp": "2021-10-06T00:00:04,694-07:00", "level": "WARN", "component": "r.suppressed", "cluster.name": "elasticsearch-ME-Q", "node.name": "es_node_02_domain2", "message": "path: /_monitoring/bulk, params: {system_id=kibana, system_api_version=7, interval=10000ms}", "cluster.uuid": "j0LUZ******T0Sww", "node.id": "wm3gU*****2JHZEg" ,
"stacktrace": ["org.elasticsearch.cluster.block.ClusterBlockException: blocked by: [SERVICE_UNAVAILABLE/2/no master];",
"at org.elasticsearch.cluster.block.ClusterBlocks.globalBlockedException(ClusterBlocks.java:189) ~[elasticsearch-7.9.3.jar:7.9.3]"
log (from the data-only node, es_node_03_domain3):
[2021-10-06T04:18:53,189][WARN ][o.e.c.c.ClusterFormationFailureHelper] [es_node_03_domain3] master not discovered yet: have discovered [{es_node_03_domain3}{Ngfx****femrA}{GS5E******HzQhA}{domain3.local}{<ip>:9300}{d}{xpack.installed=true, transform.node=false}, {es_node_02_domain2}{wm3g*****GS2JHZEg}{RyYK*****EFYDCjw}{domain2.local}{<ip>:9300}{dm}{xpack.installed=true, transform.node=false}, {es_node_01_domain1}{FxKq6C****IQCQ}{4RO****C0x9Q}{domain1.local}{<ip>:9300}{dilmrt}{ml.machine_memory=12884295680, ml.max_open_jobs=20, xpack.installed=true, transform.node=true}]; discovery will continue using [<ip>:9300, <ip>:9300] from hosts providers and [{es_node_02_domain2}{wm****GS2JHZEg}{RyY****EFYDCjw}{domain2.local}{<ip>:9300}{dm}{xpack.installed=true, transform.node=false}, {es_node_01_domain1}{FxKq6****xIQCQ}{4ROd****C0x9Q}{domain1.mefldc.local}{<ip>:9300}{dilmrt}{ml.machine_memory=12884295680, ml.max_open_jobs=20, xpack.installed=true, transform.node=true}] from last-known cluster state; node term 5467, last-accepted version 24861 in term 5467
[2021-10-06T04:18:57,487][WARN ][r.suppressed ] [es_node_03_domain3] path: /_monitoring/bulk, params: {system_id=kibana, system_api_version=7, interval=10000ms}
org.elasticsearch.cluster.block.ClusterBlockException: blocked by: [SERVICE_UNAVAILABLE/2/no master];
at org.elasticsearch.cluster.block.ClusterBlocks.globalBlockedException(ClusterBlocks.java:189) ~[elasticsearch-7.9.3.jar:7.9.3]
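While there is no elected master, most APIs return the block above, but each node can still serve its own last-accepted cluster state when queried locally. As far as I know, the committed voting configuration (which sets the quorum needed to elect a master) can be inspected like this, with local=true keeping the request on the node being queried:

curl -s 'http://domain2.local:9200/_cluster/state/metadata?local=true&filter_path=metadata.cluster_coordination'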
config (elasticsearch.yml, from es_node_01_domain1):
cluster.name: elasticsearch-ME-Q
#
# ------------------------------------ Node ------------------------------------
#
# Use a descriptive name for the node:
#
node.name: es_node_01_domain1
#
# Define roles for the node from [master, data, ingest, ml, remote_cluster_client, transform, voting_only]
# refer: https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-node.html
#node.roles: [ master , data]
node.master: true
node.data: true
#
# Add custom attributes to the node:
#
#node.attr.rack: r1
#
# ----------------------------------- Paths ------------------------------------
#
#Snapshot repository path
path.repo: D:\elasticsearch-backups
#
# ----------------------------------- Memory -----------------------------------
#
# Lock the memory on startup:
#
bootstrap.memory_lock: true
# ---------------------------------- Network -----------------------------------
#
# Set the bind address to a specific IP (IPv4 or IPv6):
#
network.host: domain1.local
#
# Set a custom port for HTTP:
#
http.port: 9200
#
# For more information, consult the network module documentation.
#
#node.local: true #disable network
# --------------------------------- Discovery ----------------------------------
#
#discovery.type: single-node
# Pass an initial list of hosts to perform discovery when this node is started:
# The default list of hosts is ["127.0.0.1", "[::1]"]
#
discovery.seed_hosts: ["domain1.local", "domain2.local", "domain3.local"]
cluster.initial_master_nodes: ["es_node_01_domain1", "es_node_02_domain2"]
#
# to avoid split brain (master-eligible nodes / 2 + 1)
discovery.zen.minimum_master_nodes: 2
# ---------------------------------- Gateway -----------------------------------
#
# Block initial recovery after a full cluster restart until N nodes are started:
#
gateway.recover_after_time: 5m
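For what it's worth, the commented-out node.roles line above is the 7.x-style way of declaring the same roles. A sketch of the equivalent (as far as I know, the legacy node.master / node.data lines then have to be removed, since mixing the two styles is rejected at startup):

node.roles: [ master, data ]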
The exact same settings are on all three nodes, except for the node roles and the node names.
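For completeness, this is the kind of external health check that reflects the same problem (example call against one of the surviving nodes; while no master is elected it should come back as HTTP 503 with a master_not_discovered_exception):

curl -s 'http://domain2.local:9200/_cluster/health?pretty'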
Any help on why the second master-eligible node was not elected as master?