Hi,
We had six nodes (all master-eligible) running in our cluster, and then three of them were removed, with their volumes gone as well. The three removed nodes had been excluded from shard allocation with the setting:
"cluster.routing.allocation.exclude._ip"
and had only one shard per node, belonging to the _security index.
After the nodes were removed, this failure appeared in the log:
elasticsearch security index is unavailable short circuiting retrieval of user
Settings for cluster discovery were as follows (legacy from v6):
discovery.zen.ping.unicast.hosts: "192.168.50.80:9300, 192.168.50.81:9300, 192.168.50.83:9300"
discovery.zen.minimum_master_nodes: 2
My issue now is that the cluster won't elect a master because the requirements aren't fulfilled. From the log on node es-1:
[<time..>][WARN ][o.e.c.c.ClusterFormationFailureHelper] [es-1] master not discovered or elected yet, an election requires at least 3 nodes with ids from [2IE4RVpNTfKpL5JQsbvPCQ, 4womI-u8TMS_lxDytQ0kGg, 8DDEaYnnQKGkK9ea2klnnw, DuIW_Q65QFeZWOzwcyKlXA, TxpUfaPDTrSCngTNuZ_Brg] and at least 2 nodes with ids from [4womI-u8TMS_lxDytQ0kGg, 8DDEaYnnQKGkK9ea2klnnw, TxpUfaPDTrSCngTNuZ_Brg], have discovered [{es-2}{TxpUfaPDTrSCngTNuZ_Brg}{X0wUWTMrTLm1tDE2JMHZ2A}{192.168.50.81}{192.168.50.81:9300}{xpack.installed=true}, {es-4}{8DDEaYnnQKGkK9ea2klnnw}{EtBJ1BycTBCTIANy3Pp_MA}{192.168.50.83}{192.168.50.83:9300}{xpack.installed=true}] which is not a quorum; discovery will continue using [192.168.50.81:9300, 192.168.50.83:9300] from hosts providers and [{es-1}{Vmau018eQWO3AjMzSuo8sQ}{3Ri3q7cvQEGV-p11ykEkCA}{192.168.50.80}{192.168.50.80:9300}{xpack.installed=true}] from last-known cluster state; node term 1087, last-accepted version 125480 in term 11
Same for node es-2:
[<time...>][WARN ][o.e.c.c.ClusterFormationFailureHelper] [es-2] master not discovered or elected yet, an election requires at least 3 nodes with ids from [2IE4RVpNTfKpL5JQsbvPCQ, 4womI-u8TMS_lxDytQ0kGg, 8DDEaYnnQKGkK9ea2klnnw, DuIW_Q65QFeZWOzwcyKlXA, TxpUfaPDTrSCngTNuZ_Brg] and at least 2 nodes with ids from [4womI-u8TMS_lxDytQ0kGg, 8DDEaYnnQKGkK9ea2klnnw, TxpUfaPDTrSCngTNuZ_Brg], have discovered [{es-1}{Vmau018eQWO3AjMzSuo8sQ}{3Ri3q7cvQEGV-p11ykEkCA}{192.168.50.80}{192.168.50.80:9300}{xpack.installed=true}, {es-4}{8DDEaYnnQKGkK9ea2klnnw}{EtBJ1BycTBCTIANy3Pp_MA}{192.168.50.83}{192.168.50.83:9300}{xpack.installed=true}] which is not a quorum; discovery will continue using [192.168.50.80:9300, 192.168.50.83:9300] from hosts providers and [{es-2}{TxpUfaPDTrSCngTNuZ_Brg}{X0wUWTMrTLm1tDE2JMHZ2A}{192.168.50.81}{192.168.50.81:9300}{xpack.installed=true}] from last-known cluster state; node term 1087, last-accepted version 125480 in term 11
and es-4:
[<time..>][WARN ][o.e.c.c.ClusterFormationFailureHelper] [es-4] master not discovered or elected yet, an election requires at least 3 nodes with ids from [2IE4RVpNTfKpL5JQsbvPCQ, 4womI-u8TMS_lxDytQ0kGg, 8DDEaYnnQKGkK9ea2klnnw, DuIW_Q65QFeZWOzwcyKlXA, TxpUfaPDTrSCngTNuZ_Brg] and at least 2 nodes with ids from [4womI-u8TMS_lxDytQ0kGg, 8DDEaYnnQKGkK9ea2klnnw, TxpUfaPDTrSCngTNuZ_Brg], have discovered [{es-1}{Vmau018eQWO3AjMzSuo8sQ}{3Ri3q7cvQEGV-p11ykEkCA}{192.168.50.80}{192.168.50.80:9300}{xpack.installed=true}, {es-2}{TxpUfaPDTrSCngTNuZ_Brg}{X0wUWTMrTLm1tDE2JMHZ2A}{192.168.50.81}{192.168.50.81:9300}{xpack.installed=true}] which is not a quorum; discovery will continue using [192.168.50.80:9300, 192.168.50.81:9300] from hosts providers and [{es-4}{8DDEaYnnQKGkK9ea2klnnw}{EtBJ1BycTBCTIANy3Pp_MA}{192.168.50.83}{192.168.50.83:9300}{xpack.installed=true}] from last-known cluster state; node term 1087, last-accepted version 125480 in term 11
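To make the requirement in these messages concrete, here is a minimal sketch of how I read them, assuming each listed set of node IDs is a voting configuration that needs a strict majority of its members to be reachable. The node IDs are copied from the log above; the function names and the labels config_a and config_b are my own and not anything from Elasticsearch.

```python
# Sketch only: models the "at least N nodes with ids from [...]" wording in the
# log as a strict-majority check per ID set. This is my interpretation of the
# message, not Elasticsearch's actual implementation.

def min_votes(voting_config):
    """Smallest number of members that forms a strict majority."""
    return len(voting_config) // 2 + 1

def has_majority(voting_config, reachable):
    """True if more than half of the configuration's members are reachable."""
    return len(voting_config & reachable) >= min_votes(voting_config)

# First ID set from the log ("at least 3 nodes with ids from [...]").
config_a = {
    "2IE4RVpNTfKpL5JQsbvPCQ",
    "4womI-u8TMS_lxDytQ0kGg",
    "8DDEaYnnQKGkK9ea2klnnw",
    "DuIW_Q65QFeZWOzwcyKlXA",
    "TxpUfaPDTrSCngTNuZ_Brg",
}

# Second ID set from the log ("at least 2 nodes with ids from [...]").
config_b = {
    "4womI-u8TMS_lxDytQ0kGg",
    "8DDEaYnnQKGkK9ea2klnnw",
    "TxpUfaPDTrSCngTNuZ_Brg",
}

# The three surviving nodes as reported in the log: es-1, es-2, es-4.
reachable = {
    "Vmau018eQWO3AjMzSuo8sQ",  # es-1
    "TxpUfaPDTrSCngTNuZ_Brg",  # es-2
    "8DDEaYnnQKGkK9ea2klnnw",  # es-4
}

# An election needs a majority in BOTH sets, per the "and" in the message.
can_elect = has_majority(config_a, reachable) and has_majority(config_b, reachable)

print(min_votes(config_a), has_majority(config_a, reachable))
print(min_votes(config_b), has_majority(config_b, reachable))
print(can_elect)
```

If this reading is right, only es-2 and es-4 appear in the first set, which is one short of the three it needs, and the ID of es-1 is not in either set at all.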
I don't understand why the election requires three nodes. Is the version 6 minimum_master_nodes setting not used at all?
If so, was the requirement of three nodes already enforced when the cluster had six nodes?
I have tried adding the setting
cluster.initial_master_nodes: [es-1, es-2]
but that doesn't make a difference.
Is it possible to reset cluster formation settings/requirements?