Different Cluster Behavior When Two Master-Eligible Nodes Are Stopped

I’m working with two different Elasticsearch clusters (version 8.17.5) and noticed different behavior when stopping master-eligible nodes. Here's the setup for both clusters:
Cluster A:
3 dedicated master nodes
3 dedicated data nodes
Cluster B:
3 dedicated master nodes
3 dedicated data nodes

In both clusters, I stop **2 out of 3** master-eligible nodes.

On Cluster A, the cluster remains functional — no issues are reported, and the remaining master-eligible node keeps the cluster running.
On Cluster B, the cluster becomes unavailable, with messages indicating the absence of a master.

This behavior confuses me because both clusters have the same architecture. According to quorum rules, I’d expect both clusters to require at least 2 master-eligible nodes to maintain quorum and elect a master.
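
(For reference, the quorum arithmetic as I understand it: a majority of an n-node voting configuration is floor(n/2) + 1, so with 3 voting nodes at least 2 must be available to elect a master, while a 1-node voting configuration needs only that single node.)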

My Questions:

  1. Why does Cluster A still function with only 1 master-eligible node running?

  2. Could this be due to differences in voting configuration, cluster state, or leftover voting exclusions?

  3. What’s the proper way to inspect and compare the master election configuration and coordination state between these clusters? (See the example requests after this list.)

  4. Is there a recommended way to ensure consistent and predictable master election behavior across multiple clusters?
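
For anyone reproducing this, a minimal set of requests for comparing the two clusters might look like the following (a sketch, assuming each cluster is reachable on localhost:9200; adjust the host and authentication for your environment):

  # Node roles and which node is the elected master (marked with * in the master column)
  curl -s 'http://localhost:9200/_cat/nodes?v&h=name,ip,node.role,master'

  # Committed voting configuration and any voting exclusions
  curl -s 'http://localhost:9200/_cluster/state/metadata?filter_path=metadata.cluster_coordination&pretty'

  # Overall cluster health
  curl -s 'http://localhost:9200/_cluster/health?pretty'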

If that is the case, I suspect your description of the cluster topology is incorrect. What is the output of the cat nodes API?

I checked it, and here is why this happened:

Master 1 is the elected master.
On master 2, I changed the node roles to coordinating-only and then restarted it.
On master 3, I changed the node roles to coordinating-only and then restarted it.

The cluster works correctly, and when we restart master 1, nothing happens and it keeps working as well.
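
A sketch of what each conversion step looked like (assuming package installs managed by systemd and a cluster reachable on localhost:9200; paths and the service manager may differ in your setup):

  # In /etc/elasticsearch/elasticsearch.yml on the node being converted,
  # an empty role list makes the node coordinating-only:
  #   node.roles: [ ]

  # Restart the node so the new roles take effect
  sudo systemctl restart elasticsearch

  # Confirm the cluster now reports the node as coordinating-only ('-' in node.role)
  curl -s 'http://localhost:9200/_cat/nodes?v&h=name,node.role,master'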

I am not sure I follow. Please provide the exact sequence of steps you performed and the output of the API I linked to, ideally at each stage if you can reproduce it.

10.58.5.3 16 89 0 0.08 0.08 0.09 - - as-se-stg-ees-master-3
10.58.5.2 14 89 0 0.06 0.06 0.08 - - as-se-stg-ees-master-2
10.58.5.1 60 98 3 0.66 0.21 0.12 m * as-se-stg-ees-master-1
10.58.5.6 63 98 6 0.32 0.31 0.36 d - as-se-stg-ees-data-3
10.58.5.4 56 95 17 0.48 0.42 0.37 d - as-se-stg-ees-data-1
10.58.5.5 17 94 8 0.18 0.31 0.34 d - as-se-stg-ees-data-2
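
(This looks like the default _cat/nodes output without the ?v header row; the columns are ip, heap.percent, ram.percent, cpu, load_1m, load_5m, load_15m, node.role, master and name, so the '-' under node.role for master-2 and master-3 shows they are now coordinating-only, and the '*' marks master-1 as the elected master.)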

When we have 3 master-eligible nodes and we stop two of them at once, the cluster becomes unavailable, which is expected due to lack of quorum.

However, if we instead change the roles of two of the master nodes to coordinating-only, one at a time, and restart them, the cluster continues to function with only one master-eligible node remaining.

So in this case, even though we end up with only one master-eligible node, the cluster still works — unlike the first scenario where stopping two masters caused the cluster to become unavailable.

Why do you think this is not correct?

A 6-node cluster with 1 master-eligible node is a valid topology. It might not be wise, but it is valid.

That is all expected, as you reconfigured the cluster before shutting down the nodes that used to be master-eligible. I do not understand what the problem is.

I thought that if a cluster is initialized with 3 master-eligible nodes, it would always require at least 2 of them to be available in order to stay functional.

You changed the cluster to only have one master-eligible node, which means that at that point it would behave exactly as if it had been configured that way from the start.

Thanks for your reply.

This is the key difference: you're bringing the nodes back into the cluster again, so Elasticsearch can tell that they're no longer master-eligible, which means it's safe to reconfigure the cluster to ignore their votes. If you just shut them down and don't start them up again then it cannot do that.
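
If you want to see this happening, you can poll the coordination metadata while the nodes are converted (a sketch, assuming the cluster is reachable on localhost:9200): the committed voting configuration starts out with three node IDs and, once the two former masters have rejoined as coordinating-only nodes, should shrink to the single remaining master-eligible node.

  # The committed voting configuration is a list of node IDs; electing a master
  # requires a majority of these nodes
  curl -s 'http://localhost:9200/_cluster/state/metadata?filter_path=metadata.cluster_coordination.last_committed_config&pretty'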
