Rolling upgrades, master nodes & voting_config_exclusions

Hi,

We are planning to upgrade our v7.8 cluster to v7.17, in a rolling upgrade.

The cluster has 22 nodes and all nodes have both Master and Data roles.

We're planning to leave the currently elected master node until last. Our concern is that if that node were to fail for any reason during the rolling upgrade, a newly upgraded v7.17 node could be elected master, and the not-yet-upgraded v7.8 nodes might then be unable to join the cluster.

So my main questions are:

  1. Is this something we actually need to worry about? (e.g. I was wondering whether software version is considered in an election in a mixed-version cluster, with preference given to the master-eligible nodes with the lowest version. If not, suggested enhancement! :) )

  2. If this concern is real, would adding nodes to the "voting_config_exclusions" just before they are upgraded be a good way to resolve it? At least for the first 10 or so nodes in the upgrade process, just to reduce risk...

Any other comments/suggestions welcome, thx!

Not really, at least not in the way you describe. If a v7.17 node is elected master and sees that there are still v7.8 nodes in the cluster, it will remain in a v7.8-compatible mode, so the remaining v7.8 nodes can still join. See these docs for more information.
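If you want to keep an eye on this during the upgrade, something like the following could help. It is only a rough sketch: it summarises the JSON you would get from `GET _cat/nodes?h=name,version,master&format=json` and tells you whether the cluster is still mixed-version and which version the elected master is running. The node names and patch versions in the sample are made up for illustration, and the function name is my own, not part of any Elasticsearch client.

```python
import json
from collections import Counter

# Hypothetical sample of `GET _cat/nodes?h=name,version,master&format=json`.
# Node names and patch versions are invented for illustration only.
SAMPLE = """
[
  {"name": "node-01", "version": "7.17.0", "master": "*"},
  {"name": "node-02", "version": "7.17.0", "master": "-"},
  {"name": "node-03", "version": "7.8.0",  "master": "-"}
]
"""


def summarize_versions(cat_nodes_json: str) -> dict:
    """Summarise node versions from _cat/nodes JSON output."""
    nodes = json.loads(cat_nodes_json)
    # Count how many nodes are running each version.
    counts = Counter(node["version"] for node in nodes)
    # In _cat/nodes the elected master is marked with "*" in the master column.
    elected = [node for node in nodes if node["master"] == "*"]
    return {
        "counts": dict(counts),
        "mixed": len(counts) > 1,
        "elected_master_version": elected[0]["version"] if elected else None,
    }


if __name__ == "__main__":
    print(summarize_versions(SAMPLE))
```

As long as `mixed` is true, the cluster still contains older nodes and the master stays in the compatible mode described above.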

You will only run into problems if the cluster ends up containing only v7.17 nodes, perhaps due to a network partition that drops all the remaining v7.8 nodes. That becomes a risk once you've upgraded more than half of the master-eligible nodes, and the simplest way to avoid problems in this area is to follow this guidance, particularly:

"However, it is good practice to limit the number of master-eligible nodes in the cluster to three."
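To put numbers on the "more than half" point, here is a small back-of-the-envelope sketch. It assumes, for simplicity, that every master-eligible node is a voter; in practice Elasticsearch manages the voting configuration itself and it may not contain every master-eligible node, so treat the exact threshold as approximate. The function names are mine.

```python
def majority(voters: int) -> int:
    """Smallest number of voters that is strictly more than half."""
    return voters // 2 + 1


def upgraded_nodes_can_elect_alone(voters: int, upgraded_voters: int) -> bool:
    """True once the upgraded nodes alone form a majority of the voters.

    From this point on, a partition that drops every old node could leave
    a cluster made up only of upgraded nodes.
    """
    return upgraded_voters >= majority(voters)


if __name__ == "__main__":
    # 22 matches the cluster in the question; 3 is the recommended setup.
    for voters in (22, 3):
        threshold = majority(voters)
        print(f"{voters} voters: risk window opens after upgrading {threshold} of them")
```

With 22 voters the window opens at the 12th upgraded node and stays open for the remaining 10 upgrades. With three dedicated master-eligible nodes it opens at the 2nd, so there is only one more master upgrade to get through.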

Hi @DavidTurner thanks a mill for the quick answer, appreciate it.

Also, we're aware our cluster config is not exactly optimal, for various reasons! We'll aim to take a look at dedicated node roles and reducing the master-eligible set after we complete this upgrade!
