Elasticsearch hosts upgrade - options

Hello,

We have 3 Master nodes and 5 data nodes in our cluster.

The VMs hosting the entire cluster are undergoing an upgrade which will involve downtime (a few hours) and also an IP change for the 3 Master nodes and 1 data node.

We have the option of doing this upgrade piecemeal and are thinking of getting the whole exercise done in 2 phases.

What would be the best option to pick when it comes to selecting the nodes?

I was thinking of the following:

1st Phase

1 Master node and 3 data nodes

2nd Phase

2 Master nodes and 2 data nodes

Please guide.

Thanks

I would say that you will probably need to do this in more steps since the downtime can take hours.

The safest option is to exclude the data node from allocation before shutting it down.

For example:

PUT _cluster/settings
{
  "persistent" : {
    "cluster.routing.allocation.exclude._name" : "node-name"
  }
}

After you run that request, the shards will start to move to the other nodes. Once the node is empty, you can shut it down.
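You can watch the node drain with something like the following (just a convenience check; the v and h parameters only control the output columns of the _cat/allocation API):

GET _cat/allocation?v&h=node,shards,disk.indices

When the shards column for the excluded node reaches 0, it is safe to shut it down.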

When the upgrade is finished and the node is back up, you need to clear the allocation setting to allow the node to receive shards again.

PUT _cluster/settings
{
  "persistent" : {
    "cluster.routing.allocation.exclude._name" : null
  }
}

When the shards are reallocated, you can repeat the process for the other data nodes.
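You can follow the relocation progress with the cluster health API, for example (filter_path just trims the response to the relevant fields):

GET _cluster/health?filter_path=status,relocating_shards,unassigned_shards

Once relocating_shards drops back to 0, the data has settled and you can move on to the next node.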

If your master nodes are master-only, you do not need to exclude them from allocation and can just shut the node down, but you need to keep at least two of the three master nodes up at all times so the cluster can still elect a master.
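If it helps, you can also check which node is currently the elected master before each restart, since taking that one down will trigger a brief re-election (just a convenience, not required):

GET _cat/master?v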

thanks a lot! @leandrojmp

Can the "exclude data node" command take 2 nodes at a time (comma separated) as below:

PUT _cluster/settings
{
  "persistent" : {
    "cluster.routing.allocation.exclude._name" : ["node1-name", "node2-name"]
  }
}

Yes, but it is not an array; it is just a comma-separated string.

"node1-name, node2-name"

Also, to do that you need to make sure that the remaining nodes have enough free space to receive the data from the two nodes.
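You can check the free space per node with something like this (column names per the _cat/nodes API):

GET _cat/nodes?v&h=name,disk.used_percent,disk.avail,disk.total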

thank you!

I am trying this on one node in a lower environment. It has been running for almost 2.5 hours now and ~40% of the shards still remain to be moved.

Yeah, it is normal, it can take time.

Thanks. It took 8 hours for 1512 shards to migrate from a node that had both roles (Master and data).

Would it take the same or similar time if I choose to do 2 hosts/nodes together, considering I have disk space available? These 2 nodes are dedicated data nodes.

How many shards do you have in the cluster? What is the average shard size?

Hello Christian. 1465 shards with an average size of 1.3 GB.

I calculated it by executing the following and taking the average of the result. Hope this is the right way:

GET _cat/shards?v=true&h=index,prirep,shard,store&s=prirep,store&bytes=gb&s=store:desc

Thanks
