Replacing Elasticsearch Node

Hello,

I have a 3-node Elasticsearch cluster running version 7.6.2 in a container environment. Each node runs in its own container on a separate host machine.

Each node in the cluster is both a master and data node.

I want to replace the host machine for each node, but I'm not completely sure how to handle replacing each node.

I use an EBS volume on each host machine, and I have a rolling update process that launches a new machine, detaches the volume from the host machine that is about to be shut down, and re-attaches it to the new machine. A drain process then moves the ES node from the old machine onto the new one.

I use the Autoscaling RollingUpdate pause time to delay the rotation of subsequent machines in the autoscaling group, so the new node has enough time to join the cluster.

However, I am struggling with keeping the cluster stable with this process.

Do I first have to use cluster.routing.allocation.exclude._ip to exclude the node?
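
For reference, a minimal sketch of how that setting would be applied through the cluster settings API, using only the Python standard library. The base URL and the IP address are placeholders, and the helper names are made up for illustration; this is not a recommendation that the exclusion is required for this workflow.

```python
import json
import urllib.request


def exclude_ip_body(ip):
    """Build a transient cluster-settings body that excludes the node at `ip`
    from shard allocation, so its shards are relocated elsewhere."""
    return {"transient": {"cluster.routing.allocation.exclude._ip": ip}}


def clear_exclusion_body():
    """Build a body that resets the exclusion (a null value removes the setting)."""
    return {"transient": {"cluster.routing.allocation.exclude._ip": None}}


def put_json(url, body):
    """Send `body` as JSON with an HTTP PUT and return the decoded JSON response."""
    req = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


if __name__ == "__main__":
    es = "http://localhost:9200"  # assumed address of any node in the cluster
    old_node_ip = "10.0.0.1"      # hypothetical IP of the node being replaced
    print(put_json(es + "/_cluster/settings", exclude_ip_body(old_node_ip)))
    # Once the replacement node has joined, remove the exclusion again:
    # print(put_json(es + "/_cluster/settings", clear_exclusion_body()))
```

Note that excluding a node makes the cluster copy its shards to the remaining nodes, which may be unnecessary work if the same data volume is re-attached to the replacement node.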

Would appreciate any guidance on the order of removing and replacing nodes.

Welcome to our community! :smiley:

This should work, as reallocation is delayed for a while after a node leaves the cluster. This means that if the time it takes to spin up the new node on the new host is shorter than the default delay, the cluster should see the node "return" and just start recovery.
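
The delay in question is controlled by the index setting index.unassigned.node_left.delayed_timeout, which defaults to 1m. If the new host takes longer than that to come up, one option is to raise the delay before the rotation and then wait for the cluster to go green before moving on to the next machine. A rough sketch under those assumptions, where the base URL, the 10m value, and the helper names are illustrative only:

```python
import json
import urllib.parse
import urllib.request


def delayed_timeout_body(timeout):
    """Build an index-settings body that changes how long Elasticsearch waits
    before reallocating shards of a node that has left the cluster."""
    return {"settings": {"index.unassigned.node_left.delayed_timeout": timeout}}


def health_url(base_url, status="green", timeout="60s"):
    """Build a cluster-health URL that blocks until `status` is reached
    or `timeout` expires."""
    query = urllib.parse.urlencode({"wait_for_status": status, "timeout": timeout})
    return f"{base_url}/_cluster/health?{query}"


def put_json(url, body):
    """Send `body` as JSON with an HTTP PUT and return the decoded JSON response."""
    req = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


if __name__ == "__main__":
    es = "http://localhost:9200"  # assumed address of any node in the cluster
    # Apply a longer delay to all existing indices before rotating a host.
    print(put_json(es + "/_all/_settings", delayed_timeout_body("10m")))
    # After the new node has joined, wait for green before the next rotation.
    with urllib.request.urlopen(health_url(es, timeout="120s")) as resp:
        print(json.load(resp)["status"])
```

Gating each rotation on the health check, rather than on a fixed pause time alone, avoids taking down a second node while the first is still recovering.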

If that's not happening, it'd help if you could elaborate on what you mean by struggling to keep the cluster stable.
