Safely repair (remove/add-back) a data node in cluster

David_Copperfield · January 1, 2022, 7:12am

We have an Elasticsearch cluster running with a few data nodes. Recently one of the data nodes developed disk issues that we needed to fix.

To preserve the indices, I excluded the faulty data node from the cluster with:

curl -XPUT P.P.P.P:9200/_cluster/settings -H 'Content-Type: application/json' -d '{
  "transient" :{
      "cluster.routing.allocation.exclude._ip" : "X.X.X.X"
   }
}';echo

This follows the Stack Overflow answer "How to remove node from elasticsearch cluster on runtime without down time".

Then I shut down the data node, fixed the disk, and powered it back on.

The problem is that, after the machine came back up, the node fails to rejoin the cluster.

Is it because of the transient exclude setting above? I tried a similar setting, just replacing exclude with include, but it had no effect.

Is there a way to disable/delete the transient setting above? Thanks.
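
For reference, a cluster setting can be cleared by assigning it null through the same _cluster/settings endpoint. The following is a minimal sketch, assuming the same placeholder host P.P.P.P:9200 as in the command above; ES_HOST, RESET_BODY and reset_exclusion are hypothetical names used only for illustration. Note that allocation filters such as exclude._ip only control where shards are placed, not whether a node may join the cluster, so clearing the setting may not by itself explain or fix a join failure.

```shell
# Placeholder host, as in the original command; override via the environment.
ES_HOST="${ES_HOST:-P.P.P.P:9200}"

# Assigning null to a setting removes it from the transient cluster settings.
RESET_BODY='{
  "transient": {
    "cluster.routing.allocation.exclude._ip": null
  }
}'

# Sends the reset request; nothing is sent until the function is called.
reset_exclusion() {
  curl -XPUT "$ES_HOST/_cluster/settings" \
    -H 'Content-Type: application/json' \
    -d "$RESET_BODY"
  echo
}
```

Calling reset_exclusion against a reachable node should remove the exclusion; a subsequent GET on _cluster/settings can be used to confirm that the key is gone.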

DavidTurner · January 1, 2022, 5:29pm

Failing to rejoin the cluster could mean a number of different things, and it's hard to help without knowing which one it is. What does it mean exactly here? Any relevant logs? Errors? API outputs?
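
As a rough illustration of the kind of API output that helps here, the sketch below collects the node list, cluster health, and current cluster settings from any node that is still in the cluster. It assumes the placeholder host P.P.P.P:9200 from the original post; ES_HOST, DIAG_ENDPOINTS and collect_diagnostics are hypothetical names. The log of the node that fails to join is usually the most informative piece and has to be taken from that machine separately.

```shell
# Placeholder host for a node that is currently in the cluster.
ES_HOST="${ES_HOST:-P.P.P.P:9200}"

# Read-only endpoints: node membership, cluster health, current settings.
DIAG_ENDPOINTS='/_cat/nodes?v
/_cluster/health?pretty
/_cluster/settings?pretty'

# Prints each endpoint's response under a small header line.
collect_diagnostics() {
  printf '%s\n' "$DIAG_ENDPOINTS" | while IFS= read -r path; do
    echo "== GET $path"
    curl -s -XGET "$ES_HOST$path"
    echo
  done
}
```

Running collect_diagnostics and posting its output together with the failing node's log shows whether the node is absent from the node list or has joined but holds no shards.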

system · January 29, 2022, 5:29pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
