How to fix a red cluster with reason no_valid_shard_copy after stopping data nodes?

After stopping some data nodes, my cluster went red. The reported reason is no_valid_shard_copy.

How can I fix this?

Update: should I use reroute?

POST _cluster/reroute?retry_failed

  1. How many data nodes do you have?
  2. Go to Index Management and locate the indices with Health = Red. What is the value of 'Replicas'?

  1. I removed 3 data nodes and got the issue.
  2. The value of replicas is 0.

If you have no replica shards, you need to move all shards off the nodes you are going to remove before actually removing them. If you do not, a number of primary shards will no longer be available to the cluster, resulting in a red state and lost data.

To fix this, you need to bring the nodes and their data back before following the process described in the docs I linked to.
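
The allocation explain API is one way to confirm why a particular shard is unassigned. A minimal sketch, assuming a hypothetical index named my-index whose primary shard 0 is unassigned (substitute one of your own red indices):

```console
# "my-index" and shard 0 are placeholders, not values from this thread
GET _cluster/allocation/explain
{
  "index": "my-index",
  "shard": 0,
  "primary": true
}
```

If the response reports can_allocate as no_valid_shard_copy, no node currently in the cluster holds a usable copy of that shard, so retry_failed on its own is unlikely to help until the node holding the data rejoins.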

Thank you a lot!

So I need to start the data nodes I turned off, and then update the cluster settings:

PUT _cluster/settings
{
  "transient" : {
    "cluster.routing.allocation.exclude._ip" : "10.0.0.1"
  }
}

Right?

How long you need to wait will depend on how much data and how many shards you have. If you enable a replica, you can remove one node at a time, as long as you allow Elasticsearch to rebalance after each one.
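
One possible way to check that a node has been drained, and to add a replica, is sketched below. The index name my-index and the 60s timeout are assumptions for illustration only, not values from this thread.

```console
# Shard count per node; the excluded node should eventually report 0 shards
GET _cat/allocation?v=true

# Returns once no shards are relocating, or when the timeout expires
GET _cluster/health?wait_for_no_relocating_shards=true&timeout=60s

# Optional: add one replica to an index ("my-index" is a placeholder)
PUT my-index/_settings
{
  "index": {
    "number_of_replicas": 1
  }
}
```

If the health response shows timed_out as true, relocation was still in progress when the timeout expired, and the request can simply be repeated before stopping the node.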

thanks, it works
