When we scale our cluster down to half its node count, excluding the old IPs triggers a large number of recoveries. We run a 120-node cluster ingesting a total of 4M docs/second, and the latest index has 60 primaries and 60 replicas.
If a rollover happens while these recoveries are in progress, the new index starts yellow and stays yellow for a long time, until it is itself rolled over. The next rolled-over index then comes up yellow again until the recoveries complete. I also noticed that the shards remain unassigned for 5 minutes. The allocation explain API shows that the recoveries limit on the node has been breached.
I was wondering whether we could have a separate throttling limit for newly created indices. Their primaries are empty, so the replicas can be started quickly with minimal network bandwidth, unlike the other shards that are being moved. The primaries come up quickly because they are governed by a different throttling limit, node_initial_primaries_recoveries.
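For anyone trying to reproduce this, here is a rough sketch of how the throttled state can be pulled out of an allocation explain response. It assumes the response has been parsed into a dict and that the field names (`node_allocation_decisions`, `deciders`, `decision`, `explanation`) match the documented response shape. The sample payload, including the node name and explanation text, is hand-written for illustration and is not output from our cluster.

```python
from typing import Any, Dict, List


def throttled_nodes(explain: Dict[str, Any]) -> List[Dict[str, str]]:
    """Return one entry per node-level decider that answered THROTTLE.

    `explain` is assumed to be the parsed JSON body of
    GET _cluster/allocation/explain for an unassigned replica shard.
    """
    hits = []
    for node in explain.get("node_allocation_decisions", []):
        for decider in node.get("deciders", []):
            if decider.get("decision") == "THROTTLE":
                hits.append({
                    "node": node.get("node_name", "?"),
                    "decider": decider.get("decider", "?"),
                    "explanation": decider.get("explanation", ""),
                })
    return hits


# Hand-written sample, trimmed to the fields read above (hypothetical values).
sample = {
    "index": "logs-000042",
    "shard": 0,
    "primary": False,
    "current_state": "unassigned",
    "node_allocation_decisions": [
        {
            "node_name": "node-7",
            "node_decision": "throttled",
            "deciders": [
                {
                    "decider": "throttling",
                    "decision": "THROTTLE",
                    "explanation": "too many concurrent recoveries on this node",
                }
            ],
        },
        {"node_name": "node-8", "node_decision": "yes", "deciders": []},
    ],
}

for hit in throttled_nodes(sample):
    print(hit["node"], hit["decider"], hit["explanation"])
```

In our case every candidate node for the new index's replicas shows up in this kind of list while the scale-down recoveries are running.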
This impacts the availability of our new indices: any node going down during scale-down will turn the cluster red.
We also evaluated waiting for the index to become green before switching the alias, but it takes a few minutes for the existing recoveries to finish and the replicas to be assigned, and the delay depends on the network bandwidth and on the concurrent recoveries already in progress. This makes client-side handling of the rollover difficult.
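To make the client-side workaround concrete, this is roughly the wait-for-green step we evaluated, sketched in Python. It polls the cluster health API for a single index with `wait_for_status=green`. The base URL, index name, retry count, and timeout are placeholders, the helper names are hypothetical, and the HTTP 408 handling assumes that some versions answer a timed-out wait with that status code.

```python
import json
import urllib.error
import urllib.parse
import urllib.request
from typing import Any, Callable, Dict


def health_url(base_url: str, index: str, timeout: str = "30s") -> str:
    """Build the index-level cluster health URL that waits for green."""
    query = urllib.parse.urlencode({"wait_for_status": "green", "timeout": timeout})
    return f"{base_url.rstrip('/')}/_cluster/health/{urllib.parse.quote(index)}?{query}"


def fetch_json(url: str) -> Dict[str, Any]:
    """GET the URL and parse the JSON body, tolerating a 408 on wait timeout."""
    try:
        with urllib.request.urlopen(url) as resp:
            return json.load(resp)
    except urllib.error.HTTPError as err:
        if err.code == 408:  # assumed: timed-out wait still carries a health body
            return json.load(err)
        raise


def wait_for_green(
    base_url: str,
    index: str,
    attempts: int = 10,
    timeout: str = "30s",
    fetch: Callable[[str], Dict[str, Any]] = fetch_json,
) -> bool:
    """Return True once the index reports green, False after all attempts."""
    url = health_url(base_url, index, timeout)
    for _ in range(attempts):
        body = fetch(url)
        if body.get("status") == "green" and not body.get("timed_out", False):
            return True
    return False
```

A caller would switch the alias only when `wait_for_green("http://localhost:9200", "logs-000042")` returns `True`. With the recoveries throttled as described, this call blocks for minutes, which is exactly the difficulty.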
How do you suggest solving this problem?