Undo decommissioning of a node

On another, much smaller 1.7 cluster, I was able to exclude a node, watch it drain, set exclude._name back to "", and then see shards immediately get relocated back onto that node. So the exclusion filter itself is apparently not the problem.
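
For reference, this is roughly what that cycle looked like, sketched with the Python client (the host and node name are placeholders, and I'm assuming a 1.x-era elasticsearch-py where settings are passed as a request body):

```python
from elasticsearch import Elasticsearch

es = Elasticsearch(["http://localhost:9200"])  # placeholder host

# Exclude the node by name so its shards drain onto the rest of the cluster.
es.cluster.put_settings(body={
    "transient": {"cluster.routing.allocation.exclude._name": "data-node-3"}
})

# ... wait until nothing is left on data-node-3 ...

# Clear the exclusion; on the small cluster, shards started relocating
# back onto the node right away.
es.cluster.put_settings(body={
    "transient": {"cluster.routing.allocation.exclude._name": ""}
})
```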

On the other two, larger clusters (400 indices, 3000 shards, 12 nodes), I think something else is going on, possibly due to the forced awareness of AZs and/or the sheer size of the cluster in terms of shard count. Even with 12 data nodes and everything pretty well balanced AFAICT, one cluster is still constantly relocating/replicating shards. And it's not all in one direction either: sometimes a node receives a few hundred GB of data only to have another few hundred GB moved away immediately after. It's unclear to me how the rebalancing is being decided, or how long it will take to quiesce.
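
In case it helps to show how I'm watching the churn: a minimal polling sketch, again assuming the 1.x-era Python client and a placeholder host; the 30-second interval is arbitrary.

```python
import time
from elasticsearch import Elasticsearch

es = Elasticsearch(["http://localhost:9200"])  # placeholder host

while True:
    health = es.cluster.health()
    relocating = health["relocating_shards"]
    initializing = health["initializing_shards"]
    print("relocating=%d initializing=%d" % (relocating, initializing))

    # _cat/shards marks in-flight moves as RELOCATING, which at least shows
    # which indices and nodes are involved in the back-and-forth.
    for line in es.cat.shards(v=True).splitlines():
        if "RELOCATING" in line:
            print(line)

    if relocating == 0 and initializing == 0:
        break  # the cluster has (for the moment) stopped moving shards
    time.sleep(30)
```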

At any rate, this seems to be an issue with the rebalancing heuristics rather than with shard allocation filtering.