We have a hot/warm cluster setup where data moves from our hot nodes to our warm nodes after a set number of days. The disks on the warm nodes got too full, and a few reached the low watermark, which caused shard allocations to start occurring. We increased the size of the disks once about half of the warm nodes had started to reach the low watermark.
But since this issue started two days ago, we've constantly had 9 warm shard allocations in progress, which matches our cluster.routing.allocation.cluster_concurrent_rebalance setting of 9. Increasing the disk space doesn't seem to have stopped the constant allocations, and there doesn't seem to be a list of pending allocations that we can clear. Is there a way to check these, or clear this list? When I check the /_cat/allocation API, it is aware of the larger disk size.
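For context, the moves that are in flight at any given moment (not any queued ones, which is what we're really asking about) show up with state RELOCATING in the output of /_cat/shards. A minimal Python sketch of filtering that output is below. It assumes the request was GET /_cat/shards?h=index,shard,prirep,state,node; the function name and the sample rows (index names, node names, IPs) are made up for illustration.

```python
def relocating_shards(cat_shards_text):
    """Return (index, shard, prirep) for each row whose state is RELOCATING.

    Expects the plain-text body of
    GET /_cat/shards?h=index,shard,prirep,state,node
    where the fourth whitespace-separated column is the shard state.
    """
    moving = []
    for line in cat_shards_text.splitlines():
        fields = line.split()
        # Skip blank or short rows; keep only shards currently being moved.
        if len(fields) >= 4 and fields[3] == "RELOCATING":
            moving.append((fields[0], int(fields[1]), fields[2]))
    return moving


# Hypothetical sample output, not taken from our cluster.
SAMPLE = """\
logs-2019.06.01 0 p STARTED warm-1
logs-2019.06.01 1 r RELOCATING warm-2 -> 10.0.0.7 abc123 warm-5
logs-2019.06.02 0 p RELOCATING warm-3 -> 10.0.0.8 def456 warm-6
"""

if __name__ == "__main__":
    for index, shard, prirep in relocating_shards(SAMPLE):
        print(index, shard, prirep)
```

Run against the real output, the count of rows returned should line up with the 9 concurrent moves we keep seeing.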
The hot nodes are between 50% and 90% disk used, are balanced fine, and are not currently allocating shards. Our warm nodes are closer to 80-85% full and appear to be correctly balanced. The cluster is also now taking shards from warm nodes and putting them on other warm nodes with less disk space available. We're using ES 6.8.
We're using these cluster.routing settings:
cluster.routing.allocation.cluster_concurrent_rebalance = 9
cluster.routing.allocation.disk.threshold_enabled = true
cluster.routing.allocation.watermark.low = 0.93
cluster.routing.allocation.watermark.flood_stage = 0.97
cluster.routing.allocation.watermark.high = 0.93
cluster.routing.allocation.balance.index = .01f
cluster.routing.allocation.balance.shard = .01f
cluster.routing.allocation.enable = all
cluster.routing.allocation.awareness.attributes = aws_availability_zone
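As a rough sanity check on the numbers above, here is a small Python sketch that classifies a node's disk usage against the three watermark values we have configured. It assumes the values are fractions of disk used, and the boundary comparison (>=) is a simplification rather than Elasticsearch's actual logic. The function name `watermark_status` is made up for this example.

```python
# Watermark values copied from the settings listed above (fraction of disk used).
WATERMARKS = {
    "low": 0.93,
    "high": 0.93,
    "flood_stage": 0.97,
}


def watermark_status(used_fraction, watermarks=WATERMARKS):
    """Return the highest watermark that the given disk usage has reached.

    Simplified illustration only: checks the thresholds from most to
    least severe and returns "below_low" if none is reached.
    """
    if used_fraction >= watermarks["flood_stage"]:
        return "flood_stage"
    if used_fraction >= watermarks["high"]:
        return "high"
    if used_fraction >= watermarks["low"]:
        return "low"
    return "below_low"


if __name__ == "__main__":
    # Our warm nodes are currently around 80-85% full.
    for used in (0.80, 0.85):
        print(used, watermark_status(used))
```

Under these assumptions, warm nodes at 80-85% come out as below every threshold, which is why we don't think the disk watermarks alone explain the ongoing shard movement. Note also that because low and high are both 0.93, a node that reaches the low watermark has reached the high one at the same time.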