- Elasticsearch: 9.0.3 (ECK managed)
- Kubernetes: AKS
- Topology: 2 hot data nodes, 2 warm data nodes, 3 master nodes
- Storage: persistent volumes per data node
- Workload: APM traces (plus APM logs/metrics); data streams with rollover (~5 GB / ~8 h)
- ILM: hot → warm (after ~10 days), then delete after 180 days (see the policy sketch just below)
- Replicas: currently 0 on hot; set to 0 on warm temporarily to reduce pressure
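For context, a minimal ILM policy matching that lifecycle might look like the sketch below. The policy name is hypothetical, and the exact rollover conditions depend on your template; the empty warm actions still trigger the automatic migrate to the warm tier.

PUT _ilm/policy/apm-traces-policy
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_primary_shard_size": "5gb",
            "max_age": "8h"
          }
        }
      },
      "warm": {
        "min_age": "10d",
        "actions": {}
      },
      "delete": {
        "min_age": "180d",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}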
After setting replicas=0 to stabilize, one warm node’s disk keeps getting much fuller than the other:
node       disk.total   disk.used   disk.avail
es-warm-0  393.1gb      343.2gb     49.9gb
es-warm-1  393.1gb      298.9gb     94.1gb
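(For reference, per-node figures like these come straight from the cat allocation API:)

GET _cat/allocation?v&h=node,disk.total,disk.used,disk.avail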
What’s the recommended way to make the allocator prioritize disk usage so warm nodes converge on similar free space?
How I resolved it
I managed to balance disk usage across my nodes without overloading the JVM or hitting circuit breakers. Here’s what I did step by step:
- Throttle relocations first (avoid heap spikes/circuit breakers during moves)
PUT /_cluster/settings
{
  "transient": {
    "cluster.routing.allocation.cluster_concurrent_rebalance": "1",
    "cluster.routing.allocation.node_concurrent_incoming_recoveries": "1",
    "cluster.routing.allocation.node_concurrent_outgoing_recoveries": "1",
    "indices.recovery.max_bytes_per_sec": "40mb"
  }
}
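While shards relocate, one way to verify the throttles are holding is to watch active recoveries and the count of relocating shards:

GET _cat/recovery?v&active_only=true&h=index,shard,stage,source_node,target_node,bytes_percent
GET _cluster/health?filter_path=status,relocating_shards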
- Use absolute disk watermarks (react before disks are critically full)
When set as absolute byte values, the watermarks are free-space floors: a node trips a watermark when its free space drops below the value, so low must be the largest number and flood_stage the smallest. Adjust the GB values to your disk sizes.
PUT /_cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.disk.watermark.low": "25gb",
    "cluster.routing.allocation.disk.watermark.high": "20gb",
    "cluster.routing.allocation.disk.watermark.flood_stage": "10gb"
  }
}
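If shards still refuse to leave the fuller node after this, the allocation explain API shows which decider is blocking the move. A sketch, with placeholder index name and shard number:

GET _cluster/allocation/explain
{
  "index": "my-backing-index",
  "shard": 0,
  "primary": true
}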
- Balance by disk and shard count (keep free space and shard counts close)
PUT /_cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.balance.disk_usage": "0.60",
    "cluster.routing.allocation.balance.shard": "0.35",
    "cluster.routing.allocation.balance.index": "0.05"
  }
}
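One way to watch the rebalance converge is to track per-node shard counts alongside free space:

GET _cat/allocation?v&h=node,shards,disk.used,disk.avail&s=node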
Result: disk usage is now more even across nodes, shard counts per node are close, and the JVM stays stable (no circuit breaker trips during rebalancing).
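One housekeeping note: once the cluster has converged, it's worth reverting the transient throttles from the first step so recoveries return to their defaults. Setting a transient setting to null resets it:

PUT /_cluster/settings
{
  "transient": {
    "cluster.routing.allocation.cluster_concurrent_rebalance": null,
    "cluster.routing.allocation.node_concurrent_incoming_recoveries": null,
    "cluster.routing.allocation.node_concurrent_outgoing_recoveries": null,
    "indices.recovery.max_bytes_per_sec": null
  }
}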