Elasticsearch 5; index size ~500GB, shard size ~45GB each.
I have a 120GB disk and only 2 of the shards of this index are allocated on this disk.
Currently, these replica shards are unassigned with reason NODE_LEFT. They attempt a PEER recovery, but allocation fails because of a DiskThresholdDecider decision:
: TraceLevel="WARN" [org.elasticsearch.cluster.routing.allocation.decider.DiskThresholdDecider] after allocating, node [ABC] would have less than the required 0b free bytes threshold (-6042180175 bytes free), preventing allocation
The disk currently has more than 40GB of free space, and the unassigned replica shards are already ~43-44GB in size and probably need to sync only a few MB from the primary to be up to date. Why is DiskThresholdDecider preventing allocation in this scenario?
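To make the numbers concrete, here is a minimal sketch of how I read the log message: the decider appears to subtract the full shard size from the node's free bytes and compare the result with the threshold, without crediting data that is already on disk. The function names and the round 40GB/45GB figures are my own illustration, not the actual Elasticsearch code or my exact values.

```python
GIB = 1024 ** 3  # bytes in one GiB


def free_bytes_after_shard(free_bytes: int, shard_size_bytes: int) -> int:
    """Free space the node would have left if the whole shard were copied onto it."""
    return free_bytes - shard_size_bytes


def would_block(free_bytes: int, shard_size_bytes: int, threshold_bytes: int = 0) -> bool:
    """True when the projected free space falls below the free-bytes threshold."""
    return free_bytes_after_shard(free_bytes, shard_size_bytes) < threshold_bytes


# Illustrative round numbers: ~40GB free, ~45GB shard, 0b threshold as in the log.
projected = free_bytes_after_shard(40 * GIB, 45 * GIB)
print(projected)                          # -5368709120
print(would_block(40 * GIB, 45 * GIB))    # True
```

With these assumptions the projected free space goes negative, which matches the shape of the message above even though the replica would only need to copy a small delta.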
- Why do we need free disk space >= shard size for peer recovery when only a few MB need to be recovered?
- The DiskThresholdDecider seems overly conservative when deciding whether a node has enough disk space for allocation. Is there any way to loosen these constraints?
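The only knobs I am aware of are the cluster-level disk allocation settings. Below is a sketch of the request body I would send to `PUT /_cluster/settings`. The helper name and the watermark values are placeholders I picked for illustration, and I am not sure that changing them is the right fix for this case.

```python
import json


def build_disk_threshold_settings(low="90%", high="95%", enabled=True):
    """Build a transient cluster-settings body for the disk allocation decider.

    The setting keys are the documented Elasticsearch ones; the default
    values here are illustrative placeholders, not recommendations.
    """
    return {
        "transient": {
            "cluster.routing.allocation.disk.threshold_enabled": enabled,
            "cluster.routing.allocation.disk.watermark.low": low,
            "cluster.routing.allocation.disk.watermark.high": high,
        }
    }


# Body to send with PUT /_cluster/settings
print(json.dumps(build_disk_threshold_settings(), indent=2))
```

Setting `enabled=False` would turn the decider off entirely, which seems risky on a disk this close to full.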