We have a cluster that was upgraded earlier this week from version 5.6.16 to version 6.8.13. The upgrade succeeded and everything was running as expected.
One day later, one of the data nodes reached the disk watermarks (low, then high) and the cluster started relocating shards, but we soon hit the flood_stage watermark.
While troubleshooting this issue we checked the disk allocation, and the result puzzled me:
    shards disk.indices disk.used disk.avail disk.total disk.percent node
       914      367.8gb     293gb     26.4gb    319.5gb           91 xxx-2
       941      344.4gb   281.5gb     10.6gb    292.1gb           96 xxx-3
       617      239.6gb   194.3gb      5.7gb    200.1gb           97 xxx-1
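To my understanding, disk.total is the size of the volume, disk.used is the overall usage (i.e. ES plus everything else on the volume), and disk.indices is the space used by ES, which I would expect to be a part of disk.used. Yet on every node disk.indices is larger than disk.used.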
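To make the inconsistency concrete, here is a quick arithmetic check on the figures from the output above. It is only a rough sketch in Python: the values are typed in by hand in GB, and the 0.2 GB tolerance is my own assumption to absorb rounding in the displayed numbers.

```python
# Figures copied by hand from the allocation output, in GB.
# (node, disk.indices, disk.used, disk.avail, disk.total)
rows = [
    ("xxx-2", 367.8, 293.0, 26.4, 319.5),
    ("xxx-3", 344.4, 281.5, 10.6, 292.1),
    ("xxx-1", 239.6, 194.3, 5.7, 200.1),
]


def check(rows, tolerance_gb=0.2):
    """Compare the columns against each other for every node.

    tolerance_gb is an assumed allowance for rounding in the displayed values.
    """
    report = []
    for node, indices, used, avail, total in rows:
        report.append({
            "node": node,
            # Does disk.used + disk.avail add up to disk.total?
            "used_plus_avail_matches_total": abs(used + avail - total) <= tolerance_gb,
            # How much larger is disk.indices than disk.used?
            "indices_minus_used_gb": round(indices - used, 1),
        })
    return report


for entry in check(rows):
    print(entry)
```

On all three nodes disk.used plus disk.avail matches disk.total to within rounding, so those three columns are consistent with each other. It is only disk.indices that does not fit: it exceeds disk.used by 74.8 GB, 62.9 GB and 45.3 GB respectively.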
Can someone see what I have misunderstood, or tell me whether this may be a known issue?
My assumption about the watermarks being reached is not that the data is growing, but rather that free space is eroding.