Shard allocation fails during rebalancing

Update: We tested each of the 6 SSDs for 15 minutes

stress-ng --cpu 8 --dir 8 --filename 8 --hdd 8 --mmap 8 --readahead 8 --rename 8 --seek 8 --timeout 900

At least 2 disks had problems. See dmseg:

The stress-ng process was blocked from this moment on - CPU load and I/O went down immediately. So basically the same phenomenon.

We don't know if more disks are affected (it could be that the stress test didn't run long enough). But it would be very unlikely if several disks show the same problem at the same time, which behaved totally normal before. As I said, we only upgraded to Debian 11 and there were no problems before. So we will reinstall completely tomorrow but it seems that the problem is not caused by Elasticsearch.

1 Like