How does a node behave with the failure of a data disk?

Does the node go offline?

or

Are only the shards in that path.data affected?

or

Something completely different?

Thanks.


I am curious what the consensus in the ES community is on disk failures. I've read some stories, both successes and horror stories, about replacing "failed" drives. Supposedly, if the cluster goes yellow and you have a failed disk, you can replace it, because yellow means the primary shards are still okay; but I do not have a definitive answer.

Hopefully someone smarter than I am will chime in. I hate depending on random Google Group answers or Stack Overflow responses without the green check. ;(

Adding to this, I came across java.lang.OutOfMemoryError: Java heap space when my disk was full, and all clients started throwing NoNodeAvailableException. I hope ES provides some way to catch this type of exception early so that we won't lose data.

Depends on how you have things set up. Are you pointing path.data at each disk individually? Are you using RAID, and if so, what level?

This is not really related to the topic at hand.

However, look for disk threshold watermarks in the docs; we do try to prevent this sort of thing.
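For a rough idea of how the watermarks work, here is a minimal Python sketch that checks a data path against them. It assumes the defaults in recent versions (85% low, 90% high, 95% flood stage, from the `cluster.routing.allocation.disk.watermark.*` settings); older releases may not have the flood stage at all, and your cluster may override any of these. The helper names are made up for illustration, and this only approximates what ES itself computes.

```python
import shutil

# Assumed defaults for cluster.routing.allocation.disk.watermark.{low,high,flood_stage}.
# Ordered from most to least severe so the first match is the highest one crossed.
WATERMARKS = [("flood_stage", 95.0), ("high", 90.0), ("low", 85.0)]


def used_percent(path):
    """Return the percentage of the filesystem holding `path` that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def watermark_crossed(percent_used):
    """Return the name of the highest watermark crossed, or None if below all of them."""
    for name, threshold in WATERMARKS:
        if percent_used >= threshold:
            return name
    return None
```

Calling `watermark_crossed(used_percent("/path/to/data"))` on each path.data directory (the path here is a placeholder) tells you which threshold, if any, that disk has passed. Above the low watermark ES stops allocating new shards to the node, above the high watermark it tries to move shards away, and at flood stage indices with a shard on that node are made read-only.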
