Theory question: How to recover data from a 3-node cluster after 2 nodes failed

Long time no see, but hello again.

We had a 3-node cluster running very well until it stopped running very well. The SSDs of 2 nodes failed within a span of 4 hours. This was in the middle of the night, so no action could be taken. We were left with 2 corrupted nodes and one node still up. We thought everything was fine because we had 3 nodes and every node had a replica of every index.

The problem was that after a restart the surviving node could not elect itself as master and could not start back up. No matter what we tried, we were not able to get it running. In the end we resorted to unsafe cluster bootstrapping to recover at least some of the data.
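For readers wondering why the last node refused to elect itself, here is a rough sketch of the majority arithmetic involved. It is a simplified model, not Elasticsearch's actual election code: it assumes all 3 nodes were master-eligible and still counted as voters after the failures, and the function names are made up for illustration.

```python
def majority(voting_nodes: int) -> int:
    """Smallest number of votes that is more than half of the voters."""
    return voting_nodes // 2 + 1


def can_elect_master(voting_nodes: int, nodes_available: int) -> bool:
    """True if the nodes that are still up can form a majority.

    Simplified model: assumes every node is master-eligible and that the
    set of voters did not shrink when the other nodes failed.
    """
    return nodes_available >= majority(voting_nodes)
```

Under those assumptions, `can_elect_master(3, 2)` is true but `can_elect_master(3, 1)` is false, which matches what we saw: one healthy node out of three is not enough to elect a master, even though it holds a copy of every index.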

But this can't be the "best" way. So what is the actual best practice when 2 of 3 nodes fail? What is the best way to get the data back and get the cluster running again? Or is everyone just hoping that this never happens?

Thanks for your ideas :)

What version are you on?

7.17.3

Was there any correlation between the failures? E.g. were they running in physical proximity so they might have suffered similar damage due to heat or vibration or some other environmental effect? Were they the same model of drive? The same manufacturing batch perhaps?

There's no watertight protection against multiple failures so it really depends how paranoid you want to be. Physically separating the nodes helps. Decorrelating their hardware helps. RAID helps, especially on the master nodes - dedicated masters don't need much storage but they do need it to be reliable. But ultimately it's all about probabilities and you can still be extraordinarily unlucky. In that case, the manual recommends restoring from a recent snapshot:

If you can’t start enough nodes to form a quorum, start a new cluster and restore data from a recent snapshot.
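As a rough illustration of the restore step, the sketch below builds (but does not send) a request to the snapshot restore API, `POST /_snapshot/<repository>/<snapshot>/_restore`. The helper name, the base URL, the repository and snapshot names, and the index pattern are all hypothetical placeholders. It also assumes the snapshot repository has already been registered on the new cluster and that no authentication is needed.

```python
import json
import urllib.request


def build_restore_request(base_url, repository, snapshot, indices):
    """Build a restore request for the given snapshot without sending it.

    All arguments are placeholders supplied by the caller; nothing here
    is specific to the cluster discussed in this thread.
    """
    url = f"{base_url}/_snapshot/{repository}/{snapshot}/_restore"
    body = {
        # Index names or patterns to restore, e.g. "logs-*" (hypothetical).
        "indices": indices,
        # Do not overwrite the new cluster's own cluster-wide state.
        "include_global_state": False,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

A call such as `build_restore_request("http://localhost:9200", "my_backup", "snapshot_1", "logs-*")` returns a request object that could then be passed to `urllib.request.urlopen` against the new cluster. Restoring a narrow index pattern rather than everything is a cautious choice here, since indices that already exist on the fresh cluster can conflict with the restore.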
