Hey, I'm setting up an Elasticsearch cluster and want to make sure I account for failure scenarios. I understand that Curator can easily snapshot to S3 (I'm running on AWS), which is great. I was considering running it as a cron job on every node every N minutes, passing the --master-only flag so that it only actually runs on the elected master node. However, I'm not aware of an easy way to auto-restore when a node fails, assuming that node's data is not replicated. What is best practice here?
And while we're at it, what are some good ways to mitigate snapshot corruption?
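
For concreteness, here is a rough Python sketch of the master-only snapshot job I have in mind, talking to the REST API directly instead of going through Curator. The URL, the repository name (s3_backups), and the function names are all placeholders I made up, and it assumes the S3 repository has already been registered. It is meant to show the idea, not a finished script.

```python
import json
import urllib.request
from datetime import datetime, timezone

ES_URL = "http://localhost:9200"  # assumption: each node is reachable locally
REPO = "s3_backups"               # hypothetical, already-registered S3 repository


def get_json(path):
    """GET a path from the local node and decode the JSON response."""
    with urllib.request.urlopen(ES_URL + path) as resp:
        return json.load(resp)


def is_local_master(local_info, cluster_state):
    """True if the single node in local_info is the elected master."""
    local_ids = list(local_info.get("nodes", {}))
    return len(local_ids) == 1 and local_ids[0] == cluster_state.get("master_node")


def snapshot_name(now):
    """Build a lowercase, timestamped snapshot name."""
    return now.strftime("snap-%Y%m%d-%H%M%S")


def snapshot_if_master():
    """Start a snapshot, but only when this node is the elected master."""
    local_info = get_json("/_nodes/_local")
    cluster_state = get_json("/_cluster/state/master_node")
    if not is_local_master(local_info, cluster_state):
        return None  # some other node is master; do nothing here
    name = snapshot_name(datetime.now(timezone.utc))
    req = urllib.request.Request(f"{ES_URL}/_snapshot/{REPO}/{name}", method="PUT")
    urllib.request.urlopen(req).close()
    return name
```

The cron entry on each node would just call snapshot_if_master() every N minutes, so only whichever node is currently master starts a snapshot.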
Thanks!