I am implementing a backup solution for our Elasticsearch cluster.
All the indices on the whole cluster (10 data nodes, 3 master nodes) are stored on the same NetApp FAS8000 network share.
NetApp has incredibly fast snapshotting. It is desirable to use it if it works as intended.
Do you think we can rely on snapshotting with NetApp, or should we use Elasticsearch's own snapshot API?
Will Elasticsearch be able to restore the backup from NetApp's file snapshot?
Any experiences with NetApp snapshotting?
You could, but you would have to restore the entire cluster to the point in time you want. You won't be able to restore just a node or an index, because a number of things happen when you take a snapshot in ES to tie all the shards together at that moment.
That should not be a problem, since the snapshot is only for disaster recovery.
Thanks for the reply
I would still prefer to use the Elasticsearch snapshot and restore feature. When you request a snapshot, it prevents all existing segments from being deleted until the snapshot is complete. This keeps the snapshot from being corrupted because a file it requires was deleted mid-copy (e.g. because it has been merged into a new segment and is no longer required). No matter how fast an external system's snapshotting is, I would think there will always be a window of time (however small) in which a file could be deleted mid-snapshot.
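For reference, here is a rough sketch of what the snapshot and restore calls could look like against the REST API, using only the Python standard library. The host `http://localhost:9200`, the repository name `netapp_backup`, the path `/mnt/es_backups`, and all helper function names are assumptions made up for illustration; adjust them to your setup. It also assumes a shared-filesystem (`fs`) repository whose location is listed under `path.repo` in `elasticsearch.yml` on every node.

```python
import json
import urllib.request

# Assumptions for illustration only; none of these come from the thread.
ES_URL = "http://localhost:9200"   # hypothetical cluster address
REPO = "netapp_backup"             # hypothetical repository name
REPO_PATH = "/mnt/es_backups"      # hypothetical mount; must be in path.repo


def build_request(method, path, body=None, base_url=ES_URL):
    """Build (but do not send) an HTTP request for the Elasticsearch REST API."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(base_url + path, data=data, method=method)
    if data is not None:
        req.add_header("Content-Type", "application/json")
    return req


def register_repo_request(repo=REPO, location=REPO_PATH):
    """PUT /_snapshot/<repo>: register a shared-filesystem repository."""
    body = {"type": "fs", "settings": {"location": location}}
    return build_request("PUT", f"/_snapshot/{repo}", body)


def create_snapshot_request(snapshot, repo=REPO):
    """PUT /_snapshot/<repo>/<snapshot>: snapshot the cluster and wait until done."""
    path = f"/_snapshot/{repo}/{snapshot}?wait_for_completion=true"
    return build_request("PUT", path)


def restore_request(snapshot, indices=None, repo=REPO):
    """POST /_snapshot/<repo>/<snapshot>/_restore: restore all or selected indices."""
    body = {"indices": indices} if indices else None
    return build_request("POST", f"/_snapshot/{repo}/{snapshot}/_restore", body)


def send(req):
    """Send a prepared request and decode the JSON response."""
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Against a live cluster you would call something like `send(register_repo_request())` once, then `send(create_snapshot_request("snap_1"))` on a schedule. Unlike a filesystem-level snapshot, `restore_request("snap_1", "some_index")` can bring back a single index, although an existing index with the same name has to be closed or deleted first.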
Thank you for your insight. I will follow your advice and use the Elasticsearch snapshot feature. Better to be safe than sorry.