Questions on elasticsearch cluster data backup & restore

ahrtr · July 28, 2018, 3:41pm

I have a couple of questions on cluster data backup & restore, thanks.

1. There are 3 nodes in the cluster. Is it OK to run the backup command below on any of the nodes?

PUT _snapshot/my_backup/snapshot_1

2. The three nodes in the elasticsearch cluster are in three different domains, and each domain has a separate shared storage system. The shared volume (storage system) in each domain is visible only to the nodes in the same domain. So what I can do is attach a different volume to each es node in each domain. But according to the documentation, "the shared filesystem path must be accessible from all nodes in your cluster". I can make sure each node has the same mount path, i.e. /mount/backup, but the backend volumes are actually different.

So my question is: can I back up & restore data in such a case?

Christian_Dahlqvist · July 28, 2018, 3:45pm

The shared filesystem used for snapshots must indeed be available to all nodes, so what you are describing will unfortunately not work. You might however be able to use an S3 backed repository instead if that is allowed.
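As a rough illustration of the requirement above, here is a minimal Python sketch that builds the REST requests for registering a shared-filesystem repository, verifying it, and taking the snapshot from the question. The repository name my_backup and the path /mount/backup come from the thread; the helper function names are hypothetical, and the sketch assumes the location is also listed under path.repo in elasticsearch.yml on every node. It only constructs the requests and does not contact a cluster.

```python
import json


def register_fs_repository(repo_name, location):
    """Build the request that registers a shared-filesystem ("fs") repository.

    Assumption: `location` is listed under path.repo on every node and
    points at the SAME underlying storage from each of them.
    """
    body = {"type": "fs", "settings": {"location": location}}
    return "PUT", f"/_snapshot/{repo_name}", json.dumps(body)


def verify_repository(repo_name):
    """Build the request that asks the cluster to verify the repository."""
    return "POST", f"/_snapshot/{repo_name}/_verify", None


def create_snapshot(repo_name, snapshot_name):
    """Build the snapshot request; it can be sent to any node in the cluster."""
    path = f"/_snapshot/{repo_name}/{snapshot_name}?wait_for_completion=true"
    return "PUT", path, None


if __name__ == "__main__":
    requests_to_send = (
        register_fs_repository("my_backup", "/mount/backup"),
        verify_repository("my_backup"),
        create_snapshot("my_backup", "snapshot_1"),
    )
    for method, path, body in requests_to_send:
        print(method, path, body or "")
```

With a different backend volume behind /mount/backup on each node, the verification step would be expected to fail, since the nodes cannot see the files the other nodes wrote.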

ahrtr · July 29, 2018, 3:53am

@Christian_Dahlqvist Thanks for the answer. I am not allowed to use AWS S3 or Azure cloud. Instead, I can only use the storage service provided by our own Cloud Infrastructure. So it seems that I have to implement a plugin for elasticsearch to support this. Is it feasible?

What do you think of question 1? I actually deployed elasticsearch in a kubernetes cluster and started 3 elasticsearch PODs. When I execute the backup command "PUT _snapshot/my_backup/snapshot_1" using the kubernetes service's cluster IP address, the command is passed to just one of the backend elasticsearch PODs, so only one elasticsearch instance receives the backup command. I am not sure whether this could cause any problems. Can you please clarify this? Thanks.

Christian_Dahlqvist · July 29, 2018, 7:45am

Any node in the cluster can receive any command. Did you verify that your three nodes successfully formed a cluster, e.g. through the _cat/nodes API? Did you set minimum_master_nodes to 2 to avoid any split-brain scenarios as described here?
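Both checks can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the sample node names es-0, es-1 and es-2 are hypothetical, the rows stand in for the parsed response of GET _cat/nodes?format=json&h=name,master, and the quorum formula is the usual (master-eligible nodes / 2) + 1 rule for the discovery.zen.minimum_master_nodes setting.

```python
def minimum_master_nodes(master_eligible_nodes):
    """Return the quorum size: a strict majority of master-eligible nodes."""
    if master_eligible_nodes < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible_nodes // 2 + 1


def cluster_looks_formed(cat_nodes_rows, expected_nodes):
    """Check a parsed _cat/nodes response for the expected shape.

    The cluster looks healthy when all expected nodes are listed and
    exactly one of them is marked as the elected master ("*").
    """
    elected = [row for row in cat_nodes_rows if row.get("master") == "*"]
    return len(cat_nodes_rows) == expected_nodes and len(elected) == 1


if __name__ == "__main__":
    # Hypothetical sample data, not output from a real cluster.
    sample_rows = [
        {"name": "es-0", "master": "*"},
        {"name": "es-1", "master": "-"},
        {"name": "es-2", "master": "-"},
    ]
    print("minimum_master_nodes:", minimum_master_nodes(3))
    print("cluster formed:", cluster_looks_formed(sample_rows, 3))
```

If each POD had formed its own single-node cluster instead, every response would list one node only and the second check would fail.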

ahrtr · July 29, 2018, 9:58am

@Christian_Dahlqvist Yes, I think the three nodes have already formed a cluster successfully. I recall that each elasticsearch instance wrote a log entry indicating that it had joined the cluster. But it's a good point to verify this using the "_cat/nodes" API. I will try this later, thanks for the info.

Do you think it is feasible to implement an elasticsearch plugin to support our own storage service, in the same way as has been done for AWS S3 and Azure cloud?

Christian_Dahlqvist · July 29, 2018, 10:01am

I do not know as I have never developed a plugin. You should however be able to use one of the existing plugins as a template/example, which should make it easier than creating one from scratch.

ahrtr · July 29, 2018, 10:03am

@Christian_Dahlqvist OK, that makes sense.

Thank you for your help!

