Snapshots not working

Hi,

I have installed and configured a multi-node Elasticsearch cluster on RHEL 6.9.

Master : 192.168.2.79

Data-Node1 : 192.168.2.80

Data-Node2 : 192.168.2.81

I have set path.repo: ["/home/es-backup/es_backup_basic"] and set the permissions on /home/es-backup/es_backup_basic to 0777 on all 3 nodes.

When I register the repository and then verify it with the commands below, verification fails:

[root@mcspm2 ~]# curl -X PUT "http://192.168.2.79:9200/_snapshot/my_unverified_backup?verify=false" -H 'Content-Type: application/json' -d'{"type": "fs","settings": {"location": "my_unverified_backup_location"}}'

{"acknowledged":true}

[root@mcspm2 ~]# curl -X POST "http://192.168.2.79:9200/_snapshot/my_unverified_backup/_verify"

{"error":{"root_cause":[{"type":"repository_verification_exception","reason":"[my_unverified_backup] [[A3w4wz7jSAG2nrzhEjLS0w, 'RemoteTransportException[[es-data2][192.168.2.81:9300][internal:admin/repository/verify]]; nested: RepositoryVerificationException[[my_unverified_backup] a file written by master to the store [/home/es-backup/es_backup_basic/my_unverified_backup_location] cannot be accessed on the node [{es-data2}{A3w4wz7jSAG2nrzhEjLS0w}{1iPxzhCOR8O77bc-x2mLpA}{192.168.2.81}{192.168.2.81:9300}]. This might indicate that the store [/home/es-backup/es_backup_basic/my_unverified_backup_location] is not shared between this node and the master node or that permissions on the store don't allow reading files written by the master node];'], [dwRRAbBcTEqJSQbG-bxc7A, 'RemoteTransportException[[es-data1][192.168.2.80:9300][internal:admin/repository/verify]]; nested: RepositoryVerificationException[[my_unverified_backup] a file written by master to the store [/home/es-backup/es_backup_basic/my_unverified_backup_location] cannot be accessed on the node [{es-data1}{dwRRAbBcTEqJSQbG-bxc7A}{rp1v9pBlTfWVl5h-6yiqCA}{192.168.2.80}{192.168.2.80:9300}]. This might indicate that the store [/home/es-backup/es_backup_basic/my_unverified_backup_location] is not shared between this node and the master node or that permissions on the store don't allow reading files written by the master node];']]"}],"type":"repository_verification_exception","reason":"[my_unverified_backup] [[A3w4wz7jSAG2nrzhEjLS0w, 'RemoteTransportException[[es-data2][192.168.2.81:9300][internal:admin/repository/verify]]; nested: RepositoryVerificationException[[my_unverified_backup] a file written by master to the store [/home/es-backup/es_backup_basic/my_unverified_backup_location] cannot be accessed on the node [{es-data2}{A3w4wz7jSAG2nrzhEjLS0w}{1iPxzhCOR8O77bc-x2mLpA}{192.168.2.81}{192.168.2.81:9300}]. 
This might indicate that the store [/home/es-backup/es_backup_basic/my_unverified_backup_location] is not shared between this node and the master node or that permissions on the store don't allow reading files written by the master node];'], [dwRRAbBcTEqJSQbG-bxc7A, 'RemoteTransportException[[es-data1][192.168.2.80:9300][internal:admin/repository/verify]]; nested: RepositoryVerificationException[[my_unverified_backup] a file written by master to the store [/home/es-backup/es_backup_basic/my_unverified_backup_location] cannot be accessed on the node [{es-data1}{dwRRAbBcTEqJSQbG-bxc7A}{rp1v9pBlTfWVl5h-6yiqCA}{192.168.2.80}{192.168.2.80:9300}]. This might indicate that the store [/home/es-backup/es_backup_basic/my_unverified_backup_location] is not shared between this node and the master node or that permissions on the store don't allow reading files written by the master node];']]"},"status":500}

[root@mcspm2 ~]#

Please suggest a solution.

Thanks

The repository must be a shared file system accessible by all nodes in the cluster. The error message indicates that this is not the case.
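A quick way to see whether a directory really is shared is to mimic what verification does: write a file on the master and look for it on the data nodes. The sketch below is only an illustration, not how Elasticsearch implements the check; the function names and the file name shared-check.txt are made up.

```shell
# Run on the master node: write a marker file containing a token
# and print the token so it can be checked on the other nodes.
write_marker() {
  local dir="$1"
  local token="marker-$$-$RANDOM"
  echo "$token" > "$dir/shared-check.txt" || return 1
  echo "$token"
}

# Run on each data node: succeeds only if the marker written on the
# master is visible in the same path and holds the same token.
check_marker() {
  local dir="$1" token="$2"
  [ -f "$dir/shared-check.txt" ] && [ "$(cat "$dir/shared-check.txt")" = "$token" ]
}
```

On the master you would run write_marker /home/es-backup/es_backup_basic, then run check_marker with the same path and the printed token on each data node. If the directory is just a local folder that exists separately on every machine, check_marker fails on the data nodes.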

Hi @Christian_Dahlqvist

So to back up cluster data, we must have a shared file system, right?

Is there any other way to back up all of my cluster data and restore it?

Thanks,

Yes, a shared file system is required for an fs repository. There are also plugins available that allow you to store snapshots in e.g. S3 or HDFS.
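An object-store repository is registered through the same _snapshot API as the fs one. Here is a rough sketch for S3, assuming the repository-s3 plugin is installed on every node and S3 credentials are already configured for Elasticsearch. The helper name s3_repo_body, the bucket my-es-snapshots and the repository name my_s3_backup are made up for illustration.

```shell
# Build the request body for registering an S3 snapshot repository.
# The bucket name is passed in; it must already exist.
s3_repo_body() {
  local bucket="$1"
  printf '{"type": "s3", "settings": {"bucket": "%s"}}\n' "$bucket"
}

# Usage (not executed here, it needs a running cluster with the plugin):
#   curl -X PUT "http://192.168.2.79:9200/_snapshot/my_s3_backup" \
#        -H 'Content-Type: application/json' \
#        -d "$(s3_repo_body my-es-snapshots)"
```

With this kind of repository no shared mount is needed, because every node talks to the bucket directly.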

Hi @Christian_Dahlqvist

Can a shared file system be configured without a SAN or NAS? If yes, how?

Thanks,

Not that I know of.

Hi @Christian_Dahlqvist,

We have now configured a shared file system.

We ran the command below:

GET /_snapshot/_all

The response is:

{
  "my_backup": {
    "type": "fs",
    "settings": {
      "location": "/home/demo/"
    }
  },
  "my_fs_backup": {
    "type": "fs",
    "settings": {
      "compress": "true",
      "location": "/home/demo/"
    }
  }
}

POST /_snapshot/my_fs_backup/_verify

The response is:

{
"error": {
"root_cause": [
{
"type": "repository_verification_exception",
"reason": "[my_fs_backup] [[jQ15JZVdQoyOPMBMLDa8bQ, 'RemoteTransportException[[es-data1][192.168.2.80:9300][internal:admin/repository/verify]]; nested: RepositoryVerificationException[[my_fs_backup] store location [/home/demo] is not accessible on the node [{es-data1}{jQ15JZVdQoyOPMBMLDa8bQ}{ZcJ5hVJtSEG1cgJORQ3kMQ}{192.168.2.80}{192.168.2.80:9300}{ml.machine_memory=33654394880, xpack.installed=true, ml.max_open_jobs=20, ml.enabled=true}]]; nested: AccessDeniedException[/home/demo/tests-zOCThU0ATX20BCZg5QonCw/data-jQ15JZVdQoyOPMBMLDa8bQ.dat];']]"
}
],
"type": "repository_verification_exception",
"reason": "[my_fs_backup] [[jQ15JZVdQoyOPMBMLDa8bQ, 'RemoteTransportException[[es-data1][192.168.2.80:9300][internal:admin/repository/verify]]; nested: RepositoryVerificationException[[my_fs_backup] store location [/home/demo] is not accessible on the node [{es-data1}{jQ15JZVdQoyOPMBMLDa8bQ}{ZcJ5hVJtSEG1cgJORQ3kMQ}{192.168.2.80}{192.168.2.80:9300}{ml.machine_memory=33654394880, xpack.installed=true, ml.max_open_jobs=20, ml.enabled=true}]]; nested: AccessDeniedException[/home/demo/tests-zOCThU0ATX20BCZg5QonCw/data-jQ15JZVdQoyOPMBMLDa8bQ.dat];']]"
},
"status": 500
}

Can you tell us why we are getting the above error?

Thanks,

It looks like the /home/demo/ path is not readable and writable from all nodes in the cluster. Is this really a shared file system?

Hi @Christian_Dahlqvist,

I have created a directory 'test' under /home/demo and put a 'sample.txt' under /home/demo/test/

I can see the content of the /home/demo/test/sample.txt from every node.

Master : 192.168.2.79

Data-Node1 : 192.168.2.80

Data-Node2 : 192.168.2.81

path.repo is set to /home/demo in every node's elasticsearch.yml file.

My storage is located at 192.168.2.82 under /home/elaback/

I can see sample.txt on every node under /home/demo/test/ and on 192.168.2.82 under /home/elaback/test/sample.txt

This is our observation. Please correct me if anything is wrong.

Thanks,

Can all nodes also write to this directory?

Hi @Christian_Dahlqvist,

Yes, all nodes can write to this directory.

I modified sample.txt from each node, one by one, and each modification was reflected in sample.txt on all the other nodes.

Thanks,

Are you testing this as the user Elasticsearch runs under?

Hi @Christian_Dahlqvist,

We logged in as the root user and modified sample.txt.

ls -ltr /home/demo/test/

-rw-r--r-- 1 nfsnobody nfsnobody 64 Sep 6 16:33 sample.txt

Thanks,

You need to be able to read and write from all nodes as the user Elasticsearch runs as, which cannot be root.
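The nfsnobody owner and the -rw-r--r-- mode in your listing suggest that writes made as root are being mapped to another account by the share, so a test done as root does not tell you what the Elasticsearch user can do. A minimal sketch of a read/write probe follows; the function name and probe file name are made up, and I am assuming the service account is called elasticsearch, which you should confirm on your nodes.

```shell
# Check that the *current* user can create, read back and delete a file
# in the given directory. Run it as the Elasticsearch user, not as root.
probe_repo_rw() {
  local dir="$1"
  local probe="$dir/probe-$$.tmp"   # hypothetical probe file name

  # Create and write the probe file (errors from the redirect are hidden)
  if ! { echo "probe" > "$probe"; } 2>/dev/null; then
    echo "FAIL: $(id -un) cannot write to $dir"
    return 1
  fi

  # Read the content back
  if [ "$(cat "$probe" 2>/dev/null)" != "probe" ]; then
    echo "FAIL: $(id -un) cannot read back $probe"
    rm -f "$probe"
    return 1
  fi

  # Clean up; a failure here means the user cannot delete files
  if ! rm -f "$probe"; then
    echo "FAIL: $(id -un) cannot delete $probe"
    return 1
  fi

  echo "OK: $(id -un) can read and write $dir"
}
```

One way to use it: save the function to a file (say probe.sh) with a last line of probe_repo_rw "$1", then on every node run sudo -u elasticsearch bash probe.sh /home/demo. It should print OK on all three nodes before you retry POST /_snapshot/my_fs_backup/_verify.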