Create repository throws error on creation

I am trying to take a snapshot of an Elasticsearch cluster. The design is the following: there are 3 VMs, each running 1 master, 1 data and 1 client node in Docker containers. Each VM has a volume attached for storage. So the cluster has 3 master nodes, 3 client nodes, 3 data nodes and 3 volumes.

After reading the documentation I created a separate backup volume and attached it to one of the VMs. I then set up an NFS share between all 3 VMs that stores its data on the backup volume, and modified the cluster so that the shared NFS directory is mounted as a volume in every node of the cluster.

So now each VM has the following:

VM1:

drwxr-xr-x  16 root   root     3560 Jul 24 10:30 dev
drwxr-xr-x   2 nobody nogroup  4096 Jul 24 11:49 elastic-backup
drwxr-xr-x  97 root   root     4096 Jul 24 14:04 etc
drwxr-xr-x   5 root   root     4096 Apr 27 12:53 home

VM2:

drwxr-xr-x   2 root   root     4096 Jul 24 13:52 bin
drwxr-xr-x   3 root   root     4096 Jul 24 12:09 boot
drwxr-xr-x   5 root   root     4096 Jan 27 16:41 data
drwxr-xr-x  16 root   root     3580 Jul 24 11:48 dev
drwxr-xr-x   2 nobody nogroup  4096 Jul 24 11:49 elastic-backup

VM3:

drwxr-xr-x   3 root   root     4096 Jul 24 15:28 boot
drwxr-xr-x   5 root   root     4096 Jan 27 16:41 data
drwxr-xr-x  16 root   root     3560 Jul 24 10:30 dev
drwxr-xr-x   2 nobody nogroup  4096 Jul 24 15:34 elastic-backup

When I create a file in it, I can see it and modify it, and the change is visible from each VM.

Elasticsearch docker nodes:

drwxr-xr-x 1 elasticsearch elasticsearch   4096 May 15  2018 config
drwxr-xr-x 4 elasticsearch elasticsearch   4096 Jul 23 12:15 data
drwxr-xr-x 2 elasticsearch elasticsearch   4096 Jul 24 15:08 elastic-backup

Each Elasticsearch Docker node has the same directory mounted. I can see all the files from each node.

The problem is that whenever I try to create a snapshot repository I get the following error:

Call:

PUT /_snapshot/elastic-backup-1
{
  "type": "fs",
  "settings": {
    "location": "/usr/share/elasticsearch/elastic-backup"
  }
}

Error:

{
  "error": {
    "root_cause": [
      {
        "type": "repository_verification_exception",
        "reason": "[elastic-backup-1] [[some-id, 'RemoteTransportException[[master-2][VM2-ip][internal:admin/repository/verify]]; nested: RepositoryVerificationException[[elastic-backup-1] a file written by master to the store [/usr/share/elasticsearch/elastic-backup] cannot be accessed on the node [{master-2}{some-id}{some-id}{VM2-ip}{VM2-ip}{zone=AZ2}]. This might indicate that the store [/usr/share/elasticsearch/elastic-backup] is not shared between this node and the master node or that permissions on the store don't allow reading files written by the master node];'], [some-id, 'RemoteTransportException[[data-2][VM2-ip][internal:admin/repository/verify]]; nested: RepositoryVerificationException[[elastic-backup-1] a file written by master to the store [/usr/share/elasticsearch/elastic-backup] cannot be accessed on the node [{data-2}{some-id}{some-id}{VM2-ip}{VM2-ip}{zone=AZ2}]. This might indicate that the store [/usr/share/elasticsearch/elastic-backup] is not shared between this node and the master node or that permissions on the store don't allow reading files written by the master node];'], [some-id, 'RemoteTransportException[[data-1][VM1-ip][internal:admin/repository/verify]]; nested: RepositoryVerificationException[[elastic-backup-1] a file written by master to the store [/usr/share/elasticsearch/elastic-backup] cannot be accessed on the node [{data-1}{some-id}{some-id}{VM1-ip}{VM1-ip}{zone=AZ1}]. This might indicate that the store [/usr/share/elasticsearch/elastic-backup] is not shared between this node and the master node or that permissions on the store don't allow reading files written by the master node];'], [some-id, 'RemoteTransportException[[master-1][VM1-ip][internal:admin/repository/verify]]; nested: RepositoryVerificationException[[elastic-backup-1] a file written by master to the store [/usr/share/elasticsearch/elastic-backup] cannot be accessed on the node [{master-1}{some-id}{some-id}{VM1-ip}{VM1-ip}{zone=AZ1}]. 
This might indicate that the store [/usr/share/elasticsearch/elastic-backup] is not shared between this node and the master node or that permissions on the store don't allow reading files written by the master node];'], [some-id, 'RemoteTransportException[[data-3][VM3-ip][internal:admin/repository/verify]]; nested: RepositoryVerificationException[[elastic-backup-1] a file written by master to the store [/usr/share/elasticsearch/elastic-backup] cannot be accessed on the node [{data-3}{some-id}{some-id}{VM3-ip}{VM3-ip}{zone=AZ1}]. This might indicate that the store [/usr/share/elasticsearch/elastic-backup] is not shared between this node and the master node or that permissions on the store don't allow reading files written by the master node];']]"
      }
etc ..

Am I doing anything wrong? How can this be fixed?

It looks like a config issue to do with your NFS setup but it's not clear exactly what needs fixing. Perhaps it helps to explain how verification works and what the error means. The master is creating a small file in the repository and then asking the other nodes whether they can read it, and the nodes are all indicating that they cannot. Common explanations include bad config (e.g. the nodes are not all looking at exactly the same path on the NFS server) or bad permissions (e.g. the master's file is hidden from or unreadable by the other nodes). There's no magic here, it's a pretty straightforward process, but this does indicate that there's a genuine problem in your setup that will stop snapshots from working properly.
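If it helps to reproduce the check by hand, here is a minimal sketch of the same idea in Python. It is not Elasticsearch's actual implementation, and the function names `write_probe` and `can_read_probe` are made up for illustration. The assumption is that you run the first function inside the master's container and the second inside each of the other containers, in both cases as the same user that the Elasticsearch process runs as.

```python
import socket
import uuid
from pathlib import Path


def write_probe(repo_dir: str) -> str:
    """Play the master's role: write a small, uniquely named file into the repository."""
    name = f"probe-{uuid.uuid4().hex}.txt"
    (Path(repo_dir) / name).write_text(f"written by {socket.gethostname()}\n")
    return name


def can_read_probe(repo_dir: str, name: str) -> bool:
    """Play another node's role: report whether this process can read the master's file."""
    try:
        (Path(repo_dir) / name).read_text()
        return True
    except OSError:
        # Covers a missing file (different path or export) and permission errors alike.
        return False
```

For example, call `write_probe("/usr/share/elasticsearch/elastic-backup")` on the master, copy the returned file name, and call `can_read_probe` with the same path and name on each other node. A `False` result points at the path or permission problems described above.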

A common NFS issue is that it uses the numeric UID/GID to determine permissions, not the names of the respective users. You must make sure that the numeric IDs are aligned on all the nodes or else set up the necessary mappings to align them all at the NFS level.
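A quick way to compare the numeric IDs is to print them from inside every container and on every VM. The following is only a sketch: `ownership_report` is a hypothetical helper, and it assumes a Python interpreter is available wherever you run it. In the listings from the question the directory shows up as `nobody:nogroup` on the VMs and as `elasticsearch:elasticsearch` inside the containers, so the numbers behind those names are what you want to compare.

```python
import os
import stat


def ownership_report(path: str) -> dict:
    """Collect the numeric owner of `path` and the numeric IDs of the current process."""
    st = os.stat(path)
    return {
        "path": path,
        "owner_uid": st.st_uid,                # numeric owner, as NFS sees it
        "owner_gid": st.st_gid,
        "mode": stat.filemode(st.st_mode),     # e.g. "drwxr-xr-x"
        "process_uid": os.getuid(),            # who is asking
        "process_gid": os.getgid(),
        "readable": os.access(path, os.R_OK),  # checked against the real UID/GID
        "writable": os.access(path, os.W_OK),
    }
```

Running `print(ownership_report("/usr/share/elasticsearch/elastic-backup"))` as the Elasticsearch user in each container should give the same `owner_uid`, `owner_gid` and `process_uid` everywhere, with `readable` and `writable` both `True`. Any node where the numbers differ is a candidate for the mismatch described above.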
