Not able to take snapshot


(Vikas Gopal) #1

Hi Experts,

I am not able to take a snapshot of an ES index. This is what I have done:

  1. I created a folder and shared it with everyone, with read and write permission.
  2. I added this path to the elasticsearch.yml file: path.repo: ["C:/backup"]
  3. Then I registered this repository using Sense:
    PUT /_snapshot/backup
    {
      "type": "fs",
      "settings": {
        "compress": "true",
        "location": "c:/backup"
      }
    }
  4. I can see the repository information with the following command:
    GET /_snapshot
  5. Now when I run the curator command to take a snapshot, I get an error:

C:\Users\Administrator>curator snapshot --repository c:/backup indices --all-indices

2015-07-09 06:27:48,054 INFO Job starting: snapshot indices
2015-07-09 06:27:48,054 WARNING Overriding default connection timeout.New timeout: 21600
2015-07-09 06:27:48,069 INFO Matching all indices. Ignoring flags other than --exclude.
2015-07-09 06:27:48,085 ERROR Failed to verify all nodes have repository access.
2015-07-09 06:27:48,085 WARNING Job did not complete successfully.

Please suggest what I am doing wrong. Am I missing a step?

Thanks
Vikas


(Magnus Bäck) #2

When using an fs repository all cluster nodes must be able to access the path.


(Vikas Gopal) #3

I am using a single-machine setup, so I have just one ES node, acting as master. And yes, I am able to access this shared folder using \\ipaddress.


(David Pilato) #4

Check also C: vs c:
I think you should be consistent here
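
To rule out the C: vs c: question, the two strings can be compared the way Windows path rules would treat them. This is only a minimal sketch: `same_windows_path` is a hypothetical helper, and whether Elasticsearch itself treats the two spellings as the same path is not confirmed anywhere in this thread.

```python
import ntpath


def same_windows_path(a: str, b: str) -> bool:
    """Compare two Windows paths, ignoring letter case and slash direction.

    ntpath applies Windows path rules on any operating system, so this
    also runs on a non-Windows machine.
    """
    def canonical(p: str) -> str:
        return ntpath.normcase(ntpath.normpath(p))

    return canonical(a) == canonical(b)


if __name__ == "__main__":
    # path.repo value vs. the "location" used when registering the repository
    print(same_windows_path("C:/backup", "c:/backup"))
```

If this reports a match, the difference is purely cosmetic as far as Windows path rules go, though using one spelling everywhere still removes a variable.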


(Vikas Gopal) #5

Yes, Pilato, it is consistent.


(Vikas Gopal) #6

Oops, my bad. I was passing the wrong repository name in the curator command (the path instead of the registered name), so it should be:

C:\Users\Administrator>curator snapshot --repository backup indices --all-indices

It is successful this time.
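
The mix-up above (passing the filesystem location where the registered repository name is expected) can be caught before calling curator. A minimal sketch, assuming the parsed JSON body of GET /_snapshot is already at hand; `resolve_repository` is a hypothetical helper, not part of Curator or Elasticsearch, and the case-insensitive comparison is an assumption suited to Windows paths.

```python
def resolve_repository(arg: str, repos: dict) -> str:
    """Return `arg` if it is a registered repository name.

    `repos` is the parsed JSON body of GET /_snapshot, for example:
    {"backup": {"type": "fs", "settings": {"location": "c:/backup"}}}
    """
    if arg in repos:
        return arg

    def norm(path: str) -> str:
        # Assumption: compare locations loosely (case, trailing slashes).
        return path.lower().rstrip("/\\")

    for name, info in repos.items():
        location = info.get("settings", {}).get("location", "")
        if location and norm(location) == norm(arg):
            raise ValueError(
                f"{arg!r} is the location of repository {name!r}; "
                f"pass --repository {name} instead"
            )
    raise ValueError(f"no repository named {arg!r} is registered")


if __name__ == "__main__":
    repos = {"backup": {"type": "fs", "settings": {"location": "c:/backup"}}}
    print(resolve_repository("backup", repos))
```

Calling it with "c:/backup" raises a ValueError that points at the name to use.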


(Alessandro Piroddi (Ales)) #7

Hi everyone,

I pop in since I'm encountering the same problem.

I'm running Elasticsearch on an Ubuntu Server 12.04. I need to implement automatic snapshot rotation, so I tried to use curator as a sort of snapshot scheduler.

  1. I added the following line to the elasticsearch.yml file:

    path.repo: /var/backups/elasticsearch/

  2. Changed the permissions on that folder (so that every user can read and write in it):

    sudo chmod -R 777 /var/backups/elasticsearch/

  3. Ran the following command:

    curl -XPUT 'http://localhost:9200/_snapshot/backup' -d '{
      "type": "fs",
      "settings": {
        "location": "/var/backups/elasticsearch/",
        "compress": true
      }
    }'

  4. Checked whether the repository was created correctly:

    curl -XGET 'http://localhost:9200/_snapshot/backup?pretty'

    {
      "backup" : {
        "type" : "fs",
        "settings" : {
          "compress" : "true",
          "location" : "/var/backups/elasticsearch/"
        }
      }
    }

  5. Tried to create a snapshot through curator, but I get the following error:

    curator snapshot --repository backup indices --all-indices --exclude kibana-int

    2015-07-09 17:42:52,529 INFO Job starting: snapshot indices
    2015-07-09 17:42:52,529 WARNING Overriding default connection timeout. New timeout: 21600
    2015-07-09 17:42:52,549 INFO Matching all indices. Ignoring flags other than --exclude.
    2015-07-09 17:42:52,594 ERROR Failed to verify all nodes have repository access.
    2015-07-09 17:42:52,595 WARNING Job did not complete successfully.

The paths are correct, and so are the permissions. I honestly have no idea what's wrong.
I tried running the command in --dry-run mode and it "works" correctly.
Any suggestions?

Thank you very much
Alessandro

P.S. I tried to create a snapshot manually with the following command:

curl -XPUT "localhost:9200/_snapshot/backup/test?wait_for_completion=true"

and it was created correctly


(Aaron Mildenstein) #8

This message indicates that not all nodes in your cluster have write access to the repository, and that is why Curator fails to make the snapshot. Curator performs a repository availability check before attempting a backup as a safety check.

If you are 100% certain that this message is in error (and it shouldn't be with a regular fs repository--occasionally s3 and azure repos are slower to respond so they give a false-negative), upgrade to Curator 3.2.0 and use the --skip-repo-validation flag.
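
Before reaching for --skip-repo-validation, the repository can be checked directly through Elasticsearch's repository verification endpoint (POST /_snapshot/<repository>/_verify). The sketch below assumes a node at http://localhost:9200 and the repository name backup from this thread; the helper names are hypothetical, and whether Curator's own check matches this endpoint exactly is an assumption.

```python
import json
import urllib.request


def verify_url(base_url: str, repository: str) -> str:
    """Build the URL of the repository verification endpoint."""
    return f"{base_url.rstrip('/')}/_snapshot/{repository}/_verify"


def verify_repository(base_url: str, repository: str) -> dict:
    """POST to the verify endpoint and return the parsed JSON body.

    urllib raises urllib.error.HTTPError when Elasticsearch answers with an
    error status, which is what a failed verification is expected to produce.
    """
    request = urllib.request.Request(
        verify_url(base_url, repository),
        data=b"",
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


if __name__ == "__main__":
    # Assumed host and repository name; adjust to your cluster.
    result = verify_repository("http://localhost:9200", "backup")
    print(json.dumps(result, indent=2))
```

A successful call lists the nodes that verified the repository; an error response usually says which node could not write to it.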


(Alessandro Piroddi (Ales)) #9

Thank you for your prompt reply.

I'm confident it is an error since regular snapshots through the command:

curl -XPUT "localhost:9200/_snapshot/backup/test?wait_for_completion=true"

work as they should. I had already tried the --skip-repo-validation flag, but it resulted in an error too (the snapshot was created but got stuck in a "Processing" status and no data was saved).

Edit: I tried to run the command again with the --skip-repo-validation flag and it seems it's working... let's give it a second

Re-edit: Yes, it seems it worked. I'm not entirely sure why it doesn't work without that flag but as long as there aren't any side effects this is great too!

Thank you again


(Aaron Mildenstein) #10

That this command works without error does not mean that the snapshot was completely successful. There have been occasions where all seemed well, but the snapshot was corrupted because one node was not writing properly. That's why the safety check is in place. If it were me and I was using the --skip-repo-validation flag, I'd check my elasticsearch logs on all nodes and/or perform a restore to another cluster to verify backups are working. That error does not generally happen without a reason.
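
As a first, lightweight check along these lines, the snapshot metadata returned by GET /_snapshot/backup/<snapshot> can be inspected for a non-SUCCESS state or failed shards. This is a minimal sketch: `snapshot_problems` is a hypothetical helper, the sample response is illustrative data rather than output from this thread, and it only reads metadata, so it does not replace checking the logs or doing a test restore.

```python
def snapshot_problems(response: dict) -> list:
    """List reasons to distrust the snapshots in a GET /_snapshot/<repo>/<name> response.

    An empty list means the metadata reports no problem; it does not
    prove that the snapshot can actually be restored.
    """
    problems = []
    snapshots = response.get("snapshots", [])
    if not snapshots:
        problems.append("no snapshot in response")
    for snap in snapshots:
        name = snap.get("snapshot", "?")
        state = snap.get("state")
        if state != "SUCCESS":
            problems.append(f"{name}: state is {state}")
        failed = snap.get("shards", {}).get("failed", 0)
        if failed:
            problems.append(f"{name}: {failed} shard(s) failed")
    return problems


if __name__ == "__main__":
    # Illustrative response body, not taken from a real cluster.
    sample = {
        "snapshots": [
            {
                "snapshot": "test",
                "state": "SUCCESS",
                "shards": {"total": 5, "failed": 0, "successful": 5},
            }
        ]
    }
    print(snapshot_problems(sample))
```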


(Vikas Gopal) #11

Though I am not from a Unix/Linux background, the issue I was facing was resolved by the following two things:

  1. My folder was not shared properly. I was trying this on my office laptop and was not aware that, even after sharing, the folder was not fully shared because of some restrictions, although it appeared as shared. Moreover, the user that runs ES was not in the share list. I was getting the exact same error message and, as you mentioned, everything up to point 4 worked fine for me as well. To get around this I did the same thing on an AWS Windows instance, where I have full administrator privileges, and it works.
  2. As I mentioned, I was using the wrong repository name, which is correct in your case.
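
The first point (the account that runs ES could not actually write to the folder) can be tested by trying to create a file in the repository directory. A minimal sketch: `can_write` is a hypothetical helper, and the result is only meaningful when the script runs as the same user as the Elasticsearch process, on each node.

```python
import os
import tempfile


def can_write(directory: str) -> bool:
    """Return True if the current user can create a file in `directory`."""
    try:
        # The temporary file is deleted automatically when the block exits.
        with tempfile.NamedTemporaryFile(dir=directory, prefix="repo-check-"):
            pass
        return True
    except OSError:
        # Covers a missing directory as well as a permission error.
        return False


if __name__ == "__main__":
    # Repository locations mentioned in this thread; adjust as needed.
    for path in ("C:/backup", "/var/backups/elasticsearch/"):
        print(path, can_write(path))
```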
