I'm setting up my first Elasticsearch cluster, which is used for Logstash indexes, and have reached the point of figuring out backup and recovery.
Using snapshots is not an option for me, since none of the facilities snapshots seem to require (a shared filesystem or any kind of cloud storage) are available. I found some solutions online (one example: http://tech.superhappykittymeow.com/?p=296) which basically:
- back up one day's logstash index directory with tar
- read the mappings via the ES API and store them in a restore script.
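As I understand it, the backup side amounts to something like the sketch below. The data path, backup directory, and daily `logstash-YYYY.MM.DD` index name are assumptions based on a default install; adjust for your cluster:

```shell
#!/bin/sh
# Backup sketch: save the mapping, then tar the day's index directory.
# ES_DATA and BACKUP_DIR are assumed paths, not taken from the linked post.
ES=${ES:-localhost:9200}
ES_DATA=${ES_DATA:-/var/lib/elasticsearch/nodes/0/indices}
BACKUP_DIR=${BACKUP_DIR:-/tmp/es-backup}
INDEX="logstash-$(date +%Y.%m.%d)"   # today's daily Logstash index

mkdir -p "$BACKUP_DIR"
# 1. Save the mappings via the ES API so a restore script can recreate the index.
if curl -sf "http://$ES/$INDEX/_mapping" -o "$BACKUP_DIR/$INDEX-mapping.json"; then
  # 2. Tar the index directory itself (the raw Lucene files).
  tar -czf "$BACKUP_DIR/$INDEX.tar.gz" -C "$ES_DATA" "$INDEX"
  echo "backed up $INDEX to $BACKUP_DIR"
else
  echo "could not reach $ES or index $INDEX not found; nothing backed up" >&2
fi
```

Note that tar'ing a live index directory while Elasticsearch is writing to it is itself racy; the linked post targets the previous day's index, which Logstash has stopped writing to.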
The restore script does the following:
- creates a new index using the mappings saved during backup
- extracts the tar file created during backup
- restarts Elasticsearch.
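The restore steps, as a sketch under the same assumed paths (the index name default is a placeholder):

```shell
#!/bin/sh
# Restore sketch for the create-index-then-untar approach.
# ES_DATA, BACKUP_DIR, and the service name are assumptions about a default install.
ES=${ES:-localhost:9200}
ES_DATA=${ES_DATA:-/var/lib/elasticsearch/nodes/0/indices}
BACKUP_DIR=${BACKUP_DIR:-/tmp/es-backup}
INDEX=${1:-logstash-2014.01.01}   # placeholder; pass the index to restore

if [ -f "$BACKUP_DIR/$INDEX-mapping.json" ]; then
  # 1. Recreate the index with the saved mappings. Caveat: the create-index
  #    API expects the body wrapped as {"mappings": ...}, while GET _mapping
  #    returns a differently nested document, so the saved file may need reshaping.
  curl -sf -XPUT "http://$ES/$INDEX" --data-binary @"$BACKUP_DIR/$INDEX-mapping.json"
  # 2. Unpack the backed-up index files over the freshly created index directory.
  tar -xzf "$BACKUP_DIR/$INDEX.tar.gz" -C "$ES_DATA"
  # 3. Restart so Elasticsearch rereads the data directory.
  service elasticsearch restart
else
  echo "no backup found for $INDEX in $BACKUP_DIR" >&2
fi
```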
Strangely, this approach doesn't work for me: after restarting, Elasticsearch cannot read the restored index because some files in the index directory are not found.
What I'm trying to understand at this point is not so much "why doesn't it work" as "why is it even done this way", because I have found a recovery procedure that does work. Simply:
- shut down Elasticsearch
- extract the tar file created during backup
- start Elasticsearch.
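Scripted, the working procedure is just (same assumed paths and service name as above; the index name default is a placeholder):

```shell
#!/bin/sh
# The simpler recovery that works for me: stop, untar, start.
# ES_DATA, BACKUP_DIR, and the service name are assumptions about a default install.
ES_DATA=${ES_DATA:-/var/lib/elasticsearch/nodes/0/indices}
BACKUP_DIR=${BACKUP_DIR:-/tmp/es-backup}
INDEX=${1:-logstash-2014.01.01}   # placeholder; pass the index to restore

if [ -f "$BACKUP_DIR/$INDEX.tar.gz" ]; then
  service elasticsearch stop
  # Extract while Elasticsearch is down, so it finds the index on startup.
  tar -xzf "$BACKUP_DIR/$INDEX.tar.gz" -C "$ES_DATA"
  service elasticsearch start
else
  echo "no tarball for $INDEX in $BACKUP_DIR" >&2
fi
```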
All the documents are there, and Kibana shows the data without problems. So I don't understand why the separate steps for dealing with mappings are necessary. What trouble will I get myself into if I just keep tar'ing up the daily index directories?