After reading various docs and discussions about Elastic Stack backup, I understand the following and would like someone to validate it, please:
Elasticsearch indices are backed up using the snapshot feature, which keeps the snapshots in a shared file system that can then be backed up using any tool.
There is no separate backup requirement for Logstash, Kibana and Beats, as their data is stored in indices that are already being backed up.
Questions:
In case of a cluster crash, let's say I reinstall the Elastic Stack: should we keep a copy of the config file directories as well, e.g. /etc/elasticsearch/* and similarly the other components' config?
Is there any standard schedule recommendation for backups, e.g. once a day?
What is the standard way to schedule the backup? Do you recommend a tool, or can we write our own script and run it on a schedule?
Much of their data is indeed stored in Elasticsearch, but I think there are some config files that also need backing up:
Yes.
That's up to you, based on your desired RPO. Since snapshots are incremental, it's cheap to do them fairly frequently. Taking a snapshot every 30 minutes isn't unusual.
It's up to you, based on your operating environment. Curator might be useful.
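If you do go the "write your own script" route, here is a rough sketch of what the snapshot call could look like in Python, using only the standard library. It is an illustration under assumptions, not a recommended implementation: the cluster address, the repository name `my_fs_backup` and the mount point `/mnt/es_backups` are all hypothetical, and the shared file system location must also be listed under `path.repo` on every node before the repository can be registered.

```python
import json
import urllib.request
from datetime import datetime, timezone

ES_URL = "http://localhost:9200"  # assumed cluster address
REPO = "my_fs_backup"             # hypothetical repository name


def register_repo_request(location):
    """Build the request that registers a shared-file-system ("fs") repository."""
    body = json.dumps({"type": "fs", "settings": {"location": location}}).encode()
    return urllib.request.Request(
        f"{ES_URL}/_snapshot/{REPO}",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def snapshot_request(now):
    """Build the request that creates a snapshot named after the given time."""
    # Snapshot names must be lowercase, so use a lowercase prefix and digits.
    name = now.strftime("snap-%Y%m%d-%H%M%S")
    return urllib.request.Request(
        f"{ES_URL}/_snapshot/{REPO}/{name}?wait_for_completion=true",
        method="PUT",
    )


def take_snapshot():
    """Send the snapshot request and return the parsed JSON response."""
    req = snapshot_request(datetime.now(timezone.utc))
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Something like `take_snapshot()` could then be run from cron or a systemd timer at whatever interval matches your RPO. Authentication and TLS are left out here and would be needed on a secured cluster.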
Remember that backups always succeed; it's the restores that fail. The only way to be sure your backup process is working reliably is to regularly restore your system from backup.
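A restore drill can be scripted in the same way. The sketch below builds a call to the snapshot restore API that restores the selected indices under a `restored_` prefix, so they do not collide with live indices of the same name. The cluster address, repository name, snapshot name and index pattern are hypothetical placeholders, and a real drill would ideally target a separate scratch cluster.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumed cluster address
REPO = "my_fs_backup"             # hypothetical repository name


def restore_request(snapshot, indices):
    """Build a restore request that renames indices to avoid name clashes."""
    body = {
        "indices": indices,                   # e.g. "logstash-*" (assumed pattern)
        "include_global_state": False,        # leave cluster-wide state alone
        "rename_pattern": "(.+)",             # capture the original index name
        "rename_replacement": "restored_$1",  # restore it under a new name
    }
    return urllib.request.Request(
        f"{ES_URL}/_snapshot/{REPO}/{snapshot}/_restore",
        data=json.dumps(body).encode(),
        method="POST",
        headers={"Content-Type": "application/json"},
    )


def run_restore(snapshot, indices):
    """Send the restore request and return the parsed JSON response."""
    with urllib.request.urlopen(restore_request(snapshot, indices)) as resp:
        return json.load(resp)
```

After the restore finishes, comparing document counts between the original and the `restored_` indices is one simple check that the backup is actually usable.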
Thanks @DavidTurner for your view. Could you please help me with the second question, about which config files need backing up?
One set I understand may be /etc/elasticsearch and the Kibana, Beats and Logstash directories, where we have configuration files. But do you think there are any other configuration files at the OS level?
We can't really say which OS configuration you will need to copy. It depends very much on how you will provision new infrastructure during your restore process. The only way you can be sure that your backup process is capturing everything it needs is to regularly restore everything from backup.
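For the application-level directories mentioned earlier in the thread, a small archiving script is one option. The sketch below is only an illustration: apart from /etc/elasticsearch, the directory list is an assumption based on the usual package-install locations, so adjust it to wherever your components actually keep their config. Directories that do not exist are skipped.

```python
import tarfile
from pathlib import Path

# Assumed package-install locations; only /etc/elasticsearch comes from this thread.
CONFIG_DIRS = ["/etc/elasticsearch", "/etc/kibana", "/etc/logstash", "/etc/filebeat"]


def archive_configs(dirs, dest):
    """Write the existing directories in `dirs` to a gzip-compressed tar at `dest`.

    Returns the list of directories that were actually archived.
    """
    archived = []
    with tarfile.open(dest, "w:gz") as tar:
        for d in dirs:
            path = Path(d)
            if not path.is_dir():
                continue  # component not installed on this host
            # Store paths relative to "/" so the archive unpacks cleanly anywhere.
            tar.add(str(path), arcname=str(path).lstrip("/"))
            archived.append(str(path))
    return archived
```

Running it would look like `archive_configs(CONFIG_DIRS, "/mnt/es_backups/configs.tar.gz")`, where the destination path is again hypothetical. The script needs read permission on the directories, and the resulting archive can contain secrets such as the Elasticsearch keystore, so it should be stored accordingly.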
@DavidTurner I understand this part. In the general case, let's say the OS has not crashed; then just restoring the indices will help.
Let's say the complete OS has crashed and I need to deploy a fresh OS. In that case the config files would be handy; otherwise a full reconfiguration has to be done before restoring the indices backup.
Do we have any doc which covers such a scenario?
@DavidTurner one more question on this part: since Elasticsearch snapshot backups are incremental, can I restore from any snapshot, or will it have dependencies on older ones?
I am asking because if I remove snapshots older than 30 days, I should not lose any data.
Yes, the repository keeps the data around while there are any snapshots that still need it. The data is only deleted once there are no more snapshots that reference it.
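Since deleting an old snapshot is safe in that sense, a 30-day retention policy can be reduced to "list the snapshots, pick the ones older than the cutoff, delete each by name". The sketch below shows that selection step, assuming the snapshot list has the shape returned by `GET /_snapshot/<repository>/_all` (entries with `snapshot` and `start_time_in_millis` fields). The cluster address and repository name are hypothetical.

```python
import urllib.request
from datetime import datetime, timedelta, timezone

ES_URL = "http://localhost:9200"  # assumed cluster address
REPO = "my_fs_backup"             # hypothetical repository name


def expired_snapshots(snapshots, now, max_age_days=30):
    """Return the names of snapshots that started more than max_age_days ago."""
    cutoff_ms = (now - timedelta(days=max_age_days)).timestamp() * 1000
    return [s["snapshot"] for s in snapshots if s["start_time_in_millis"] < cutoff_ms]


def delete_request(name):
    """Build the request that deletes a single snapshot by name."""
    return urllib.request.Request(f"{ES_URL}/_snapshot/{REPO}/{name}", method="DELETE")


def prune(snapshots):
    """Delete every expired snapshot; Elasticsearch keeps data still in use."""
    for name in expired_snapshots(snapshots, datetime.now(timezone.utc)):
        urllib.request.urlopen(delete_request(name)).close()
```

Deleting through the API, rather than removing files from the repository directory by hand, is what lets the repository work out which data is still referenced by the remaining snapshots.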