Backup logs

Hi there,

I have set up the ECK operator on my k8s cluster running on local machines with 3 nodes, and integrated it with Fluent Bit for log shipping. May I ask how I can back up the logs? I understand that duplicating the data directory is not recommended and that snapshots are the way to go. I have read this link: https://www.elastic.co/guide/en/cloud-on-k8s/0.9/k8s-snapshot.html but I did not see any example with local storage. Since I'm running on local machines, how can I set it up?

Any help is appreciated, thanks!

You can configure a snapshot repository to back up your data to, for example, a shared filesystem mounted on your k8s hosts. It is probably less useful to back up to the same filesystem where your Elasticsearch data directories reside: it does not add much in terms of failure resiliency, and it only creates more I/O on the filesystem Elasticsearch is already using. But maybe I am misunderstanding your question here a bit.

The process is not explicitly documented for ECK because the approach is the same as for Elasticsearch clusters running outside of Kubernetes. So see https://www.elastic.co/guide/en/elasticsearch/reference/7.3/modules-snapshots.html in general, and for the available storage plugins see https://www.elastic.co/guide/en/elasticsearch/plugins/current/repository.html
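For instance, once a shared filesystem is mounted at the same path on every node and listed in `path.repo`, registering an `fs` repository is a single API call. A minimal sketch, assuming the cluster is reachable on localhost:9200; the repository name and mount path are placeholders:

```shell
# Register a shared-filesystem snapshot repository.
# Assumes /data01/snapshot is already listed in path.repo on every node.
curl -X PUT "localhost:9200/_snapshot/my_backup" \
  -H 'Content-Type: application/json' -d'
{
  "type": "fs",
  "settings": {
    "location": "/data01/snapshot",
    "compress": true
  }
}'
```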

I have pretty much the same setup as you:
3 nodes, each with its own storage; nothing is shared.

Hence I set up a directory on node1 that is NFS-mounted on the two other nodes:
/data01/snapshot (this gets mounted on the two other nodes)
node1:/data01/snapshot /data01/snapshot

That means all nodes see /data01/snapshot as the same directory.

Now set up the repository and do the backup.
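To illustrate, once the repository is registered (I'm calling it my_backup here as a placeholder), taking and listing snapshots are plain REST calls against the snapshot API:

```shell
# Take a snapshot of all open indices and wait for it to finish
curl -X PUT "localhost:9200/_snapshot/my_backup/snapshot_1?wait_for_completion=true"

# List all snapshots in the repository
curl -X GET "localhost:9200/_snapshot/my_backup/_all?pretty"
```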

I have a script that does automatic backups via unix cron. I posted that script a few months back; search for it if you would like to do it via cron.

Thanks for pointing this out; I did think of putting the data dir and the repository in the same shared filesystem. I should probably put the repository in a different filesystem. Am I right to say that as long as my PV is retained, ECK will pick up the data when it claims it, so it's sort of like a backup since my PV is retained in a shared filesystem?
And may I also ask if there's any method to reclaim my PV if Elasticsearch goes down and I have to reapply it again? Because ECK is not deployed as a StatefulSet, it will not be able to reclaim the released PV, and manual intervention is required.

If I am using my own CephFS, I can just skip the storage plugin, adapt the steps in modules-snapshots, and then create a k8s CronJob following this example: https://www.elastic.co/guide/en/cloud-on-k8s/0.9/k8s-snapshot.html#k8s-setup-cronjob , is that right? However, I am not quite sure how I can do a PUT to create a repository in my ECK cluster. Are there any objects to set in my elasticsearch.yaml?

Apologies for the multiple questions. I'm still trying to understand ECK better. Hope you can help me understand :slight_smile:

Found it: Automatic Elasticsearch snapshot backup. Thanks!

https://www.elastic.co/guide/en/cloud-on-k8s/0.9/k8s-accessing-elastic-services.html describes how to access Elasticsearch when running on Kubernetes with ECK.
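Concretely, from outside the cluster that usually means fetching the built-in elastic user's password and port-forwarding the HTTP service before calling the snapshot API. A sketch assuming a cluster named quickstart (adjust to your manifest's metadata.name); the repository name and location are placeholders:

```shell
# Password for the built-in elastic user (secret name follows <cluster>-es-elastic-user)
PASSWORD=$(kubectl get secret quickstart-es-elastic-user \
  -o go-template='{{.data.elastic | base64decode}}')

# Forward the cluster's HTTP service to localhost:9200
kubectl port-forward service/quickstart-es-http 9200 &

# PUT the repository through the forwarded port (-k because of the self-signed cert)
curl -k -u "elastic:$PASSWORD" -X PUT "https://localhost:9200/_snapshot/my_backup" \
  -H 'Content-Type: application/json' \
  -d '{"type": "fs", "settings": {"location": "/var/backup"}}'
```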

May I ask how I can set path.repo? I couldn't find the endpoint/object to set it in the YAML file.

I tried to PUT a repository and got the following error:

```json
{
  "error": {
    "root_cause": [
      {
        "type": "repository_exception",
        "reason": "[my_backup] location [/var/backup] doesn't match any of the locations specified by path.repo because this setting is empty"
      }
    ],
    "type": "repository_exception",
    "reason": "[my_backup] failed to create repository",
    "caused_by": {
      "type": "repository_exception",
      "reason": "[my_backup] location [/var/backup] doesn't match any of the locations specified by path.repo because this setting is empty"
    }
  },
  "status": 500
}
```
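For what it's worth, path.repo belongs in the Elasticsearch manifest's node configuration, and the backing filesystem has to be mounted into the pods as well. A minimal sketch, assuming the nodeSets schema of recent ECK versions (0.9 used spec.nodes instead); the volume and PVC names are hypothetical:

```yaml
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: quickstart
spec:
  version: 7.3.0
  nodeSets:
  - name: default
    count: 3
    config:
      # Whitelist the snapshot location so the fs repository can be registered
      path.repo: ["/var/backup"]
    podTemplate:
      spec:
        containers:
        - name: elasticsearch
          volumeMounts:
          - name: snapshot-repo
            mountPath: /var/backup
        volumes:
        - name: snapshot-repo
          persistentVolumeClaim:
            claimName: snapshot-repo-pvc  # hypothetical PVC backed by your shared storage
```

After applying this (ECK restarts the nodes), the PUT against /_snapshot/my_backup with location /var/backup should no longer fail with an empty path.repo.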