How can I restore logs from /usr/share/elasticsearch/data/nodes/0?

I am using Elasticsearch as a backend to save logs collected from Fluentd logging agent. Specifically, I've set up an EFK logging architecture in my Kubernetes cluster. (AWS EKS cluster to be specific)

I've mounted Elasticsearch's /usr/share/elasticsearch/data directory from the container onto an EBS volume.

The question is: using this volume, is there a way to restore the logs? I've cd-ed into nodes/0/indices and saw a bunch of folders and files in it, but I couldn't figure out what they are. I've attached a capture of it.

Welcome to our community! :smiley:

Please don't post pictures of text, logs or code. They are difficult to read, impossible to search and replicate (if it's code), and some people may not be even able to see them.

I think it might be better to step back and ask why you are doing this.

Okay I will not post pictures. Thanks for letting me know.

As mentioned above, I've set up an EFK logging architecture in my AWS EKS cluster for production usage.

As a log retention strategy, I want old log data (say, older than 60 days) to be automatically removed from the EBS volume (where Elasticsearch's data directory is mounted).

But at the same time, since our client may request log data older than our 60-day criterion, we are planning to take snapshots of the EBS volume periodically, so that log data older than 60 days can still be restored from an earlier snapshot.

Given a snapshot of the EBS volume, then, I need to be able to restore logs directly from it to meet this requirement.

If there are other ways or better practices to restore logs, I am also willing to follow them.
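For reference, the 60-day removal part of this plan is usually handled inside Elasticsearch itself with an index lifecycle management (ILM) policy rather than by pruning the EBS volume. A minimal sketch, assuming time-based indices and a placeholder policy name:

```
PUT _ilm/policy/logs-60d-retention
{
  "policy": {
    "phases": {
      "delete": {
        "min_age": "60d",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}
```

The policy would then be attached to the Fluentd log indices via an index template; the exact index pattern depends on your Fluentd output configuration.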

Thanks in advance.

Use the built-in Elasticsearch snapshot functionality; no other approach is supported, sorry.

As Mark said, that will not work. Elasticsearch performs consistency checks on the data on disk, so altering the data directory in any way will invalidate all of the data, because those consistency checks will fail.

The only way to snapshot data is through the snapshot and restore APIs, which allow you to back up data to S3 or a shared file system repository.
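Since you are already on AWS, an S3 repository is the natural fit. A sketch of registering one and taking a snapshot; the repository name, bucket name, and snapshot name below are placeholders, and the cluster needs IAM permissions on the bucket (on older Elasticsearch versions, the repository-s3 plugin must also be installed on every node):

```
PUT _snapshot/my_s3_repository
{
  "type": "s3",
  "settings": {
    "bucket": "my-log-snapshot-bucket"
  }
}

PUT _snapshot/my_s3_repository/snapshot-2023-01-01?wait_for_completion=true
```

By default a snapshot covers all indices, and subsequent snapshots into the same repository are incremental, so periodic snapshots stay cheap.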

Thanks. I will take a look at the link you provided.

The reference manual contains clear guidance on this topic:

Taking a snapshot is the only reliable and supported way to back up a cluster. You cannot back up an Elasticsearch cluster by making copies of the data directories of its nodes. There are no supported methods to restore any data from a filesystem-level backup. If you try to restore a cluster from such a backup, it may fail with reports of corruption or missing files or other data inconsistencies, or it may appear to have succeeded having silently lost some of your data.
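To close the loop on the original question: once snapshots exist in a repository, restoring older log indices is a single API call. A sketch, reusing the placeholder repository and snapshot names from above and assuming the log indices match a pattern like logstash-*:

```
POST _snapshot/my_s3_repository/snapshot-2023-01-01/_restore
{
  "indices": "logstash-2022.11.*"
}
```

Note that an index can only be restored if no open index with the same name exists in the cluster, so old indices that were deleted by the retention policy restore cleanly.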


This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.