How to manage data retention time in Elasticsearch

Hello to all!

I'm using Logstash to send syslog data to Elasticsearch and I'm browsing it using Kibana. I have 2 questions regarding log retention and backup:

  1. I would like to keep this data for at least 1 year before it is deleted. Where can I set the length of data retention? Is it set on each index pattern? That way, depending on each source, we can define a different retention time.

And what is the default data retention for each index pattern?

  2. Regarding the backup of the server that hosts Elasticsearch: are there any particular requirements, or is a backup of the VM with a tool like Veeam enough? (I am assuming the data is saved as JSON files.)

Thanks in advance for your help.

Take a look at Index Lifecycle Management (ILM), which was designed to handle exactly this. Also look at this blog post about rollover and time-based indices.

By default, Elasticsearch does not delete any data, so indexed data is retained indefinitely.
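
Since nothing is deleted by default, retention has to come from a lifecycle policy with a delete phase. As a rough sketch only, the Python below builds the JSON bodies for an ILM policy and for an index template that attaches it. The names `syslog-1y`, `syslog-*` and `syslog`, and the rollover thresholds, are made up for illustration. It assumes a 7.x or later cluster with composable index templates, and it only prints the request bodies rather than talking to a cluster.

```python
import json


def build_ilm_policy(retention_days, rollover_max_age="30d", rollover_max_size="50gb"):
    """Body for PUT _ilm/policy/<name>: roll over in hot, delete after N days."""
    return {
        "policy": {
            "phases": {
                "hot": {
                    "actions": {
                        # Start a new backing index when either limit is hit.
                        "rollover": {
                            "max_age": rollover_max_age,
                            "max_size": rollover_max_size,
                        }
                    }
                },
                "delete": {
                    # With rollover in use, this age counts from rollover time.
                    "min_age": f"{retention_days}d",
                    "actions": {"delete": {}},
                },
            }
        }
    }


def build_index_template(index_pattern, policy_name, rollover_alias):
    """Body for PUT _index_template/<name>: attach the policy to new indices."""
    return {
        "index_patterns": [index_pattern],
        "template": {
            "settings": {
                "index.lifecycle.name": policy_name,
                "index.lifecycle.rollover_alias": rollover_alias,
            }
        },
    }


if __name__ == "__main__":
    # Hypothetical names, chosen only for this example.
    print("PUT _ilm/policy/syslog-1y")
    print(json.dumps(build_ilm_policy(365), indent=2))
    print("PUT _index_template/syslog")
    print(json.dumps(build_index_template("syslog-*", "syslog-1y", "syslog"), indent=2))
```

The printed bodies can be pasted into Kibana Dev Tools. Because the policy is attached through a template matching an index name pattern, each source can get its own template and its own retention. Note that with alias-based rollover, a first index with the alias marked as write index has to exist before the policy can roll anything over.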

The only supported way to back up and restore Elasticsearch is through the snapshot/restore APIs. Have a look at Snapshot Lifecycle Management (SLM), which will help you manage this.
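
To give an idea of what an SLM policy looks like, here is a minimal sketch that builds the request body in Python. The repository name `my_backup_repo`, the policy id `nightly-snapshots`, the schedule and the retention numbers are all assumptions for illustration, and the snapshot repository itself must already be registered on the cluster.

```python
import json


def build_slm_policy(repository, schedule="0 30 1 * * ?", expire_after="30d"):
    """Body for PUT _slm/policy/<id>: scheduled snapshots with retention."""
    return {
        # Cron expression: every day at 01:30.
        "schedule": schedule,
        # Date math in the name gives each snapshot a dated name.
        "name": "<nightly-snap-{now/d}>",
        "repository": repository,
        "config": {
            "indices": ["*"],
            "include_global_state": True,
        },
        # Old snapshots are removed according to these limits.
        "retention": {
            "expire_after": expire_after,
            "min_count": 5,
            "max_count": 50,
        },
    }


if __name__ == "__main__":
    # Hypothetical policy id and repository name.
    print("PUT _slm/policy/nightly-snapshots")
    print(json.dumps(build_slm_policy("my_backup_repo"), indent=2))
```

Snapshot retention here is separate from the ILM retention of the live indices, so the two can be tuned independently.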


Thanks for your answer.

I have now created an index pattern that allows me to view my data. I also created an index lifecycle policy that deletes the data after 500 days.

I see that I can apply the index lifecycle policy to an index template, but not to an index pattern. How can I create such a template?

And a second question. When creating an index lifecycle policy, there is a "Rollover" option, which creates a new index after a certain number of days or once a certain size is reached. But which index are we talking about? For example, does it mean "when my index pattern reaches 5000 entries, we switch to a new index pattern"?
Or is it a different index we are talking about? It's a bit confusing for me.

Thanks in advance.