Elasticsearch backup only hot indices

I have an Elasticsearch cluster of 6 nodes. Five indices are created per day, and each index takes roughly 10 to 250 GB. We currently have a snapshot repository (located on an NFS share) where these indices are backed up every day, but after a year this snapshot repository has grown to 40 terabytes. These are payment logs that the law in my country requires us to store for 2 years, and we do not have that much space on the NFS. We also had an incident where a bug on the NFS share corrupted our snapshot repository and we could not restore from it. That made me think about changing the backup process to a script that would run every day (a rough sketch follows the list):

  1. Create a new folder on the NFS share named with the current date (logz-backup-Year-Month-Day).
  2. Register a snapshot repository pointing at this folder.
  3. Take a snapshot.
  4. Delete the snapshot repository from Elasticsearch.
  5. Make an archive from the folder.
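
Below is a minimal Python sketch of that daily script, using the plain snapshot REST API via `requests`. The cluster URL, the NFS mount point, and the `logz-<date>*` index naming are assumptions to adapt; it also assumes the machine running the script mounts the NFS share at the same path the Elasticsearch nodes use, and that the path is listed under `path.repo` in elasticsearch.yml. Restricting the snapshot body to today's indices is what keeps each fresh repository from receiving a full copy of everything.

```python
import datetime
import os
import subprocess

import requests

ES_URL = "http://localhost:9200"       # assumed cluster endpoint
NFS_ROOT = "/mnt/nfs/es-backups"       # assumed NFS mount, listed in path.repo
today = datetime.date.today().strftime("%Y-%m-%d")
repo = f"logz-backup-{today}"
repo_path = f"{NFS_ROOT}/{repo}"

# 1. Create the dated folder on the NFS share.
os.makedirs(repo_path, exist_ok=True)

# 2. Register a filesystem snapshot repository pointing at that folder.
requests.put(
    f"{ES_URL}/_snapshot/{repo}",
    json={"type": "fs", "settings": {"location": repo_path}},
).raise_for_status()

# 3. Snapshot only today's indices (hypothetical logz-<date>* naming), so the
#    fresh repository never receives older data. wait_for_completion blocks
#    until the snapshot finishes.
requests.put(
    f"{ES_URL}/_snapshot/{repo}/snapshot-{today}",
    params={"wait_for_completion": "true"},
    json={"indices": f"logz-{today}*", "include_global_state": False},
).raise_for_status()

# 4. Remove the repository registration; the files on NFS stay in place.
requests.delete(f"{ES_URL}/_snapshot/{repo}").raise_for_status()

# 5. Archive and compress the folder, then drop the uncompressed copy.
subprocess.run(
    ["tar", "-czf", f"{repo_path}.tar.gz", "-C", NFS_ROOT, repo], check=True
)
subprocess.run(["rm", "-rf", repo_path], check=True)
```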

But the problem is that every time I take a backup into a new snapshot repository, Elasticsearch makes a full copy of the contents. How can I make Elasticsearch back up only the new indices to a new snapshot repository? I have heard about the Hot-warm architecture profile | Elasticsearch Service Documentation | Elastic. Is it possible to keep indices hot during their first 24 hours (for backup) and, once they are older than a day, move them to warm so that they are not backed up to the new snapshot repository?

P.S. I tried compressing my Elasticsearch repository folder with gzip and got noticeable compression (from 40 TB down to 20 TB).
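
For reference, a minimal sketch of what such a hot-warm ILM policy could look like, again as Python against an assumed localhost cluster. The policy name is illustrative, and the `allocate` action assumes warm nodes are tagged with a custom `node.attr.data: warm` attribute in elasticsearch.yml. Note that ILM phases only control where shards live, not what a snapshot contains, so limiting the backup still has to happen in the snapshot request itself (as in the script above).

```python
import requests

# Hypothetical hot-warm policy: indices stay hot for their first day,
# then relocate to nodes tagged data=warm with a lower recovery priority.
requests.put("http://localhost:9200/_ilm/policy/logz-hot-warm", json={
    "policy": {
        "phases": {
            "hot": {
                "actions": {"set_priority": {"priority": 100}},
            },
            "warm": {
                "min_age": "1d",
                "actions": {
                    "set_priority": {"priority": 50},
                    "allocate": {"require": {"data": "warm"}},
                },
            },
        }
    }
}).raise_for_status()
```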

You could do this in ILM, yes - Wait for snapshot | Elasticsearch Guide [7.14] | Elastic
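
The linked wait_for_snapshot action sits in the delete phase of an ILM policy and holds off index deletion until a named SLM policy has taken a snapshot covering the index. A sketch, with a hypothetical SLM policy name "daily-logz" and an illustrative two-day in-cluster retention (the snapshots on NFS are what would satisfy the two-year requirement):

```python
import requests

# Sketch: block index deletion until the (hypothetical) "daily-logz" SLM
# policy has snapshotted the index, then delete it from the cluster after
# an illustrative two days; long-term retention lives in the snapshots.
requests.put("http://localhost:9200/_ilm/policy/logz-retention", json={
    "policy": {
        "phases": {
            "delete": {
                "min_age": "2d",
                "actions": {
                    "wait_for_snapshot": {"policy": "daily-logz"},
                    "delete": {},
                },
            }
        }
    }
}).raise_for_status()
```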

It seems like a bit of a hack if there are legal requirements around keeping the data, though.
