Snapshot and archive old indices: compression comparison

Hi all,
I am using a script to take a snapshot of an index and then tar to compress it and save space.
What I wanted to know is which compression is best for this kind of data: gzip, bzip2, or something else? I need something that compresses the index with the best ratio but won't corrupt the index data. Compression speed is secondary, but faster is of course better.
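For reference, this is roughly what my script does, sketched in Python with the compression modes I am weighing (paths are placeholders):

```python
import os
import tarfile

SNAPSHOT_DIR = "/backups/es_snapshots"  # placeholder: path to the snapshot repository

# Try each of tarfile's built-in compression modes and compare archive sizes.
for mode, suffix in [("w:gz", ".tar.gz"), ("w:bz2", ".tar.bz2"), ("w:xz", ".tar.xz")]:
    archive = "/backups/archive" + suffix
    with tarfile.open(archive, mode) as tar:
        tar.add(SNAPSHOT_DIR, arcname=os.path.basename(SNAPSHOT_DIR))
    print(mode, os.path.getsize(archive), "bytes")
```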
Thank you for your time.

@lusynda
If you take multiple snapshots into the same repository, Elasticsearch stores only the changed segment files for each subsequent snapshot. If you tar and compress the repository for each snapshot, the unchanged segment files will be present in every archive. Unless you delete the older snapshots after compressing, each compressed archive will keep growing, since it contains the older snapshots too, and that will increase your restore time.
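For example, two snapshots taken into one repository share their unchanged segment files. A minimal sketch using Python's standard library (host, repository name, and paths are placeholders):

```python
import json
import urllib.request

ES = "http://localhost:9200"  # placeholder Elasticsearch host

def put(path, body=None):
    data = json.dumps(body).encode() if body is not None else None
    req = urllib.request.Request(ES + path, data=data, method="PUT",
                                 headers={"Content-Type": "application/json"})
    return json.load(urllib.request.urlopen(req))

# Register an fs repository once (the path must be listed in path.repo).
put("/_snapshot/my_repo", {"type": "fs", "settings": {"location": "/backups/es_snapshots"}})

# Both snapshots go into the same repository; the second stores only
# the segment files that changed since the first.
put("/_snapshot/my_repo/snapshot_1?wait_for_completion=true")
put("/_snapshot/my_repo/snapshot_2?wait_for_completion=true")
```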

You may want to explore cheaper storage options instead. For example, if you are on AWS, S3 Infrequent Access or S3 One Zone-Infrequent Access.
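If you go that route, the repository-s3 plugin lets you set the storage class on the repository itself, so snapshots land directly in the cheaper tier. A sketch, assuming the plugin is installed; the bucket name is a placeholder:

```python
import json
import urllib.request

ES = "http://localhost:9200"  # placeholder Elasticsearch host

# Repository settings for the repository-s3 plugin.
body = {
    "type": "s3",
    "settings": {
        "bucket": "my-snapshot-bucket",  # placeholder bucket
        "storage_class": "standard_ia"   # or "onezone_ia" for One Zone-IA
    }
}
req = urllib.request.Request(ES + "/_snapshot/s3_repo",
                             data=json.dumps(body).encode(), method="PUT",
                             headers={"Content-Type": "application/json"})
print(urllib.request.urlopen(req).read().decode())
```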

If you don't care about restore time, you may want to look at source-only snapshots. I think this requires a paid X-Pack license. https://www.elastic.co/guide/en/elasticsearch/reference/current/snapshots-register-repository.html#snapshots-source-only-repository
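Registering one looks roughly like this, following the linked docs; a source-only repository wraps a normal repository type via delegate_type (names and paths are placeholders):

```python
import json
import urllib.request

ES = "http://localhost:9200"  # placeholder Elasticsearch host

# A source-only repository stores minimal data and delegates to an fs repository.
body = {
    "type": "source",
    "settings": {
        "delegate_type": "fs",
        "location": "/backups/source_only"  # placeholder path
    }
}
req = urllib.request.Request(ES + "/_snapshot/src_only_repo",
                             data=json.dumps(body).encode(), method="PUT",
                             headers={"Content-Type": "application/json"})
print(urllib.request.urlopen(req).read().decode())
```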

