Snapshot data based on query

I have a large index containing the last 3 years of data. I now want to implement snapshots to recover from any data mishap. I want to include only the last 3 months of data in the first snapshot and then move to incremental snapshots. Can I do this?

Snapshots work at the index level copying complete segments, so if all your data is in a single index you will need to snapshot all of it.
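
To make the index-level granularity concrete, here is a minimal sketch of what a create-snapshot request looks like. The repository name `my_backup_repo`, the snapshot name `snapshot-1`, and the index name `my-big-index` are placeholders I made up, and the helper only builds the request rather than sending it. Note that the body selects whole indices; there is no place to put a query.

```python
import json


def build_snapshot_request(repository, snapshot, indices):
    """Build (method, path, body) for a create-snapshot REST call.

    All names passed in are hypothetical; a registered snapshot
    repository is assumed to exist already.
    """
    path = f"/_snapshot/{repository}/{snapshot}"
    body = {
        # Snapshots select whole indices; documents cannot be filtered.
        "indices": ",".join(indices),
        # Skip cluster-wide state so only index data is captured.
        "include_global_state": False,
    }
    return "PUT", path, body


method, path, body = build_snapshot_request(
    "my_backup_repo", "snapshot-1", ["my-big-index"]
)
print(method, path)
print(json.dumps(body, indent=2))
```

Later snapshots taken into the same repository are incremental: only segment files that are not already stored there get copied.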

thanks

Why doesn't Elasticsearch provide snapshots based on a timestamp query? In MySQL, for example, you can take a snapshot of data created before some date. I am not able to understand this. Please share your thoughts.

Elasticsearch often handles considerably larger data volumes than a typical relational database like MySQL does. I have seen clusters with over a petabyte of data. To take backups at that scale, the backup process must be efficient in terms of computation and disk I/O. Retrieving documents based on a query is much, much more expensive (it results in lots of random-access disk reads due to how Lucene works) than copying the full segments/index files that Lucene creates, which is basically what the snapshot/restore mechanism does.

In your case you could reindex the last 3 months based on a query into a new index/set of time-based indices and then remove the current index. If you have a large data volume this is likely to take time and result in a lot of disk I/O.
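
As a rough sketch of that reindex step, the request below copies only documents from the last 3 months into a new index. I am assuming the documents carry a date field called `@timestamp`, and the index names `my-big-index` and `my-recent-index` are placeholders; adjust all three to your mapping. The helper only builds the request body for the `_reindex` API and does not contact a cluster.

```python
import json


def build_reindex_request(source_index, dest_index,
                          timestamp_field="@timestamp", since="now-3M"):
    """Build (method, path, body) for a query-filtered _reindex call.

    The field name and index names are assumptions for illustration.
    """
    body = {
        "source": {
            "index": source_index,
            # Date math: "now-3M" means three months before now.
            "query": {"range": {timestamp_field: {"gte": since}}},
        },
        "dest": {"index": dest_index},
    }
    # Run as a background task, since a large reindex can take a while.
    return "POST", "/_reindex?wait_for_completion=false", body


method, path, body = build_reindex_request("my-big-index", "my-recent-index")
print(method, path)
print(json.dumps(body, indent=2))
```

Once the new index is populated and verified, it can be snapshotted on its own, and the old index can be removed as described above.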

thanks
