Hello everyone,
I would like to know whether it is possible to set up a snapshot policy that only includes indices from, for example, the last 7 days. For example, I store the logs from my Active Directory in indices named with this format:
index => "winlogbeat-%{+YYYY.MM.dd}"
And I wanted to use this to recover only the last 7 days. I tested a few things with settings like this:
Snapshots are automatically deduplicated to save storage space and reduce network transfer costs. To back up an index, a snapshot makes a copy of the index’s segments and stores them in the snapshot repository. Since segments are immutable, the snapshot only needs to copy any new segments created since the repository’s last snapshot.
Each snapshot is also logically independent. When you delete a snapshot, Elasticsearch only deletes the segments used exclusively by that snapshot. Elasticsearch doesn’t delete segments used by other snapshots in the repository.
If your concern is the restore process, you can select indices manually when restoring a snapshot. You don't need to restore all of them.
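To make the manual selection concrete, here is a minimal Python sketch that builds the list of daily index names for the last 7 days and puts it in the `indices` field of a restore request body. It only builds the data and does not call Elasticsearch. The function names are hypothetical, and it assumes that the `winlogbeat-YYYY.MM.dd` dates are in UTC and that "last 7 days" includes today.

```python
from datetime import date, timedelta


def last_n_days_indices(today, days=7, prefix="winlogbeat-"):
    """Return daily index names, newest first, for `days` days ending at `today`."""
    names = []
    for offset in range(days):
        day = today - timedelta(days=offset)
        # Mirrors the Logstash pattern winlogbeat-%{+YYYY.MM.dd}
        names.append(f"{prefix}{day:%Y.%m.%d}")
    return names


def restore_body(today, days=7):
    """Build a request body restoring only the selected indices (illustrative)."""
    return {
        "indices": ",".join(last_n_days_indices(today, days)),
        # Assumption: only index data is wanted, not the cluster state
        "include_global_state": False,
    }
```

A body like this could be sent to `POST /_snapshot/<repository>/<snapshot>/_restore`, where the repository and snapshot names are whatever you use in your cluster.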
My concern is storage usage, but the information above does not answer my question, unless the answer is simply no, since I have not seen what I am looking for anywhere.
As noted above, ALL Elasticsearch snapshots are similar to incremental ones. Basically the cluster (actually each node) looks at which segments it has to snapshot versus which segments are already in the repository, and writes the missing ones, plus a bunch of references, states, and other metadata. That’s it.
So if you have daily or weekly snapshot jobs scheduled, you should not worry about storage, because the storage used will not multiply with each snapshot.
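The behaviour described above can be illustrated with a toy model. This is not Elasticsearch code, only a sketch of the idea: the `ToyRepository` class and the segment ids are made up, and the real repository also stores metadata that is ignored here.

```python
class ToyRepository:
    """Toy model of a snapshot repository that deduplicates segments."""

    def __init__(self):
        # snapshot name -> set of segment ids that snapshot references
        self.snapshots = {}

    def stored_segments(self):
        """All segments currently referenced by at least one snapshot."""
        return set().union(*self.snapshots.values())

    def snapshot(self, name, segments):
        """Record a snapshot and return only the segments that had to be copied."""
        copied = set(segments) - self.stored_segments()
        self.snapshots[name] = frozenset(segments)
        return copied

    def delete(self, name):
        """Delete a snapshot and return the segments no other snapshot uses."""
        removed = self.snapshots.pop(name)
        return set(removed) - self.stored_segments()
```

In this model, a second snapshot of an index that gained one new segment copies only that one segment, and deleting the first snapshot frees nothing as long as the second snapshot still references the same segments.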
Aaah! Presented this way, it is indeed more logical. I was thinking about it wrongly because my first goal was to retrieve only my logs from last week and leave the others, since they were test logs and were going to be deleted.
But setting this exceptional case aside, you are right: given how the snapshot system works, I do not need to try to recover only certain logs.