Snapshots on data streams

Hello,

I recently decided to move from traditional indices to data streams.
Everything is going fine: the template and ILM are working, and my backing indices (.ds-logs......) are rotated daily.
I just have two issues with data streams, both regarding snapshots.

First, I don't know how to manage my snapshot policy efficiently.
My ILM policy rolls over my backing index every day and keeps each one for 1 month.
My snapshot policy is designed to run every day.
My issue is that every day, every index is included in the daily snapshot.
So my first snapshot (for example) contains the first daily backing index. The second snapshot contains the whole first backing index plus the next day's. The third snapshot contains all 3 backing indices, and so on.
How can I get only the previous day's index when my snapshot policy runs? Otherwise, by the end of the month, the snapshots will get bigger and bigger, and I'll have 30 snapshots containing the 1st day.
I tried to use a pattern with date math (selecting .ds-logs-{now-1d/d}, even though I know that a snapshot can exclude backing indices; .ds-logs-* works, but not with the date filter). And for a restore, I won't be able to restore a single day from my snapshot; I'll have to restore the whole month...

My second issue is about the restore operation.
As mentioned in the documentation:

  • A snapshot can include a data stream but exclude specific backing indices. When you restore such a data stream, it will contain only backing indices in the snapshot. If the stream’s original write index is not in the snapshot, the most recent backing index from the snapshot becomes the stream’s write index.

So if I need to restore a snapshot from 3 months ago, does the current day's index (which is my write index) have to be in the snapshot? I tried restoring a snapshot and, indeed, new data was written to the most recent index in the snapshot, not to the current day's backing index. If that's the case, I can't imagine how big my snapshots will be after 2 years, and they'll be impossible to restore...

What are the best practices for this kind of use case? I searched this forum and the documentation but didn't find anything that answers my needs...

Thanks in advance

Best regards

Snapshots are incremental: each snapshot only copies data that is not already stored in the repository. So if you are running one daily, you are most likely only capturing the changes since the previous snapshot, i.e. the latest day's index. If you delete a day's snapshot but it contains data that another snapshot still needs, that data will be retained in the repository.
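
To make the incremental behaviour concrete, here is a toy model in Python. It is not Elasticsearch code, just a sketch under simplifying assumptions: the class `ToyRepository` and the file names are made up for illustration, and an index is treated as a fixed set of immutable files once it is no longer written to.

```python
# Toy model of an incremental snapshot repository (NOT Elasticsearch code).
# Assumption: each snapshot is a set of references to immutable files,
# and the repository stores every file only once.

class ToyRepository:
    def __init__(self):
        self.snapshots = {}  # snapshot name -> set of file names it references

    def stored_files(self):
        """All files currently kept in the repository."""
        stored = set()
        for files in self.snapshots.values():
            stored |= files
        return stored

    def take_snapshot(self, name, files_in_cluster):
        """Record a snapshot and return the files that had to be uploaded."""
        uploaded = set(files_in_cluster) - self.stored_files()
        self.snapshots[name] = set(files_in_cluster)
        return uploaded

    def delete_snapshot(self, name):
        """Delete a snapshot and return the files that were actually removed."""
        removed_refs = self.snapshots.pop(name)
        # Files still referenced by another snapshot stay in the repository.
        return removed_refs - self.stored_files()


repo = ToyRepository()

# Hypothetical file names: one new backing index per day.
day1 = {"idx-day1/seg_a", "idx-day1/seg_b"}
day2 = day1 | {"idx-day2/seg_c"}
day3 = day2 | {"idx-day3/seg_d"}

uploaded_1 = repo.take_snapshot("snap-1", day1)
uploaded_2 = repo.take_snapshot("snap-2", day2)
uploaded_3 = repo.take_snapshot("snap-3", day3)
removed = repo.delete_snapshot("snap-1")

print(sorted(uploaded_2))        # only the day-2 file was uploaded
print(sorted(uploaded_3))        # only the day-3 file was uploaded
print(sorted(removed))           # nothing removed: snap-2 and snap-3 still need day 1
print(len(repo.stored_files()))  # the 4 files are each stored once
```

In this model every snapshot lists all of the backing indices, yet the repository only grows by one day's worth of files per run, and deleting the oldest snapshot frees nothing that a later snapshot still references.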

Consider having different repos for different timeframes, e.g. one each for weekly, monthly, quarterly and yearly snapshots.
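
As a rough sketch of what that could look like with snapshot lifecycle management (SLM), the Python below only builds and prints the request bodies for `PUT _slm/policy/<policy-id>`; it does not talk to a cluster. The policy ids, the repository names `repo_daily` and `repo_monthly`, the `logs-*` pattern and the retention periods are all assumptions for illustration, so adjust them to your setup.

```python
import json


def slm_policy(name_pattern, schedule, repository, expire_after):
    """Build the JSON body for PUT _slm/policy/<policy-id>."""
    return {
        "schedule": schedule,        # Elasticsearch cron expression
        "name": name_pattern,        # snapshot name, date math allowed
        "repository": repository,    # must already be registered
        "config": {
            "indices": ["logs-*"],   # assumed data stream name pattern
            "include_global_state": False,
        },
        "retention": {"expire_after": expire_after},
    }


# Hypothetical policies: a daily one kept for 30 days and a monthly one
# (run on the 1st of each month) kept for roughly 2 years.
policies = {
    "daily-logs": slm_policy(
        "<daily-{now/d}>", "0 30 1 * * ?", "repo_daily", "30d"
    ),
    "monthly-logs": slm_policy(
        "<monthly-{now/M}>", "0 30 2 1 * ?", "repo_monthly", "730d"
    ),
}

for policy_id, body in policies.items():
    print(f"PUT _slm/policy/{policy_id}")
    print(json.dumps(body, indent=2))
```

Keep in mind that snapshots are only incremental within a single repository, so the first snapshot taken into a separate monthly repository is a full copy of whatever the data stream contains at that point.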

Hi Warkolm,

Thanks a lot for your reply.
I've now set up a daily snapshot plus a monthly snapshot, and I will keep only the monthly one once all the dailies are finished.

Best regards
