Best practices for dynamic expiration

I'm not sure if that title makes much sense.

Right now, I have a fair amount of data coming in through logstash - about 7-10GB/day, and it all needs to stick around for 60 days. I currently write it to an index named for the current date ("index-20220718", for example), and just delete any index older than 60 days. That's easy.

But things are changing.

Soon I'm going to have data coming in that may have different, dynamic expiration dates. Some might have to stick around 15 days, some 30 days, some 365 days, some 3650 days. The retention period is in a field that's in the data.

So what's the best way to index this? I thought of using date math in logstash, adding the number of days in the retention field to the current date, and storing the document in an index named for that expiry date. For example, a document written on 20220718 with a 30 day retention period would go to "index-20220817". Then I'd just delete any index that's dated before today.
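To make the idea concrete, here is a rough sketch of the naming and cleanup logic in Python rather than in logstash config. The function names and the "index" prefix are made up for illustration, and it assumes the retention field holds a whole number of days.

```python
from datetime import date, datetime, timedelta


def expiry_index_name(written_on: date, retention_days: int, prefix: str = "index") -> str:
    """Name the index after the day the document should expire."""
    expires = written_on + timedelta(days=retention_days)
    return f"{prefix}-{expires:%Y%m%d}"


def expired_indices(index_names, today: date, prefix: str = "index"):
    """Return the indices whose expiry date is strictly before today."""
    expired = []
    for name in index_names:
        # Ignore anything that does not follow the prefix-YYYYMMDD pattern.
        if not name.startswith(prefix + "-"):
            continue
        stamp = name[len(prefix) + 1:]
        try:
            expires = datetime.strptime(stamp, "%Y%m%d").date()
        except ValueError:
            continue
        if expires < today:
            expired.append(name)
    return expired


# A document written on 18 July 2022 with a 15 day retention period:
name = expiry_index_name(date(2022, 7, 18), 15)
```

One thing this makes visible: with a 3650 day tier, there can be a separate index for every future day across ten years, most of them holding very little data.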

Is this the best way to do it? Is it going to complicate searches? I'm just setting up logstash/ES, not a programmer who has to work with it very much, and I want to keep things functional for them as well, of course :slight_smile:

If a customer changes the retention period for their account, I guess we'd have to go through and re-index every one of their documents?

I feel like I must be missing other problems with this method too.

Is there a better way to do this that I'm just not seeing? We have hundreds of thousands of users, so a per-user index based on date is going to be impractical.


Welcome to our community! :smiley:

Your best bet is to set up indices based on ILM retention periods, and then index into those accordingly.
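As a rough sketch of what that could look like, assuming one ILM policy and one write alias per retention period: the tier list comes from the periods mentioned in the question, while the "logs-<N>d" alias names and the helper functions are hypothetical. The code only builds the policy body and picks the alias; it does not talk to a cluster.

```python
from bisect import bisect_left

# Retention periods from the question, in days, sorted ascending.
RETENTION_TIERS = [15, 30, 365, 3650]


def tier_for(retention_days: int) -> int:
    """Pick the smallest tier that keeps the document at least as long as required."""
    i = bisect_left(RETENTION_TIERS, retention_days)
    if i == len(RETENTION_TIERS):
        raise ValueError(f"no tier covers {retention_days} days")
    return RETENTION_TIERS[i]


def write_alias(retention_days: int) -> str:
    """Hypothetical alias name to index into, e.g. 'logs-30d'."""
    return f"logs-{tier_for(retention_days)}d"


def ilm_policy(tier_days: int) -> dict:
    """Build an ILM policy body: roll over daily, delete after the tier's age."""
    return {
        "policy": {
            "phases": {
                "hot": {"actions": {"rollover": {"max_age": "1d"}}},
                # With rollover, min_age counts from the rollover time,
                # so documents may live up to about a day past the tier.
                "delete": {"min_age": f"{tier_days}d", "actions": {"delete": {}}},
            }
        }
    }


policies = {f"logs-{t}d": ilm_policy(t) for t in RETENTION_TIERS}
```

Each body would be sent to the ILM policy API (PUT _ilm/policy/<name>), and logstash would route each event to the alias matching its retention field. Searches can then use a wildcard such as logs-* without needing to know the tier.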
