About ILM policy

I have a client requirement to store 2 years of data in my production environment.
So how many days of data do I need to keep in each of the hot, warm, cold and delete phases?

Please explain briefly.

Hi @Sandeep_Mishra,

Welcome! Tbh it depends on your search needs. For your data, it would be useful to know whether it is still updated after it is ingested and for how long, and also whether it needs to stay searchable after that point. Do you have any performance requirements for searches on the older data?

I would recommend taking a look at the ILM documentation and the phases it describes. From there, think about which date ranges need to be searchable and which indices may still be updated, as that can help you identify indices suitable for the cold versus frozen tiers.

The general guidance is to keep recent data, which is more likely to be searched, in hot (the more costly storage), and then transition older data, which is less likely to be searched, to warm and then cold.

Hope that helps!

Hi @carly.richmond

Thanks for your response.

I want to keep the data in the hot phase for 30-60 days, move the rest into warm and then cold, and delete it after 730 days.

So please suggest: if I want to keep 30 days of data in the hot phase, what should the ILM policy look like? And if I want to keep 60 days of data in the hot phase, what should it look like then?

Then I'll determine whether 30 days or 60 days are appropriate for my needs.
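To make the 30-day versus 60-day comparison concrete, here is a minimal Python sketch that builds the two candidate policy bodies. The 730-day delete age comes from the post above; the 180-day warm-to-cold boundary, the rollover thresholds and the priority values are placeholders I picked for illustration, not recommendations. Also keep in mind that when rollover is used, `min_age` is counted from the rollover time, so the oldest documents in an index will be older than the phase age suggests.

```python
import json


def build_ilm_policy(hot_days, cold_after_days=180, delete_after_days=730):
    """Return an ILM policy body as a dict.

    hot_days: days an index stays in hot after rollover (30 or 60 in the thread).
    cold_after_days: hypothetical warm-to-cold boundary, not taken from the thread.
    delete_after_days: total retention; 730 days is the two-year requirement.
    """
    if not 0 < hot_days < cold_after_days < delete_after_days:
        raise ValueError("phase ages must increase: hot < cold < delete")
    return {
        "policy": {
            "phases": {
                "hot": {
                    "min_age": "0ms",
                    "actions": {
                        # Assumed rollover thresholds; tune to your shard sizes.
                        "rollover": {
                            "max_age": "30d",
                            "max_primary_shard_size": "50gb",
                        }
                    },
                },
                "warm": {
                    "min_age": f"{hot_days}d",
                    "actions": {"set_priority": {"priority": 50}},
                },
                "cold": {
                    "min_age": f"{cold_after_days}d",
                    "actions": {"set_priority": {"priority": 0}},
                },
                "delete": {
                    "min_age": f"{delete_after_days}d",
                    "actions": {"delete": {}},
                },
            }
        }
    }


policy_30 = build_ilm_policy(hot_days=30)
policy_60 = build_ilm_policy(hot_days=60)
print(json.dumps(policy_30, indent=2))
```

The printed JSON is the kind of body you would send with `PUT _ilm/policy/<your-policy-name>`. The only difference between the two variants is the warm phase `min_age` (`30d` versus `60d`).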

Note

  1. I have a 3-node cluster with 1.4 TB of total disk space (all 3 nodes are both master and data nodes).
  2. The data grows by approximately 30 GB per month:
    For 1 year: 30 GB * 12 months = 360 GB
    For 2 years: 360 GB * 2 = 720 GB
    So the total data size for 2 years is approximately 720 GB.
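As a quick check of the sizing above, here is a small Python sketch of the same arithmetic with an optional replica count. It assumes the 30 GB per month figure counts primary shards only, which the post does not state, and it treats 1.4 TB as 1400 GB. It ignores the free-space headroom a cluster needs in practice.

```python
def retained_gb(gb_per_month, months, replicas=0):
    """Storage needed to retain `months` of data, including replica copies."""
    return gb_per_month * months * (1 + replicas)


disk_gb = 1400  # 1.4 TB total across the three nodes, taken as 1400 GB

primaries_only = retained_gb(30, 24)               # no replicas
with_one_replica = retained_gb(30, 24, replicas=1)  # one replica per primary

print(primaries_only, with_one_replica, with_one_replica > disk_gb)
```

Under those assumptions, two years of data with one replica per shard would come to about 1440 GB, slightly more than the available disk, so it is worth confirming whether the 30 GB figure already includes replicas.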

Ok, and is this to be applied against existing data, new data going forward, or both? This tutorial should help you with creating the policy and the index template to apply the policy. If you follow through with your values it should work.

Hi @carly.richmond

I want to apply it to new data going forward, not to existing data.

Can you suggest how many days of data I should keep in the warm and cold phases for 2 years of data?

In the hot phase I want to keep either 30 days or 60 days of data.

In that case I would define a policy and index template by following this tutorial.

With regard to how long to keep data in warm or cold, I would check with your users how likely they are to access older data, and use that to set the number of days before the move to warm and to cold. I'm not sure whether your data belongs to a regulated industry; those usually give guidance on how long data needs to be stored, which could help, and then costs are balanced against that.

From my own prior experience, I would reserve warm for data that has a reasonable likelihood of being accessed, and cold for data that is very unlikely to be accessed and for which you would be comfortable with a slow query response time.

Hi there,

Are you able to tell me whether your cluster will remain at three nodes or are you planning to expand?

With the hot/warm/cold architecture, the tiers are all supposed to be on different hardware. If you only have a three-node cluster, then I am not sure how you plan to do it.

We have twelve data nodes in our logging cluster. All of our indexes are managed by ILM, but only to ensure an efficient shard size and to remove data automatically after a given time period. Everything is on hot nodes.
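For a small cluster, that hot-only approach could look roughly like the sketch below: a policy with just a hot phase (rollover to control shard size) and a delete phase. This is not our actual policy. The rollover thresholds are assumed values, and the 730-day retention is taken from the two-year requirement earlier in the thread.

```python
import json


def build_hot_only_policy(delete_after_days=730, max_shard_size="50gb", max_age="30d"):
    """Return an ILM policy body with only hot and delete phases.

    max_shard_size and max_age are assumed rollover thresholds for illustration.
    """
    return {
        "policy": {
            "phases": {
                "hot": {
                    "min_age": "0ms",
                    "actions": {
                        "rollover": {
                            "max_primary_shard_size": max_shard_size,
                            "max_age": max_age,
                        }
                    },
                },
                "delete": {
                    # Counted from rollover, so data is kept at least this long.
                    "min_age": f"{delete_after_days}d",
                    "actions": {"delete": {}},
                },
            }
        }
    }


hot_only_policy = build_hot_only_policy()
print(json.dumps(hot_only_policy, indent=2))
```

With no warm or cold phase, nothing has to move between node types, so the policy works on a cluster where every node has the same role.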

We did look at having tiers but - in my opinion - you may find yourself going through multiple iterations of hardware per tier before you fix on one that works.

Hope this helps.