I have a customer requirement where each customer can opt for X GB of data or Y days of data, and N records. The customer says they only want this much; if more data arrives, the oldest data should be deleted first. How can we achieve this using ILM? We do not want too many rollovers, because the index and shard count would grow and we could run into oversharding problems. We have a separate index for each customer. We are currently on Elasticsearch 7.16.
ILM works by deleting complete indices, so in order to use this you would need to create time-based indices, e.g. using rollover. To answer this there are a few additional questions that would be useful to have answers to:
How many customers do you need to support?
How much data does each customer typically have?
How long a retention period do you need to support?
Do you have control over the data being ingested and the mappings?
Are you giving access to this data through an API layer that you control or can the customer interact directly with the data, e.g. through Kibana?
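As a rough illustration of the point above that ILM deletes complete indices, here is a minimal sketch of a policy body that rolls over on age, primary shard size, or document count, and deletes each rolled-over index after a retention period. The helper name `build_retention_policy` and all the numbers are hypothetical; the body would be sent to `PUT _ilm/policy/<name>`. Note that the delete phase is driven by age only, so a per-customer size or record cap can only be approximated this way.

```python
import json


def build_retention_policy(max_age_days, max_primary_shard_size_gb, max_docs, retention_days):
    """Build an ILM policy body (hypothetical helper, illustrative values).

    The index rolls over when ANY of the three conditions is met, and each
    rolled-over index is deleted as a whole once it is retention_days old.
    """
    return {
        "policy": {
            "phases": {
                "hot": {
                    "actions": {
                        "rollover": {
                            "max_age": f"{max_age_days}d",
                            "max_primary_shard_size": f"{max_primary_shard_size_gb}gb",
                            "max_docs": max_docs,
                        }
                    }
                },
                "delete": {
                    # With rollover in use, min_age counts from the rollover time.
                    "min_age": f"{retention_days}d",
                    "actions": {"delete": {}},
                },
            }
        }
    }


# Assumed example values, not recommendations.
policy = build_retention_policy(
    max_age_days=7,
    max_primary_shard_size_gb=50,
    max_docs=200_000_000,
    retention_days=30,
)
print(json.dumps(policy, indent=2))
```

Larger rollover thresholds mean fewer indices and shards, at the cost of coarser deletion, since data is only ever removed one whole index at a time.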
How many customers do you need to support?
Around 1000 customers
How much data does each customer typically have?
200 customers can have 40,000 records per day, 400 customers can have 4 million records per day, and 400 customers can have 40 million records per day.
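To put those figures in one place, here is a small sketch that adds up the daily record volume per tier. It assumes the numbers above are per-customer upper bounds, which is an interpretation and not something stated explicitly; the names `tiers` and `daily_totals` are made up for the example.

```python
# (number of customers, records per customer per day), taken from the figures above.
# Assumption: each figure is a per-customer upper bound.
tiers = [
    (200, 40_000),
    (400, 4_000_000),
    (400, 40_000_000),
]


def daily_totals(tiers):
    """Return the total records per day contributed by each tier."""
    return [customers * records for customers, records in tiers]


per_tier = daily_totals(tiers)
total_customers = sum(customers for customers, _ in tiers)
total_records = sum(per_tier)

for (customers, records), subtotal in zip(tiers, per_tier):
    print(f"{customers} customers x {records:,} records/day = {subtotal:,}")
print(f"{total_customers} customers, {total_records:,} records/day in total")
```

Under that assumption the largest tier dominates the total, so rollover and retention settings for those customers matter most for the overall shard count.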