I had daily indices and a fixed retention time of 365 days via ILM. I have now switched to monthly indices to increase the index size. The problem is that when the oldest index is deleted, everything in it goes at once, including data that is only about 365-30 days old. What is a good way to get around this? In short: I want an exact retention time of 365 days, no more and no less. A naive idea is to reindex indices that become 11 months old into "daily" indices, but I hope there are better ways.
ILM deletes complete indices, so if you have monthly indices, a month is the deletion granularity. ILM therefore never gives you exact retention. With daily indices you get closer, but data will still only be deleted once per day.
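To make the granularity point concrete, here is a small arithmetic sketch. It assumes the ILM delete phase's `min_age` is counted from the day the index was created (no rollover), and the helper name `age_range_at_deletion` and the example dates are made up for illustration.

```python
from datetime import date, timedelta


def age_range_at_deletion(index_start, index_end, min_age_days):
    """Return (youngest, oldest) document age in days at the moment
    the whole index is deleted.

    Assumption: min_age is measured from index_start (index creation).
    index_end is the last day that wrote documents into the index.
    """
    deletion_day = index_start + timedelta(days=min_age_days)
    oldest = (deletion_day - index_start).days
    youngest = (deletion_day - index_end).days
    return youngest, oldest


# A monthly index covering January 2023, deleted 365 days after creation.
monthly = age_range_at_deletion(date(2023, 1, 1), date(2023, 1, 31), 365)

# A daily index covering only 1 January 2023, same policy.
daily = age_range_at_deletion(date(2023, 1, 1), date(2023, 1, 1), 365)

print(monthly)  # (335, 365)
print(daily)    # (365, 365)
```

Under these assumptions the monthly index loses documents that are only 335 days old, which is the now-(365-30) effect from the question. The daily case looks exact here only because the sketch works in whole days; within the day there is still up to 24 hours of slack.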
I know. That's why I am looking for a workaround. I just stumbled upon the Delete API and the Delete by query API. Until now I thought it was not possible to delete or alter documents in an index without deleting the whole index. I am wondering whether they really just delete documents or whether they cause the index to be recreated. I think I will run some performance tests with delete operations.
Delete by query deletes individual documents; it does not recreate the index. Deleting documents this way is, however, much less efficient than simply deleting full indices, as each delete is in reality an update that writes a tombstone record. It therefore results in much higher load and resource usage on the cluster.
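For the performance tests mentioned above, a delete by query request for "everything older than 365 days" could look roughly like the sketch below. The index pattern `my-logs-*` and the field name `@timestamp` are assumptions, not taken from this thread, and the helper `delete_by_query_request` is hypothetical. The code only builds the request; it does not contact a cluster.

```python
import json


def delete_by_query_request(index_pattern, timestamp_field, retention_days):
    """Build the path and JSON body for a delete by query call that
    removes documents older than retention_days.

    index_pattern and timestamp_field are placeholders; adjust them
    to the real index names and mapping.
    """
    # conflicts=proceed: keep going if a document changed in the meantime.
    # wait_for_completion=false: run as a background task.
    path = (
        f"/{index_pattern}/_delete_by_query"
        "?conflicts=proceed&wait_for_completion=false"
    )
    body = {
        "query": {
            "range": {
                timestamp_field: {"lt": f"now-{retention_days}d"}
            }
        }
    }
    return path, body


path, body = delete_by_query_request("my-logs-*", "@timestamp", 365)
print("POST", path)
print(json.dumps(body, indent=2))
```

One possible way to combine this with ILM, offered as an idea rather than a tested recipe: let ILM keep deleting whole monthly indices, but with a delete age long enough that no index is removed before its newest document is 365 days old, and run a request like the one above on a schedule (for example daily) to trim the documents that have already passed 365 days. The cost is the extra load described in the previous reply.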