I am currently running a single-node setup and would like to know whether implementing ILM was useful. I assumed that an ILM policy with the different phases would let me keep my data longer, since it seemed to me that indices in the cold tier take up less space. But that's not the impression I have:
The different phases do not change the storage size or resource usage unless you change index settings or perhaps forcemerge as part of the phase transitions. If none of this is done, the index is exactly the same irrespective of phase. You can host all phases on a single node, use ILM to manage retention, and use the different phases to apply or alter settings.
So if it doesn't reduce the size of the indices, is what I'm doing useless, or does it still reduce the load on Elasticsearch (on CPU or memory, maybe)? If it doesn't, I guess it's useless for me right now.
ILM helps you manage retention by deleting old indices. If you do not do this, the node will eventually fill up and stop indexing. I would recommend implementing a hot-warm approach where you forcemerge your indices down to a single segment at the transition from hot to warm. This generally reduces heap usage.
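As a rough sketch of that hot-warm-delete approach, the snippet below builds the JSON body for a lifecycle policy and prints it. The function name, the rollover thresholds (`7d`, `50gb`) and the 90-day retention are only illustrative assumptions, so adjust them to your own data volume. The `rollover`, `forcemerge` and `delete` actions are standard ILM actions, but check the exact options against the documentation for your Elasticsearch version.

```python
import json


def build_ilm_policy(rollover_age="7d", rollover_size="50gb", delete_after="90d"):
    """Build a hot/warm/delete ILM policy body (all values are example assumptions)."""
    return {
        "policy": {
            "phases": {
                # Hot: the index is written to until it rolls over.
                "hot": {
                    "actions": {
                        "rollover": {
                            "max_age": rollover_age,
                            "max_size": rollover_size,
                        }
                    }
                },
                # Warm: entered right after rollover; merge down to one segment.
                "warm": {
                    "min_age": "0d",
                    "actions": {"forcemerge": {"max_num_segments": 1}},
                },
                # Delete: retention limit, counted from rollover.
                "delete": {
                    "min_age": delete_after,
                    "actions": {"delete": {}},
                },
            }
        }
    }


policy = build_ilm_policy()
print(json.dumps(policy, indent=2))
```

The printed JSON can be used as the request body of `PUT _ilm/policy/<your-policy-name>`, for example from Kibana Dev Tools. All phases in this sketch can live on a single node, since no allocation rules are set.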
So if I do that, my warm-phase indices will be read-only ("Force merges the index into the specified maximum number of segments. This action makes the index read-only.") and will take up less space (perfect). But as for the cold phase, if I don't add any settings to it, is there any point in enabling it? Can I just define hot/warm/delete phases?