We are using a Curator job to move data from hot to warm nodes.
At the moment indices are created daily. Since the shard count is high, we are planning to change the indexing strategy from daily to monthly index creation.
With monthly indices, an index only becomes read-only at the end of the month. Can we move the active (read-write) index from the hot nodes to the warm nodes to reduce space consumption on the hot nodes?
If yes, will it affect the performance of the cluster?
Until now, with daily indices, we have been moving indices older than 4 days from hot to warm.
If the hot nodes cannot hold all the data after switching to monthly indices, that kind of defeats the purpose. I would generally not recommend moving indices that are still being written to onto warm nodes, as these typically have slower storage.
You could instead switch to using the rollover API to get indices of a certain target size. This means each index could cover a variable time period, but Curator supports basing actions on the age of the index or on the date interval contained in the index, so you should be able to configure it to move an index once it contains no data newer than X days.
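As a rough illustration of the rollover idea, here is a minimal Python sketch that only builds the request and settings bodies, without talking to a cluster. The alias name `logs-write`, the thresholds, and the node attribute `box_type` are assumptions made for the example (the attribute is whatever your nodes are actually tagged with), and which rollover conditions are available depends on your Elasticsearch version.

```python
from datetime import date, timedelta


def rollover_request(alias, max_size="50gb", max_age="30d"):
    """Build (method, path, body) for a rollover call on a write alias.

    The thresholds are placeholders; pick values that suit your hot nodes.
    """
    body = {"conditions": {"max_size": max_size, "max_age": max_age}}
    return "POST", f"/{alias}/_rollover", body


def warm_allocation_settings(attribute="box_type", value="warm"):
    """Index settings that require shards to live on nodes tagged as warm.

    'box_type' is an assumed node attribute name, not something from the thread.
    """
    return {f"index.routing.allocation.require.{attribute}": value}


def ready_for_warm(newest_doc_date, today, min_age_days):
    """True once the index contains no data newer than min_age_days."""
    return (today - newest_doc_date) >= timedelta(days=min_age_days)


if __name__ == "__main__":
    print(rollover_request("logs-write"))
    print(warm_allocation_settings())
    print(ready_for_warm(date(2024, 1, 1), date(2024, 1, 5), 4))
```

In practice Curator would do the age check and the allocation change for you; the sketch only shows what the two steps amount to.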
Anything that uses resources, whether indexing into a warm node or moving shards between nodes, can affect performance. If implementing rollover will take a while, maybe switching to weekly indices and perhaps reducing the number of primary shards is a good intermediate step?
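To get a feel for how much weekly indices with fewer primaries would cut the shard count, a back-of-the-envelope calculation like the one below may help. The retention period, primary counts, and replica count are made-up numbers for illustration, not values from this cluster.

```python
import math


def total_shards(retention_days, days_per_index, primaries, replicas=1):
    """Approximate shard count for time-based indices kept for retention_days."""
    indices = math.ceil(retention_days / days_per_index)
    # Each index has `primaries` primary shards plus `replicas` copies of each.
    return indices * primaries * (1 + replicas)


if __name__ == "__main__":
    # Hypothetical setup: 90 days of retention, 1 replica.
    print("daily, 5 primaries: ", total_shards(90, 1, 5))
    print("weekly, 2 primaries:", total_shards(90, 7, 2))
    print("monthly, 2 primaries:", total_shards(90, 30, 2))
```

Plugging in your own retention and shard settings shows whether weekly indices already bring the shard count down far enough while still letting you move data off the hot nodes every week.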