We've got an ELK cluster running Elasticsearch 5.0 for log indexing, collecting logs from Kafka via Logstash. Each node has 1 TB of SSD and 2 TB of HDD storage. At the moment, Elasticsearch is only using the SSD volumes.
However, we only need a certain amount of recent data kept on the SSDs as hot data; after that, we would like to move it to the HDDs. In other words, the classic pattern of retiring cold data to slower storage.
I found some tips on how to do this, but I wanted to ask whether there are established best practices or a recommended approach?
This blog post outlines how this works in a Hot/Warm architecture. It does, however, require the different types of storage to belong to different Elasticsearch nodes that can be tagged appropriately. If you have both types of storage on the same hosts, you may therefore need to run multiple Elasticsearch nodes per host.
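For example, with both drive types on one host, you could run two Elasticsearch processes side by side and tag each with a custom node attribute. A minimal sketch; the `box_type` attribute name and the data paths are arbitrary choices for illustration, not built-in settings:

```yaml
# elasticsearch.yml for the SSD-backed node on a host
node.name: node-1-hot
node.attr.box_type: hot          # custom attribute used to steer shard allocation
path.data: /mnt/ssd/elasticsearch

# elasticsearch.yml for the HDD-backed node on the same host
node.name: node-1-warm
node.attr.box_type: warm
path.data: /mnt/hdd/elasticsearch
```

New indices can then be pinned to the hot nodes via an index template, and later re-routed to the warm nodes by updating the same setting (index and template names here are hypothetical):

```
PUT _template/logstash_hot
{
  "template": "logstash-*",
  "settings": {
    "index.routing.allocation.require.box_type": "hot"
  }
}

PUT logstash-2016.12.01/_settings
{
  "index.routing.allocation.require.box_type": "warm"
}
```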
The Curator example in that blog post uses the outdated 3.0 syntax. The current release of Curator is 4.2.3, and the allocation action documentation is here.
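With Curator 4.x, that re-routing can be driven from an action file. Something like the following sketch would apply the warm routing to older indices; the `logstash-` prefix, the 3-day threshold, and the `box_type` key/value mirror the tagging above and are assumptions about your setup:

```yaml
actions:
  1:
    action: allocation
    description: >-
      Require box_type:warm on logstash- indices older than 3 days,
      so their shards migrate to the HDD-backed nodes.
    options:
      key: box_type
      value: warm
      allocation_type: require
      wait_for_completion: False
    filters:
    - filtertype: pattern
      kind: prefix
      value: logstash-
    - filtertype: age
      source: name
      direction: older
      timestring: '%Y.%m.%d'
      unit: days
      unit_count: 3
```

You would then run this on a schedule (e.g. via cron) with `curator --config curator.yml action_file.yml`.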
Thank you both, @Christian_Dahlqvist and @theuntergeek. That's one of the approaches we were considering; I thought there might be other ways of achieving this.
We'll test it, and I'll happily share our results.