ILM Hot, Warm, Cold?

Hello,
When I try to set up ILM so that an index becomes "Warm" after 30 days, then "Cold" after 60 days, and is deleted after 90 days, I get the following:

You can't control shard allocation without node attributes.

Can someone help me with what to do next?

Thanks!

To get started, have a look here and here.

Hi @lamp123432

This warning is telling you that you need to configure your Elasticsearch nodes before using ILM in this way. Open your elasticsearch.yml and use the following setting:

node.attr.temp: warm
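# "temp" is an arbitrary custom attribute name; it just has to match the attribute your ILM policy references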

This tells Elasticsearch where to "move" indices that your ILM policy has marked as "warm". For more details, refer to the links @egalpin posted.
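
To illustrate, here is a minimal sketch of a matching policy for the 30/60/90-day schedule you described, assuming the "temp" attribute from above (the policy name is just a placeholder):

# "my-30-60-90-policy" is a placeholder name
PUT _ilm/policy/my-30-60-90-policy
{
  "policy": {
    "phases": {
      "warm": {
        "min_age": "30d",
        "actions": {
          "allocate": {
            "require": { "temp": "warm" }
          }
        }
      },
      "cold": {
        "min_age": "60d",
        "actions": {
          "allocate": {
            "require": { "temp": "cold" }
          }
        }
      },
      "delete": {
        "min_age": "90d",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}

The allocate actions move shards to nodes whose node.attr.temp matches, and the delete phase removes the index entirely after 90 days.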

Hello, thanks for the reply. If I only have one node, does this mean I cannot use this feature?

Even with multiple nodes, should one node be dedicated to "warm" indices while the other nodes handle "hot" and "cold"?

Hey
Well, if you have a single-node setup, then a hot-warm-cold architecture doesn't make much sense. The motivation behind it is cost saving: if you store sequential, time-based data (for example logs), the data becomes less relevant after a while, and indexing/querying no longer needs to be as fast and performant.

Example:

  • 3 data nodes – one is "hot", one is "warm" and one is "cold"
  • The "hot" node runs on really expensive, super fast hardware, while "warm" and "cold" run on cheap, old hardware
  • Now you can configure your cluster so only the most recent logs (for example the past 14 days) remain on the "hot" node --> super fast indexing and querying
  • After 14 days the indices are moved to the "warm" node
  • After 28 days the indices are moved to the "cold" node (see the sketch below)
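
As a sketch of that example, each node would get its own attribute value in its elasticsearch.yml (the attribute name "temp" is arbitrary; it just has to match what your ILM policy requires):

# on the "hot" node
node.attr.temp: hot

# on the "warm" node
node.attr.temp: warm

# on the "cold" node
node.attr.temp: cold

New indices can then start out on the "hot" node by setting index.routing.allocation.require.temp: hot in the index template that creates them.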

Great explanation! I have a client with 3 nodes, so this will be beneficial.

But I have a second client with only one node. After some time, Logstash cannot send anything to Elasticsearch because it complains that all the shards have been allocated. This happens after about 3 months, and then I have to delete old indices. That's why I thought ILM policies could help me keep the data longer.

In this case, do you think roll-ups would be my only option?

Do you have a more precise error? All shards being "allocated" is actually a good thing and should not cause any problems.
Nevertheless, using Index Lifecycle Policies is best practice and saves you a lot of manual work and disk space when configured correctly :wink:
You can use those policies no matter how many nodes you have - just moving shards based on "temperature" won't be possible on a single-node cluster.
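
For example, here is a minimal sketch of a delete-only policy that works fine on a single-node cluster, because it never tries to relocate shards:

# placeholder policy name; no allocate actions, so no node attributes needed
PUT _ilm/policy/delete-after-90d
{
  "policy": {
    "phases": {
      "delete": {
        "min_age": "90d",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}

Attach a policy like this to your indices via an index template and Elasticsearch will remove them automatically after 90 days.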

Here's the error:

Dec 04 18:01:09 node1 logstash[1578]: [2020-12-04T18:01:09,812][WARN ][logstash.outputs.elasticsearch][netflow][4bs9s93560b30d657e23c4241487efc] Could not index event to Elasticsearch. {:status=>400, :action=>["index", {:_id=>nil, :_index=>"netflow-node1-2020.12.04", :routing=>nil, :_type=>"_doc"}, #<LogStash::Event:0x6c768f05>], :response=>{"index"=>{"_index"=>"netflow-node1-2020.12.04", "_type"=>"_doc", "_id"=>nil, "status"=>400, "error"=>{"type"=>"illegal_argument_exception", "reason"=>"Validation Failed: 1: this action would add [6] total shards, but this cluster currently has [2000]/[2000] maximum shards open;"}}}}

Logstash cannot send new data to Elasticsearch, because a new index would need to be created and there are not enough shards available.

This seems to tell me that too many shards are open and I need to close a few of them. So without dedicated "warm" or "cold" nodes, is my only option to delete indices to free up some shards?

What is the output of:

GET /
GET /_cat/nodes?v
GET /_cat/health?v
GET /_cat/indices?v

If some outputs are too big, please share them on gist.github.com and link them here.
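
In the meantime: the "[2000]/[2000] maximum shards open" part of your error comes from the cluster.max_shards_per_node setting, which defaults to 1000 per data node. As a stop-gap sketch you could raise it, but lots of small shards waste heap, so an ILM delete phase (or fewer shards per index) is the better fix:

# stop-gap only; 3000 is just an example value
PUT _cluster/settings
{
  "persistent": {
    "cluster.max_shards_per_node": 3000
  }
}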