Unable to allocate shards in ILM cold phase

Hi,
I have a 2-node cluster running 7.10.2. I have defined an ILM policy to move indices through the hot, warm, and cold phases, but the transition from warm to cold is not happening. I have not explicitly set any node roles, so I expect every node to perform all roles. However, when I run the cluster allocation explain API, I get the error below, saying that the "index.routing.allocation.require" condition for cold was not met. How do I fix this?

"node_allocation_decisions" : [
    {
      "node_id" : "GNR2p8B7QaeL86xWa6o3kQ",
      "node_name" : "mynodename",
      "transport_address" : "123.123.123.123:9300",
      "node_attributes" : {
        "xpack.installed" : "true",
        "transform.node" : "true"
      },
      "node_decision" : "no",
      "deciders" : [
        {
          "decider" : "filter",
          "decision" : "NO",
          "explanation" : """node does not match index setting [index.routing.allocation.require] filters [type:"cold"]"""
        }
      ]

Thanks!

What does your policy look like, and what's the output from an explain of it?
What's the output from GET /_cat/nodeattrs?v?
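
For reference, those checks could be run from Kibana Dev Tools roughly as follows. This is only a sketch: the index name my-index-000001 is a placeholder that does not come from this thread, and the allocation explain body assumes you want shard 0 and its primary copy.

```console
# Which ILM phase, action, and step is the index currently in?
GET my-index-000001/_ilm/explain

# Custom node attributes, one row per node and attribute
GET _cat/nodeattrs?v

# Why a specific shard copy cannot be allocated
GET _cluster/allocation/explain
{
  "index": "my-index-000001",
  "shard": 0,
  "primary": true
}
```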

nodeattrs shows this output.

node        host                 ip              attr            value
mynodename1 myhost1.mydomain.com 123.123.123.123 xpack.installed true
mynodename1 myhost1.mydomain.com 123.123.123.123 transform.node  true
mynodename2 myhost2.mydomain.com 123.123.123.124 xpack.installed true
mynodename2 myhost2.mydomain.com 123.123.123.124 transform.node  true

My ILM policy

PUT _ilm/policy/my_data_rollover_policy
{
  "policy": {
    "phases": {
      "hot": {
        "min_age": "0ms",
        "actions": {
          "rollover": {
            "max_size": "3gb",
            "max_age": "15d"
          }
        }
      },
      "warm": {
        "min_age": "7d",
        "actions": {
          "allocate": {
            "number_of_replicas": 2,
            "include": {},
            "exclude": {}
          },
          "shrink": {
            "number_of_shards": 5
          },
          "forcemerge": {
            "max_num_segments": 1
          }
        }
      },
      "cold": {
        "min_age": "30d",
        "actions": {
          "allocate": {
            "include": {},
            "exclude": {},
            "require": {
              "type": "cold"
            }
          }
        }
      },
      "delete": {
        "min_age": "60d",
        "actions": {
          "delete": {
            "delete_searchable_snapshot": true
          }
        }
      }
    }
  }
}

hi @warkolm,

Any suggestions?

Thanks!

hi @warkolm,

Any suggestions? We are stuck on this issue.

Thanks!

You'll need to check your config to make sure that you have nodes with the cold role allocated.

As I mentioned earlier, I have not configured any specific node roles. When I run _nodes, it shows that all nodes have the data_cold role. Do I need to re-create the policy after upgrading Elasticsearch so that it picks up the new role names? We are currently on 7.10.2, but the policy was created on 7.2.0 and then updated on 7.8.1. Could that be the issue?

"roles" : [
        "data",
        "data_cold",
        "data_content",
        "data_hot",
        "data_warm",
        "ingest",
        "master",
        "remote_cluster_client",
        "transform"
      ],

The allocate action in the cold phase is relying on node attributes.

        "actions": {
          "allocate": {
            "include": {},
            "exclude": {},
            "require": {
              "type": "cold"
            }
          }

This configuration requires a node attribute named type with the value cold. I believe you need to tag your nodes with the type attribute and the corresponding value (warm for warm nodes, cold for cold nodes, etc.), for example node.attr.type: cold in elasticsearch.yml. You would also need to update your policy to use node attributes in the warm phase: its allocate action currently defines no allocation rules, so ILM will automatically inject a migrate action.
Also be aware that new indices receive the _tier_preference routing setting automatically, as described here. There are ways to opt out of this as well, described in the same post (you usually don't want to mix data tier allocation and custom node attribute values).

Alternatively you can migrate to only using node roles allocation (data tiers) as described in the migration guide.
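
As a rough illustration of the data tiers route, the cold phase might look something like the sketch below. It assumes 7.10 or later and nodes that carry the data_cold role (which the _nodes output earlier in this thread shows). Only the cold phase is written out; a PUT replaces the whole policy, so the hot, warm, and delete phases would need to be included unchanged. With no allocate rules present, ILM should inject the migrate action by itself; it is spelled out here only to make the intent visible. The second request uses a placeholder index name (my-index-000001, not from this thread) and clears the require filter that an index may already have picked up from the old policy, which is what the allocation explain output above is complaining about.

```console
# Cold phase without the custom "type" requirement.
# Other phases omitted here; include them unchanged in a real request.
PUT _ilm/policy/my_data_rollover_policy
{
  "policy": {
    "phases": {
      "cold": {
        "min_age": "30d",
        "actions": {
          "migrate": {
            "enabled": true
          }
        }
      }
    }
  }
}

# Remove the stale attribute filter from an index that already has it
PUT my-index-000001/_settings
{
  "index.routing.allocation.require.type": null
}
```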

Thanks @andreidan. I updated the ILM policy to remove the type requirement. After that, the rollover worked properly.