Resolve unassigned shards?

I have an unassigned shards issue on a newly set up cluster where the ILM policy is not moving the shards from hot to warm.

The ILM policy is set to roll over after 1d or 50gb, but allocation is failing with "node does not match index setting".
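For reference, the allocation filters currently applied to the index can be checked with something like the request below (index name taken from the allocation explain output further down):

GET filebeat-7.12.1-2021.06.23-000217/_settings?filter_path=*.settings.index.routing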

...
      "phases" : {
        "warm" : {
          "min_age" : "7d",
          "actions" : {
            "set_priority" : {
              "priority" : 50
            }
          }
        }
...
        "hot" : {
          "min_age" : "0ms",
          "actions" : {
            "rollover" : {
              "max_primary_shard_size" : "50gb",
              "max_age" : "1d"
            },
            "set_priority" : {
              "priority" : 100
            }
          }
        }
# GET /_cat/health?v
epoch      timestamp cluster                          status node.total node.data shards pri relo init unassign pending_tasks max_task_wait_time active_shards_percent
1624454091 13:14:51  14e1656e7d5444eb8cc308b30ae1418a yellow          5         4     76  76    0    0        1             0                  -                 98.7%
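(Side note: the unassigned shard can also be listed directly with something like the request below - output not included here.)

GET _cat/shards?v&h=index,shard,prirep,state,unassigned.reason&s=state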
# GET /_cluster/allocation/explain?pretty
{
  "index" : "filebeat-7.12.1-2021.06.23-000217",
  "shard" : 0,
  "primary" : false,
  "current_state" : "unassigned",
  "unassigned_info" : {
    "reason" : "INDEX_CREATED",
    "at" : "2021-06-23T09:34:10.741Z",
    "last_allocation_status" : "no_attempt"
  },
  "can_allocate" : "no",
  "allocate_explanation" : "cannot allocate because allocation is not permitted to any of the nodes",
  "node_allocation_decisions" : [
    {
      "node_id" : "oV7Ky2keQyyFODvx5MHMuQ",
      "node_name" : "instance-0000000009",
      "transport_address" : "10.46.88.30:19523",
      "node_attributes" : {
        "logical_availability_zone" : "zone-0",
        "server_name" : "instance-0000000009.14e1656e7d5444eb8cc308b30ae1418a",
        "availability_zone" : "westus2-3",
        "xpack.installed" : "true",
        "data" : "cold",
        "instance_configuration" : "azure.data.highstorage.e16sv3",
        "transform.node" : "false",
        "region" : "unknown-region"
      },
      "node_decision" : "no",
      "weight_ranking" : 1,
      "deciders" : [
        {
          "decider" : "filter",
          "decision" : "NO",
          "explanation" : """node does not match index setting [index.routing.allocation.require] filters [data:"hot"]"""
        }
      ]
    },
    {
      "node_id" : "wLGNkgAYQ7yqQgUtedIiXw",
      "node_name" : "instance-0000000010",
      "transport_address" : "10.46.88.61:19471",
      "node_attributes" : {
        "logical_availability_zone" : "zone-0",
        "server_name" : "instance-0000000010.14e1656e7d5444eb8cc308b30ae1418a",
        "availability_zone" : "westus2-3",
        "xpack.installed" : "true",
        "data" : "frozen",
        "instance_configuration" : "azure.es.datafrozen.lsv2",
        "transform.node" : "false",
        "region" : "unknown-region"
      },
      "node_decision" : "no",
      "weight_ranking" : 2,
      "deciders" : [
        {
          "decider" : "filter",
          "decision" : "NO",
          "explanation" : """node does not match index setting [index.routing.allocation.require] filters [data:"hot"]"""
        },
        {
          "decider" : "disk_threshold",
          "decision" : "NO",
          "explanation" : "the node is above the low watermark cluster setting [cluster.routing.allocation.disk.watermark.low=85%], using more disk space than the maximum allowed [85.0%], actual free: [9.980344772338867%]"
        },
        {
          "decider" : "dedicated_frozen_node",
          "decision" : "NO",
          "explanation" : "this node's data roles are exactly [data_frozen] so it may only hold shards from frozen searchable snapshots, but this index is not a frozen searchable snapshot"
        }
      ]
    },
    {
      "node_id" : "NBchKHq-R-O2qp_XrYDHCQ",
      "node_name" : "instance-0000000007",
      "transport_address" : "10.46.88.23:19757",
      "node_attributes" : {
        "logical_availability_zone" : "zone-0",
        "server_name" : "instance-0000000007.14e1656e7d5444eb8cc308b30ae1418a",
        "availability_zone" : "westus2-3",
        "xpack.installed" : "true",
        "data" : "warm",
        "instance_configuration" : "azure.data.highstorage.e16sv3",
        "transform.node" : "false",
        "region" : "unknown-region"
      },
      "node_decision" : "no",
      "weight_ranking" : 3,
      "deciders" : [
        {
          "decider" : "filter",
          "decision" : "NO",
          "explanation" : """node does not match index setting [index.routing.allocation.require] filters [data:"hot"]"""
        }
      ]
    },
    {
      "node_id" : "6la3uCyRQaOTDo55VQOOEw",
      "node_name" : "instance-0000000012",
      "transport_address" : "10.46.88.125:19791",
      "node_attributes" : {
        "logical_availability_zone" : "zone-0",
        "server_name" : "instance-0000000012.14e1656e7d5444eb8cc308b30ae1418a",
        "availability_zone" : "westus2-1",
        "xpack.installed" : "true",
        "data" : "hot",
        "instance_configuration" : "azure.data.highio.l32sv2",
        "transform.node" : "true",
        "region" : "unknown-region"
      },
      "node_decision" : "no",
      "weight_ranking" : 4,
      "deciders" : [
        {
          "decider" : "same_shard",
          "decision" : "NO",
          "explanation" : "a copy of this shard is already allocated to this node [[filebeat-7.12.1-2021.06.23-000217][0], node[6la3uCyRQaOTDo55VQOOEw], [P], s[STARTED], a[id=eFlQVZQ0QqKnphk43b7b8Q]]"
        }
      ]
    }
  ]
}

# GET filebeat-7.12.1-2021.06.23-000217/_ilm/explain
{
  "indices" : {
    "filebeat-7.12.1-2021.06.23-000217" : {
      "index" : "filebeat-7.12.1-2021.06.23-000217",
      "managed" : true,
      "policy" : "filebeat",
      "lifecycle_date_millis" : 1624440850656,
      "age" : "3.63h",
      "phase" : "hot",
      "phase_time_millis" : 1624440850976,
      "action" : "rollover",
      "action_time_millis" : 1624440851432,
      "step" : "check-rollover-ready",
      "step_time_millis" : 1624440851432,
      "phase_execution" : {
        "policy" : "filebeat",
        "phase_definition" : {
          "min_age" : "0ms",
          "actions" : {
            "rollover" : {
              "max_primary_shard_size" : "50gb",
              "max_age" : "1d"
            },
            "set_priority" : {
              "priority" : 100
            }
          }
        },
        "version" : 16,
        "modified_date_in_millis" : 1623638569137
      }
    }
  }
}

Someone else has a similar issue, but there was no response...

What's the output from _cat/nodes?v?

ip           heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
10.46.88.125           28         100  16    8.16    9.12     8.39 himrst    *      instance-0000000012
10.46.88.30            61          86  19   13.27    9.39     8.33 cr        -      instance-0000000009
10.46.88.23            40          95   3    0.65    0.95     1.17 rw        -      instance-0000000007
10.46.88.61            63          94   2    0.81    0.66     0.68 f         -      instance-0000000010
10.46.88.37            60          90   2    1.98    2.17     2.10 lr        -      instance-0000000013
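(Only instance-0000000012, role letters himrst, appears to carry the hot (h) data role here; if the abbreviations are unclear, the full role names per node can be pulled with something like the request below.)

GET _nodes?filter_path=nodes.*.name,nodes.*.roles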

Hi @lchan - I checked your deployment and noticed that you have an active subscription with us. You have been in touch with our support team and it seems that the problem has been resolved.

As discussed with our team, the filebeat-7.12.1-2021.06.23-000217 replica shard could not be assigned in the hot phase because there is only one hot node in your deployment. Two copies of the same shard cannot be allocated on the same node.
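If the deployment is meant to stay on a single hot node, one option - just a sketch, since dropping replicas also reduces resilience - is to set the replica count of the affected indices to 0, for example:

PUT filebeat-7.12.1-2021.06.23-000217/_settings
{
  "index": {
    "number_of_replicas": 0
  }
}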

Feel free to engage with our support team at any time; it is one of the perks of your subscription :slight_smile:


Hi @ropc,

I sometimes post here so other users can benefit as well.

Thanks for going above and beyond in looking into this issue. (By the way, how did you find my deployment ID? By IP address? Even when I open a support ticket, I am asked for the deployment ID if I forget to include one.)

Yes, the issue was actually twofold. The unassigned shards DO NOT affect ILM, which I had thought they would.

  • Unassigned shards - caused by the default number_of_replicas of 1.
  • The index has a data:hot allocation requirement on "index.routing.allocation.require.data", inherited from the legacy .cloud-hot-warm-allocation-0 template that filters indices (see the sketch below).
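For anyone hitting the same thing, roughly what I checked and changed (illustrative requests - adjust the template and index names to your own deployment):

# inspect the legacy template that applies the data:hot filter
GET _template/.cloud-hot-warm-allocation-0

# clear the legacy require filter on an affected index
PUT filebeat-7.12.1-2021.06.23-000217/_settings
{
  "index.routing.allocation.require.data": null
}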

I sometimes post here so other users can benefit as well.

Thank you for being active in our community. I love the spirit! :slight_smile:

Thanks for going above and beyond in looking into this issue. (By the way, how did you find my deployment ID? By IP address?

The cluster ID was in the _cat/health output you provided earlier. I used that information to correlate your deployment in our backend systems.

Even when I open a support ticket, I am asked for the deployment ID if I forget to include one.)

Correct - the support team may require confirmation of the deployment ID (in case your account has many deployments). A good practice is to always provide the deployment ID whenever you open a support case with us - this speeds up the overall investigation.

The index has a data:hot allocation requirement on "index.routing.allocation.require.data", inherited from the legacy .cloud-hot-warm-allocation-0 template that filters indices.

That's right - this is actually related to Migrate index allocation filters to node roles.
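In practice the migration boils down to swapping the custom data attribute filter for the built-in tier preference - roughly like the sketch below (exact values depend on the index and on how you choose to migrate your templates):

PUT filebeat-*/_settings
{
  "index.routing.allocation.require.data": null,
  "index.routing.allocation.include._tier_preference": "data_hot"
}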
