ILM stalled waiting for shard copies to be active

Hello!

I have an ILM policy configured, but it stops at the rollover phase.

I checked one of the indices with this query:

GET /apm-7.10.0-error-000001/_ilm/explain?human

And the response is:

{
  "indices" : {
    "apm-7.10.0-error-000001" : {
      "index" : "apm-7.10.0-error-000001",
      "managed" : true,
      "policy" : "apm-rollover-30-days",
      "lifecycle_date" : "2021-01-13T22:25:52.555Z",
      "lifecycle_date_millis" : 1610576752555,
      "age" : "181.36d",
      "phase" : "warm",
      "phase_time" : "2021-02-12T22:34:14.211Z",
      "phase_time_millis" : 1613169254211,
      "action" : "migrate",
      "action_time" : "2021-02-12T22:35:00.176Z",
      "action_time_millis" : 1613169300176,
      "step" : "check-migration",
      "step_time" : "2021-02-12T22:35:17.987Z",
      "step_time_millis" : 1613169317987,
      "step_info" : {
        "message" : "Waiting for all shard copies to be active",
        "shards_left_to_allocate" : -1,
        "all_shards_active" : false,
        "number_of_replicas" : 1
      },
      "phase_execution" : {
        "policy" : "apm-rollover-30-days",
        "phase_definition" : {
          "min_age" : "14d",
          "actions" : {
            "readonly" : { },
            "set_priority" : {
              "priority" : 50
            }
          }
        },
        "version" : 3,
        "modified_date" : "2021-05-27T23:52:01.125Z",
        "modified_date_in_millis" : 1622159521125
      }
    }
  }
}

Then I ran this query:

GET _cluster/allocation/explain

And I received this response:

{
  "index" : "apm-7.10.0-span-000009",
  "shard" : 0,
  "primary" : false,
  "current_state" : "unassigned",
  "unassigned_info" : {
    "reason" : "CLUSTER_RECOVERED",
    "at" : "2021-07-06T16:01:24.109Z",
    "last_allocation_status" : "no_attempt"
  },
  "can_allocate" : "no",
  "allocate_explanation" : "cannot allocate because allocation is not permitted to any of the nodes",
  "node_allocation_decisions" : [
    {
      "node_id" : "jwbdx5IzTfeqlA3r7Rnmlg",
      "node_name" : "b6be8b05ea96",
      "transport_address" : "172.18.0.4:9300",
      "node_attributes" : {
        "ml.machine_memory" : "3221225472",
        "xpack.installed" : "true",
        "transform.node" : "true",
        "ml.max_open_jobs" : "20"
      },
      "node_decision" : "no",
      "deciders" : [
        {
          "decider" : "same_shard",
          "decision" : "NO",
          "explanation" : "a copy of this shard is already allocated to this node [[apm-7.10.0-span-000009][0], node[jwbdx5IzTfeqlA3r7Rnmlg], [P], s[STARTED], a[id=RF-1z_7OQxeixisbtoe06Q]]"
        }
      ]
    }
  ]
}

Can you help me understand what I am looking at, and why my index lifecycle policy is not working? Could it be related to not having enough disk space for all the shards?

Thanks!

Hey,

Very high-level guess from the data: are you running a single node, but have your shards configured with a replica?
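You can check this quickly in Dev Tools (the index name below is taken from your allocation explain output):

GET _cat/nodes?v
GET _cat/shards/apm-7.10.0-span-000009?v

If the first call returns a single node and the second lists replica shards (marked r) as UNASSIGNED, that is the cause: Elasticsearch never allocates a replica to the same node that already holds the primary, which is exactly what the same_shard decider in your output is saying.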

--Alex

Indeed, that's the case! I wasn't aware that there can be only one copy of a shard per node. The decider explanation makes sense now.

I think in my case it will be better to have no replica shards. Will changing the configuration of the indices be enough to solve the issue, or should I do something else to handle those unallocated shards?

Changing the configuration on the indices is enough for the ILM policy to proceed. However, you probably also want future index creations to work, so you would need to adapt the index template's number of replicas as well.
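For the existing indices, something like this drops the replicas (the wildcard pattern is an assumption based on your index names, so adjust it to your setup):

PUT /apm-7.10.0-*/_settings
{
  "index" : {
    "number_of_replicas" : 0
  }
}

The stuck check-migration step is a waiting step, so ILM should pick the change up on its next poll (every 10 minutes by default) without further action. For future indices, find the matching template with GET _template/apm* and set "number_of_replicas" : 0 in its settings too.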
