Shard Allocation Failed - Even Manually

Hey,

A few weeks ago we hit the 10k shard limit in our cluster. That problem is solved now, but ever since we have had 256 replica shards that stay unassigned. It is also not possible to reassign the shards manually; see the details below. I hope someone can help us here.
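For reference, something like the following should list all unassigned shards and their reasons across the whole cluster (the s parameter sorts so the UNASSIGNED entries come first):

GET _cat/shards?v&h=index,shard,prirep,state,unassigned.reason&s=state:desc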

Example for July 14th:

GET _cat/shards/logstash-operations-2020.07.14?h=index,shard,prirep,state,unassigned.reason,node

logstash-operations-2020.07.14 4 p STARTED              elastic01-warm
logstash-operations-2020.07.14 4 r STARTED              elastic05-warm
logstash-operations-2020.07.14 4 r UNASSIGNED NODE_LEFT 
logstash-operations-2020.07.14 2 r STARTED              elastic01-warm
logstash-operations-2020.07.14 2 r STARTED              elastic02-warm
logstash-operations-2020.07.14 2 p STARTED              elastic04-warm
logstash-operations-2020.07.14 1 r STARTED              elastic01-warm
logstash-operations-2020.07.14 1 p STARTED              elastic04-warm
logstash-operations-2020.07.14 1 r STARTED              elastic05-warm
logstash-operations-2020.07.14 3 p STARTED              elastic01-warm
logstash-operations-2020.07.14 3 r STARTED              elastic02-warm
logstash-operations-2020.07.14 3 r UNASSIGNED NODE_LEFT 
logstash-operations-2020.07.14 0 r STARTED              elastic01-warm
logstash-operations-2020.07.14 0 p STARTED              elastic02-warm
logstash-operations-2020.07.14 0 r UNASSIGNED NODE_LEFT 

GET logstash-operations-2020.07.14/_settings

{
  "logstash-operations-2020.07.14" : {
    "settings" : {
      "index" : {
        "lifecycle" : {
          "name" : "move_to_warm_after_14d"
        },
        "routing" : {
          "allocation" : {
            "require" : {
              "_id" : "mnioyEK8R2iz5rQErRb9Jw",
              "box_type" : "warm"
            }
          }
        },
        "mapping" : {
          "ignore_malformed" : "true"
        },
        "refresh_interval" : "5s",
        "number_of_shards" : "5",
        "blocks" : {
          "write" : "true"
        },
        "provided_name" : "logstash-operations-2020.07.14",
        "creation_date" : "1594684800601",
        "priority" : "50",
        "number_of_replicas" : "2",
        "uuid" : "_wcxTvkVSYCUqCh4qxbm3g",
        "version" : {
          "created" : "7080099"
        }
      }
    }
  }
}

GET _cat/nodeattrs?v

node           host        ip          attr              value
elastic03-warm 10.0.24.12  10.0.24.12  ml.machine_memory 67189415936
elastic03-warm 10.0.24.12  10.0.24.12  ml.max_open_jobs  20
elastic03-warm 10.0.24.12  10.0.24.12  xpack.installed   true
elastic03-warm 10.0.24.12  10.0.24.12  box_type          warm
elastic03-warm 10.0.24.12  10.0.24.12  transform.node    true

Trying to reallocate:

POST /_cluster/reroute?retry_failed=true
{
  "commands" : [
  {
    "allocate_replica" : {
       "index" : "logstash-operations-2020.07.14", "shard" : 4,
       "node" : "elastic03-warm"
     }
  }]
}

Response

{
  "error" : {
    "root_cause" : [
      {
        "type" : "illegal_argument_exception",
        "reason" : "[allocate_replica] allocation of [logstash-operations-2020.07.14][4] on node {elastic03-warm}{OEsq71lQTXOGnRVaSTnsFQ}{wM1bkwqoQB-5ozwJ8aEisg}{10.0.24.12}{10.0.24.12:9300}{dilrt}{ml.machine_memory=67189415936, ml.max_open_jobs=20, xpack.installed=true, box_type=warm, transform.node=true} is not allowed, reason: [YES(shard has no previous failures)][YES(primary shard for this replica is already active)][YES(explicitly ignoring any disabling of allocation due to manual allocation commands via the reroute API)][YES(can allocate replica shard to a node with version [7.8.0] since this is equal-or-newer than the primary version [7.8.0])][YES(the shard is not being snapshotted)][YES(ignored as shard is not being recovered from a snapshot)][NO(node does not match index setting [index.routing.allocation.require] filters [box_type:\"warm\",_id:\"mnioyEK8R2iz5rQErRb9Jw\"])][YES(this node does not hold a copy of this shard)][YES(enough disk for shard on node, free: [15.1tb], shard size: [4.9gb], free after allocating shard: [15.1tb])][YES(below shard recovery limit of outgoing: [0 < 2] incoming: [0 < 2])][YES(total shard limits are disabled: [index: -1, cluster: -1] <= 0)][YES(allocation awareness is not enabled, set cluster setting [cluster.routing.allocation.awareness.attributes] to enable it)]"
      }
    ],
    "type" : "illegal_argument_exception",
    "reason" : "[allocate_replica] allocation of [logstash-operations-2020.07.14][4] on node {elastic03-warm}{OEsq71lQTXOGnRVaSTnsFQ}{wM1bkwqoQB-5ozwJ8aEisg}{10.0.24.12}{10.0.24.12:9300}{dilrt}{ml.machine_memory=67189415936, ml.max_open_jobs=20, xpack.installed=true, box_type=warm, transform.node=true} is not allowed, reason: [YES(shard has no previous failures)][YES(primary shard for this replica is already active)][YES(explicitly ignoring any disabling of allocation due to manual allocation commands via the reroute API)][YES(can allocate replica shard to a node with version [7.8.0] since this is equal-or-newer than the primary version [7.8.0])][YES(the shard is not being snapshotted)][YES(ignored as shard is not being recovered from a snapshot)][NO(node does not match index setting [index.routing.allocation.require] filters [box_type:\"warm\",_id:\"mnioyEK8R2iz5rQErRb9Jw\"])][YES(this node does not hold a copy of this shard)][YES(enough disk for shard on node, free: [15.1tb], shard size: [4.9gb], free after allocating shard: [15.1tb])][YES(below shard recovery limit of outgoing: [0 < 2] incoming: [0 < 2])][YES(total shard limits are disabled: [index: -1, cluster: -1] <= 0)][YES(allocation awareness is not enabled, set cluster setting [cluster.routing.allocation.awareness.attributes] to enable it)]"
  },
  "status" : 400
}

So the cluster apparently thinks that elastic03-warm does not match the filter criteria, which does not look correct to me: as the node attributes above show, elastic03-warm does have box_type=warm.
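For completeness, the allocation explain API should report the same deciders for the unassigned copy itself; shard 4 of the example index is used here:

GET _cluster/allocation/explain
{
  "index" : "logstash-operations-2020.07.14",
  "shard" : 4,
  "primary" : false
}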

If they are replicas, just drop them from the index and then re-add them.
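Something along these lines should do it (the replica count of 2 is taken from the index settings above):

PUT logstash-operations-2020.07.14/_settings
{
  "index" : {
    "number_of_replicas" : 0
  }
}

PUT logstash-operations-2020.07.14/_settings
{
  "index" : {
    "number_of_replicas" : 2
  }
}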

The shard has a condition requiring a specific node id. Have you been shrinking this index?
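Background: the ILM shrink action temporarily pins all copies of the source index to a single node by setting index.routing.allocation.require._id; if the shrink never completes, that setting is left behind and replicas cannot be allocated anywhere else. Assuming you simply want to release that pin, clearing the setting should look roughly like this:

PUT logstash-operations-2020.07.14/_settings
{
  "index.routing.allocation.require._id" : null
}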

Hm, it's not my intention to pin replica shards to a specific node id. But yes, you are right: the "move_to_warm_after_14d" ILM policy is configured to shrink indices to two primary shards:

  "move_to_warm_after_14d" : {
    "version" : 2,
    "modified_date" : "2020-07-22T10:20:16.925Z",
    "policy" : {
      "phases" : {
        "hot" : {
          "min_age" : "0ms",
          "actions" : {
            "set_priority" : {
              "priority" : 100
            }
          }
        },
        "warm" : {
          "min_age" : "14d",
          "actions" : {
            "allocate" : {
              "include" : { },
              "exclude" : { },
              "require" : {
                "box_type" : "warm"
              }
            },
            "forcemerge" : {
              "max_num_segments" : 1
            },
            "set_priority" : {
              "priority" : 50
            },
            "shrink" : {
              "number_of_shards" : 2
            }
          }
        }
      }
    }
  },

Until today the shrink step has always failed because the target shard count was not a factor of the source index's primary shard count (source: 5 primaries, target: 2). I have now reconfigured index creation to use 4 primary shards instead of 5, which should let the shrink succeed from now on. But could the failed shrink be what is affecting the allocation of the replica shards?
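For anyone finding this later: with 4 primary shards a target of 2 is a valid factor (4 / 2 = 2), so the shrink step should go through for newly created indices. A manual shrink of such an index would look roughly like the following (index names here are only examples, assuming a 4-shard index that has already been made read-only; the two null settings clear the node pin and the write block that would otherwise be copied over from the source):

POST logstash-operations-2020.08.01/_shrink/shrink-logstash-operations-2020.08.01
{
  "settings" : {
    "index.number_of_shards" : 2,
    "index.routing.allocation.require._id" : null,
    "index.blocks.write" : null
  }
}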
