Unassigned replica

Hi, I have unassigned replicas in my cluster. I tried allocating them manually but I can't:

POST _cluster/reroute?pretty
{
  "commands": [
    {
      "allocate_replica": {
        "index": "apm-7.6.2-transaction-2020.05.20-000001",
        "shard": 1,
        "node": "s-monitoring-es-warm-00"
      }
    }
  ]
}    

Result:

{
  "error" : {
    "root_cause" : [
      {
        "type" : "illegal_argument_exception",
        "reason" : "[allocate_replica] allocation of [apm-7.6.2-transaction-2020.05.20-000021][1] on node {s-monitoring-es-warm-00}{hpjwIe2GRs-bqm3yAjCy1Q}{6LM4LhW2TtebgZUAOl7gDA}{10.194.112.6}{10.194.112.6:9300}{dil}{ml.machine_memory=12591763456, ml.max_open_jobs=20, xpack.installed=true, data=warm} is not allowed, reason: [YES(shard has no previous failures)][YES(primary shard for this replica is already active)][YES(explicitly ignoring any disabling of allocation due to manual allocation commands via the reroute API)][YES(can allocate replica shard to a node with version [7.6.2] since this is equal-or-newer than the primary version [7.6.2])][YES(the shard is not being snapshotted)][YES(ignored as shard is not being recovered from a snapshot)][NO(node does not match index setting [index.routing.allocation.require] filters [data:\"warm\",_id:\"0qZ83lt-RsufPTuk0eD_aA\"])][YES(the shard does not exist on the same node)][YES(enough disk for shard on node, free: [1.4tb], shard size: [0b], free after allocating shard: [1.4tb])][YES(below shard recovery limit of outgoing: [0 < 24] incoming: [0 < 24])][YES(total shard limits are disabled: [index: -1, cluster: -1] <= 0)][YES(allocation awareness is not enabled, set cluster setting [cluster.routing.allocation.awareness.attributes] to enable it)]"
      }
    ],
    "type" : "illegal_argument_exception",
    "reason" : "[allocate_replica] allocation of [apm-7.6.2-transaction-2020.05.20-000021][1] on node {s-monitoring-es-warm-00}{hpjwIe2GRs-bqm3yAjCy1Q}{6LM4LhW2TtebgZUAOl7gDA}{10.194.112.6}{10.194.112.6:9300}{dil}{ml.machine_memory=12591763456, ml.max_open_jobs=20, xpack.installed=true, data=warm} is not allowed, reason: [YES(shard has no previous failures)][YES(primary shard for this replica is already active)][YES(explicitly ignoring any disabling of allocation due to manual allocation commands via the reroute API)][YES(can allocate replica shard to a node with version [7.6.2] since this is equal-or-newer than the primary version [7.6.2])][YES(the shard is not being snapshotted)][YES(ignored as shard is not being recovered from a snapshot)][NO(node does not match index setting [index.routing.allocation.require] filters [data:\"warm\",_id:\"0qZ83lt-RsufPTuk0eD_aA\"])][YES(the shard does not exist on the same node)][YES(enough disk for shard on node, free: [1.4tb], shard size: [0b], free after allocating shard: [1.4tb])][YES(below shard recovery limit of outgoing: [0 < 24] incoming: [0 < 24])][YES(total shard limits are disabled: [index: -1, cluster: -1] <= 0)][YES(allocation awareness is not enabled, set cluster setting [cluster.routing.allocation.awareness.attributes] to enable it)]"
  },
  "status" : 400
}

I don't have anything strange in my cluster settings:

{
  "persistent" : {
    "xpack" : {
      "monitoring" : {
        "collection" : {
          "enabled" : "true"
        }
      }
    }
  },
  "transient" : { }
}

I also tried setting the number of replicas to 0 and then back to 1, but it's the same.
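For reference, that was roughly these two calls (same index as above):

PUT /apm-7.6.2-transaction-2020.05.20-000001/_settings
{
  "index.number_of_replicas": 0
}

PUT /apm-7.6.2-transaction-2020.05.20-000001/_settings
{
  "index.number_of_replicas": 1
}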

Can someone help me?

The clue is here:

You have allocation filters on this index which cannot be satisfied.
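If it helps, you can see which filters are set on the index directly in its settings, and the allocation explain API gives the same decider breakdown in a more readable form. Something like this (using the index name from your error message):

GET /apm-7.6.2-transaction-2020.05.20-000021/_settings?filter_path=*.settings.index.routing

GET _cluster/allocation/explain
{
  "index": "apm-7.6.2-transaction-2020.05.20-000021",
  "shard": 1,
  "primary": false
}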

Yes, thanks!

In the index settings, I see: "routing.allocation.require._id" : "0qZ83lt-RsufPTuk0eD_aA".
That node doesn't have free space on disk; I think the error is related to that.

But I can't understand why. Why does ES want to put all the replica shards on the same node?
Is this expected?

Recently I enabled the shrink option in my warm phase; could it be related to the shrink process?
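For context, a warm phase with the shrink action enabled looks roughly like this (just a sketch; the policy name, min_age and shard count are illustrative, not my exact values):

PUT _ilm/policy/my-warm-shrink-policy
{
  "policy": {
    "phases": {
      "warm": {
        "min_age": "7d",
        "actions": {
          "allocate": {
            "require": { "data": "warm" }
          },
          "shrink": {
            "number_of_shards": 1
          }
        }
      }
    }
  }
}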

Yes, that would explain it: the shrink process requires all the shards of an index to be allocated to a single node.
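Under the hood, preparing an index for a shrink amounts to something like the following (this is the manual-shrink form from the docs; the node name here is illustrative, and ILM picks the node itself and pins it by _id rather than _name, which is the require._id filter you found):

PUT /apm-7.6.2-transaction-2020.05.20-000021/_settings
{
  "settings": {
    "index.routing.allocation.require._name": "shrink_node_name",
    "index.blocks.write": true
  }
}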

Ok... So, this is what I think happened:

  1. ES moves the index (1 primary, 1 replica) to the warm phase, allocating its shards on different nodes
  2. A specific node is written into the index settings (routing.allocation.require._id)
  3. The master moves a copy of each shard onto that node
  4. The node that contains the other copies goes down
  5. ES tries to reallocate the unassigned shards, but it isn't possible because the node in routing.allocation.require._id already contains a copy of each shard
  6. The cluster turns yellow
  7. Because the cluster is not green, the shrink process never runs
  8. The index keeps the routing.allocation.require._id setting, so the cluster stays yellow forever unless you fix it manually (a sketch of that fix follows this list)
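The manual fix I mean in step 8 would be roughly to clear the filter so the replica can be allocated on any other warm node, accepting that the shrink for this index won't happen (I'm not sure whether ILM retries it afterwards):

PUT /apm-7.6.2-transaction-2020.05.20-000021/_settings
{
  "index.routing.allocation.require._id": null
}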

I think this could be a little difficult to manage in a big cluster (20+ nodes), but I don't know how to prevent this situation.

I don't think that's quite right: I think the node that failed is the one that was chosen as the location for the shrink operation. I think https://github.com/elastic/elasticsearch/pull/52077 addresses this by allowing Elasticsearch to retry if the target node goes away.

Anyway, with unassigned shards the cluster is yellow, so the shrink process isn't going to run, right?

From here: Shrink index API | Elasticsearch Guide [8.11] | Elastic

The cluster health status must be green.

If the shrink process doesn't finish, the index keeps the routing.allocation.require._id setting, so the cluster status stays yellow forever because the unassigned shards will remain unassigned forever.

Hmm, I'm not sure why that's in the docs; green health is certainly not required for shrinking.

Ok, I don't know then... but I can say something that I checked: while the cluster was yellow, with unassigned shards, a complete copy of my index was already on one warm node, and yet the shrink process didn't run.
