Unassigned Shards with allocation_status deciders_no

I'm restoring my data from S3 to a different cluster and some of my indexes are stuck in the UNASSIGNED state with an allocation_status of deciders_no.

1. Verify Snapshot

The snapshot appears to be valid, and some of the indexes do restore correctly.

GET /_snapshot/s3_repository/for_production_4-21e/

{
  "snapshots": [
    {
      "snapshot": "4-21e",
      "uuid": "4s2-PIUaRHKtFUeOKvrZhw",
      "version_id": 5020299,
      "version": "5.2.2",
      "indices": [
        "sightings-2016-02-01",
        "places"
      ],
      "state": "SUCCESS",
      "start_time": "2017-04-21T15:57:54.189Z",
      "start_time_in_millis": 1492790274189,
      "end_time": "2017-04-21T16:01:17.127Z",
      "end_time_in_millis": 1492790477127,
      "duration_in_millis": 202938,
      "failures": [],
      "shards": {
        "total": 395,
        "failed": 0,
        "successful": 395
      }
    }
  ]
}

2. Restoring from S3 to a different cluster

POST /_snapshot/s3_repository/for_production_4-21e/_restore?wait_for_completion=false
{
  "indices": "sightings-2016-02-01,places",
  "ignore_unavailable": true,
  "include_global_state": true,
  "index_settings": {
    "number_of_replicas": 0
  }
}

But the primary shard for this index is stuck in the UNASSIGNED state. The shard routing looks like this:

{
  "state": "UNASSIGNED",
  "primary": true,
  "node": null,
  "relocating_node": null,
  "shard": 0,
  "index": "sightings-2016-02-01",
  "recovery_source": {
    "type": "SNAPSHOT",
    "repository": "s3_repository",
    "snapshot": "for_production_4-21e",
    "version": "5.2.2",
    "index": "sightings-2016-02-01"
  },
  "unassigned_info": {
    "reason": "NEW_INDEX_RESTORED",
    "at": "2017-04-21T16:51:26.260Z",
    "delayed": false,
    "details": "restore_source[s3_repository/for_production_4-21e]",
    "allocation_status": "deciders_no"
  }
}
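
The stuck shard can also be spotted with _cat/shards, something like:

GET /_cat/shards/sightings-2016-02-01?v&h=index,shard,prirep,state,unassigned.reason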

GET /_cluster/allocation/explain

The allocation explain output shows that the index can only be allocated to a node with an _id of nSGRGqb-RUmgQkW-jbZ7QA or 82GQkoQ5Q5yftFg_g4Qpvg. But how do we fix this?

"allocate_explanation": "cannot allocate because allocation is not permitted to any of the nodes",
  "node_allocation_decisions": [
    {
      "node_id": "voWCoe9BQuiYxNu-jt1c2A",
      "node_name": "voWCoe9",
      "transport_address": "10.1.189.13:9300",
      "node_decision": "no",
      "weight_ranking": 1,
      "deciders": [
        {
          "decider": "filter",
          "decision": "NO",
          "explanation": """initial allocation of the index is only allowed on nodes [_id:"nSGRGqb-RUmgQkW-jbZ7QA OR 82GQkoQ5Q5yftFg_g4Qpvg"]"""

Does the restored index have any allocation filtering in its settings? If it does, you can set the filtering to null or an empty string.
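
For example, if the filter happens to be on _id (just an illustration; use whichever routing.allocation key actually appears in the index settings), clearing it would look something like:

PUT sightings-2016-02-01/_settings
{
  "index.routing.allocation.include._id": null
}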

Hi @nik9000,

I tried _name and _ip. I'm thinking these indexes are not restoring because they were created with the shrink API.

PUT sightings-2016-02-01/_settings
{
  "index.routing.allocation.include._ip": "10.*"
}

GET sightings-2016-02-01/_settings

{
  "sightings-2016-02-01": {
    "settings": {
      "index": {
        "routing": {
          "allocation": {
            "include": {
              "_ip": "10.*"
            },
            "initial_recovery": {
              "_id": "nSGRGqb-RUmgQkW-jbZ7QA,82GQkoQ5Q5yftFg_g4Qpvg"
            }
          }
        },
        "allocation": {
          "max_retries": "1"
        },
        "number_of_shards": "1",
        "shrink": {
          "source": {
            "name": "bulk-sightings-2016-02-01",
            "uuid": "8arcuERNTlKYE9fEZ9ZNkQ"
          }
        },
        "provided_name": "sightings-2016-02-01",
        "creation_date": "1490916922072",
        "number_of_replicas": "0",
        "uuid": "hmnvNW-eQum3wihXbtRDhw",
        "version": {
          "created": "5020299",
          "upgraded": "5020299"
        }
      }
    }
  }
}

I think the problem is caused by allocation.initial_recovery._id, because those node IDs come from the previous cluster:

"sightings-2016-02-01": {
    "settings": {
      "index": {
        "routing": {
          "allocation": {
            "initial_recovery": {
              "_id": "nSGRGqb-RUmgQkW-jbZ7QA,82GQkoQ5Q5yftFg_g4Qpvg"

I've run the allocation explain for that shard:

GET /_cluster/allocation/explain
{
  "index": "sightings-2016-02-01",
  "shard": 0,
  "primary": true
}

Response

{
  "index": "sightings-2016-02-01",
  "shard": 0,
  "primary": true,
  "current_state": "unassigned",
  "unassigned_info": {
    "reason": "NEW_INDEX_RESTORED",
    "at": "2017-04-21T17:48:23.476Z",
    "details": "restore_source[s3_repository/for_production_4-21e]",
    "last_allocation_status": "no"
  },
  "can_allocate": "no",
  "allocate_explanation": "cannot allocate because allocation is not permitted to any of the nodes",
  "node_allocation_decisions": [
    {
      "node_id": "voWCoe9BQuiYxNu-jt1c2A",
      "node_name": "voWCoe9",
      "transport_address": "10.1.189.13:9300",
      "node_decision": "no",
      "weight_ranking": 1,
      "deciders": [
        {
          "decider": "filter",
          "decision": "NO",
          "explanation": """initial allocation of the index is only allowed on nodes [_id:"nSGRGqb-RUmgQkW-jbZ7QA OR 82GQkoQ5Q5yftFg_g4Qpvg"]"""
        }
      ]
    },
    {
      "node_id": "ZNbmCrO6RoaVFQtsdm-R1w",
      "node_name": "ZNbmCrO",
      "transport_address": "10.1.190.226:9300",
      "node_decision": "no",
      "weight_ranking": 2,
      "deciders": [
        {
          "decider": "filter",
          "decision": "NO",
          "explanation": """initial allocation of the index is only allowed on nodes [_id:"nSGRGqb-RUmgQkW-jbZ7QA OR 82GQkoQ5Q5yftFg_g4Qpvg"]"""
        }
      ]
    },
    {
      "node_id": "ngwroDfyR2urGlJ4UEZvEw",
      "node_name": "ngwroDf",
      "transport_address": "10.1.190.21:9300",
      "node_decision": "no",
      "weight_ranking": 3,
      "deciders": [
        {
          "decider": "filter",
          "decision": "NO",
          "explanation": """initial allocation of the index is only allowed on nodes [_id:"nSGRGqb-RUmgQkW-jbZ7QA OR 82GQkoQ5Q5yftFg_g4Qpvg"]"""
        }
      ]
    },
    {
      "node_id": "u5tXF4qQQb2YZxp0hJsFFg",
      "node_name": "u5tXF4q",
      "transport_address": "10.1.189.248:9300",
      "node_decision": "no",
      "weight_ranking": 4,
      "deciders": [
        {
          "decider": "filter",
          "decision": "NO",
          "explanation": """initial allocation of the index is only allowed on nodes [_id:"nSGRGqb-RUmgQkW-jbZ7QA OR 82GQkoQ5Q5yftFg_g4Qpvg"]"""
        }
      ]
    }
  ]
}

Can you try setting that initial_recovery option to null? That should clear it.

No, it returns an error. I've tried setting the parent objects to null as well.

PUT sightings-2016-02-01/_settings
{
  "index.routing.allocation.initial_recovery._id": null,
  "index.allocation.max_retries": 5
}

{
  "error": {
    "root_cause": [
      {
        "type": "remote_transport_exception",
        "reason": "[v_Rt1wM][10.1.189.53:9300][indices:admin/settings/update]"
      }
    ],
    "type": "illegal_argument_exception",
    "reason": "unknown setting [index.routing.allocation.initial_recovery._id] please check that any required plugins are installed, or check the breaking changes documentation for removed settings"
  },
  "status": 400
}

Let me see if I can reproduce this locally. I'll play around and get back to you.

Thanks! I found this related PR: [initial_recovery limits replica allocation](https://github.com/elastic/elasticsearch/pull/20589)

I guess I'm going to dig through the Elasticsearch source code.

OK - I just reproduced this locally against master. I'm fairly sure you can't restore shrunken indices now.
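
Roughly the reproduction (a sketch with made-up index names and a throwaway fs repository, not the exact commands): create a multi-shard index, pin it to one node and block writes, shrink it, snapshot it, then restore the snapshot into a cluster whose node IDs differ from the ones recorded at shrink time.

PUT /source_index
{
  "settings": {
    "index.number_of_shards": 4,
    "index.number_of_replicas": 0
  }
}

PUT /source_index/_settings
{
  "index.routing.allocation.require._name": "node-1",
  "index.blocks.write": true
}

POST /source_index/_shrink/shrunk_index
{
  "settings": {
    "index.number_of_shards": 1,
    "index.number_of_replicas": 0,
    "index.routing.allocation.require._name": null
  }
}

PUT /_snapshot/my_fs_repo
{
  "type": "fs",
  "settings": {
    "location": "/tmp/es_backup"
  }
}

PUT /_snapshot/my_fs_repo/snap_1?wait_for_completion=true
{
  "indices": "shrunk_index"
}

Then, on a second cluster (different node IDs) with the same repository registered:

POST /_snapshot/my_fs_repo/snap_1/_restore

The restored primary stays UNASSIGNED with the same filter decider NO, because the shrunken index carries index.routing.allocation.initial_recovery._id pointing at node IDs that only exist on the original cluster.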

What are my options for getting this data from our QA cluster to the production cluster? Reindexing isn't an option since we're in the 2TB range.

I'm not sure! Still investigating. There may not be any good options.

Are you an Elasticsearch employee? I can open an issue on GitHub if you haven't already.

I'm doing it now, yeah. Just getting easy reproduction steps.
