Cannot allocate because allocation is not permitted to any of the nodes that hold an in-sync shard copy

elastic78 · November 3, 2022, 6:18am

ES version: 6.x
The ES cluster is red and we are getting the error below.

sh-4.2# curl -k -XGET https://127.0.0.1:9200/_cluster/allocation/explain?pretty
{
  "index" : "index1",
  "shard" : 2,
  "primary" : true,
  "current_state" : "unassigned",
  "unassigned_info" : {
    "reason" : "ALLOCATION_FAILED",
    "at" : "2022-09-28T06:08:02.154Z",
    "failed_allocation_attempts" : 5,
    "details" : "failed to create shard, failure IOException[failed to obtain in-memory shard lock]; nested: ShardLockObtainFailedException[[index1][2]: obtaining shard lock timed out after 5000ms]; ",
    "last_allocation_status" : "no"
  },
  "can_allocate" : "no",
  "allocate_explanation" : "cannot allocate because allocation is not permitted to any of the nodes that hold an in-sync shard copy",
  "node_allocation_decisions" : [
    {
      "node_id" : "K_nIcsssdssRvQIagrsf2QLkfIQ",
      "node_name" : "K_nIcsR",
      "transport_address" : "localhost:9300",
      "node_decision" : "no",
      "store" : {
        "in_sync" : false,
        "allocation_id" : "PW4oAHGAT9KLvL24_GEjSQ"
      }
    },
    {
      "node_id" : "uOPt4GKBsfsfSsyuVLVu-IRZ-g",
      "node_name" : "uOPtsff4GK",
      "transport_address" : "localhost:9300",
      "node_decision" : "no",
      "store" : {
        "in_sync" : true,
        "allocation_id" : "sNdzxssfsfsTK4SV6PPz16z6gA4Q"
      },
      "deciders" : [
        {
          "decider" : "max_retry",
          "decision" : "NO",
          "explanation" : "shard has exceeded the maximum number of retries [5] on failed allocation attempts - manually call [/_cluster/reroute?retry_failed=true] to retry, [unassigned_info[[reason=ALLOCATION_FAILED], at[2022-09-28T06:08:02.154Z], failed_attempts[5], delayed=false, details[failed to create shard, failure IOException[failed to obtain in-memory shard lock]; nested: ShardLockObtainFailedException[[ise][2]: obtaining shard lock timed out after 5000ms]; ], allocation_status[deciders_no]]]"
        }
      ]
    }
  ]
}
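If this output is saved to a file, a quick text check can confirm whether the `max_retry` decider is what blocks the in-sync copy. The sketch below assumes the pretty-printed layout shown above, where the `decision` line directly follows the `decider` line. The function name `blocked_by_max_retry` is made up for illustration, and this is a plain text match, not a JSON parser.

```shell
#!/bin/sh
# Hypothetical helper: exits 0 when the explain output (file argument or stdin)
# contains a max_retry decider whose next line reports decision NO.
# Assumes pretty-printed output; stray carriage returns do not affect the match.
blocked_by_max_retry() {
  awk '
    /"decider"[ ]*:[ ]*"max_retry"/ { seen = 1; next }   # remember the decider line
    seen && /"decision"[ ]*:[ ]*"NO"/ { found = 1 }      # decision on the following line
    { seen = 0 }                                         # only the adjacent line counts
    END { exit(found ? 0 : 1) }
  ' "$@"
}
```

For example, after redirecting the explain call into `explain.json`, `blocked_by_max_retry explain.json && echo "retries exhausted"` would print the message for the output above.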

What could be causing this issue? We cannot reproduce it on all of our setups; only some of them are affected. I saw a couple of forum threads that suggested retrying the reroute and increasing the maximum number of retries. My question: once we call reroute with retry_failed and set max retries to 15, will that change persist, and will a reroute then happen automatically whenever there is a sync issue after 15 retries? I ask because those threads say the reroute has to be triggered manually every time. Please clarify this for me. Below is what I am planning to suggest.

curl -XPOST 'localhost:9200/_cluster/reroute?retry_failed'
curl --silent --request PUT --header 'Content-Type: application/json' '127.0.0.1:9200/ise/_settings?pretty=true' --data-ascii '{
  "index": {
    "allocation": {
      "max_retries": 15
    }
  }
}'
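The two planned requests can also be wrapped in a small script so they always run in the same order. This is only a sketch: `ES_URL`, `INDEX`, and all function names are placeholders invented for illustration, and `-k` is assumed only because the explain call earlier in this post used it. The `max_retry` decider message asks to "manually call" `/_cluster/reroute?retry_failed=true`. To my understanding, the `index.allocation.max_retries` value is stored with the index settings, while the reroute call is a one-off request that would have to be repeated if the new limit is exhausted again.

```shell
#!/bin/sh
# Sketch only: ES_URL and INDEX are placeholders, adjust them for the real cluster.
: "${ES_URL:=https://127.0.0.1:9200}"
: "${INDEX:=ise}"

# JSON body that raises the per-index allocation retry limit to the given number.
max_retries_body() {
  printf '{"index":{"allocation":{"max_retries":%d}}}' "$1"
}

# URL of the index settings endpoint.
settings_url() {
  printf '%s/%s/_settings?pretty=true' "$ES_URL" "$INDEX"
}

# URL of the one-off retry request named in the max_retry decider message.
reroute_url() {
  printf '%s/_cluster/reroute?retry_failed=true' "$ES_URL"
}

# Raise the limit first, then ask the cluster to retry failed allocations.
# --fail makes curl return non-zero on HTTP errors so the second call is skipped.
apply_retry_fix() {
  curl -k --fail --silent --show-error --request PUT \
    --header 'Content-Type: application/json' \
    "$(settings_url)" --data-ascii "$(max_retries_body "$1")" &&
  curl -k --fail --silent --show-error --request POST "$(reroute_url)"
}
```

Nothing is sent until `apply_retry_fix 15` is called, so the helpers can be checked by printing `settings_url` and `max_retries_body 15` first.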

Thanks

warkolm · November 3, 2022, 6:35am

What is the output from the _cluster/stats?pretty&human API?

Please upgrade, 6.X is very much past EOL and no longer supported.

elastic78 · November 3, 2022, 6:26pm

Below are the stats. Yes, we are planning to upgrade in the next release, but we have customers who are still on older versions, and we support those versions. Could you please check whether you can help with anything here?

curl -k -XGET 'https://localhost:9200/_cluster/stats?pretty'
{
  "_nodes" : {
    "total" : 2,
    "successful" : 2,
    "failed" : 0
  },
  "cluster_name" : "ise-elasticsearch",
  "timestamp" : 1667499489816,
  "status" : "green",
  "indices" : {
    "count" : 56,
    "shards" : {
      "total" : 560,
      "primaries" : 280,
      "replication" : 1.0,
      "index" : {
        "shards" : {
          "min" : 10,
          "max" : 10,
          "avg" : 10.0
        },
        "primaries" : {
          "min" : 5,
          "max" : 5,
          "avg" : 5.0
        },
        "replication" : {
          "min" : 1.0,
          "max" : 1.0,
          "avg" : 1.0
        }
      }
    },
    "docs" : {
      "count" : 15547,
      "deleted" : 2
    },
    "store" : {
      "size_in_bytes" : 14312098,
      "throttle_time_in_millis" : 0
    },
    "fielddata" : {
      "memory_size_in_bytes" : 1016360,
      "evictions" : 0
    },
    "query_cache" : {
      "memory_size_in_bytes" : 0,
      "total_count" : 0,
      "hit_count" : 0,
      "miss_count" : 0,
      "cache_size" : 0,
      "cache_count" : 0,
      "evictions" : 0
    },
    "completion" : {
      "size_in_bytes" : 0
    },
    "segments" : {
      "count" : 48,
      "memory_in_bytes" : 549044,
      "terms_memory_in_bytes" : 430552,
      "stored_fields_memory_in_bytes" : 20384,
      "term_vectors_memory_in_bytes" : 0,
      "norms_memory_in_bytes" : 88448,
      "points_memory_in_bytes" : 84,
      "doc_values_memory_in_bytes" : 9576,
      "index_writer_memory_in_bytes" : 0,
      "version_map_memory_in_bytes" : 0,
      "fixed_bit_set_memory_in_bytes" : 6000,
      "max_unsafe_auto_id_timestamp" : -1,
      "file_sizes" : { }
    }
  },
  "nodes" : {
    "count" : {
      "total" : 2,
      "data" : 2,
      "coordinating_only" : 0,
      "master" : 2,
      "ingest" : 2
    },
    "versions" : [
      "5.5.2"
    ],
    "os" : {
      "available_processors" : 2,
      "allocated_processors" : 2,
      "names" : [
        {
          "name" : "Linux",
          "count" : 2
        }
      ],
      "mem" : {
        "total_in_bytes" : 33275101184,
        "free_in_bytes" : 1456717824,
        "used_in_bytes" : 31818383360,
        "free_percent" : 4,
        "used_percent" : 96
      }
    },
    "process" : {
      "cpu" : {
        "percent" : 0
      },
      "open_file_descriptors" : {
        "min" : 726,
        "max" : 728,
        "avg" : 727
      }
    },
    "jvm" : {
      "max_uptime_in_millis" : 2594825923,
      "versions" : [
        {
          "version" : "1.8.0_292",
          "vm_name" : "OpenJDK 64-Bit Server VM",
          "vm_version" : "25.292-b10",
          "vm_vendor" : "Red Hat, Inc.",
          "count" : 2
        }
      ],
      "mem" : {
        "heap_used_in_bytes" : 1019559888,
        "heap_max_in_bytes" : 2130051072
      },
      "threads" : 61
    },
    "fs" : {
      "total_in_bytes" : 1204824801280,
      "free_in_bytes" : 1111064612864,
      "available_in_bytes" : 1049815617536,
      "spins" : "true"
    },
    "plugins" : [
      {
        "name" : "SSLPlugin",
        "version" : "1.0",
        "description" : "SSL Plugin desc",
        "classname" : "org.elasticsearch.plugin.ssl.SSLPlugin",
        "has_native_controller" : false
      }
    ],
    "network_types" : {
      "transport_types" : {
        "nodeTransportModule" : 2
      },
      "http_types" : {
        "httpServerModule" : 2
      }
    }
  }
}

warkolm · November 3, 2022, 10:03pm

This is a positively ancient version that has been EOL for years. You will struggle to get any support, sorry to say.

system · December 1, 2022, 10:04pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
