Shards are not allocating to available nodes

Hi, I'm running a 3-node Elasticsearch 5.0.0 cluster and restarted one of the nodes for some maintenance. Since that restart, a set of shards is not being allocated, even though some nodes hold valid copies of them. I have tried restarting each node in the cluster, but each time a new subset of shards ends up in the same state.

First, I check the general state of the cluster and shards:

$ curl -s 'http://localhost:9200/_cluster/health?pretty'
{
  "cluster_name" : "demo",
  "status" : "red",
  "timed_out" : false,
  "number_of_nodes" : 3,
  "number_of_data_nodes" : 3,
  "active_primary_shards" : 200,
  "active_shards" : 399,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 54,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 88.0794701986755
}

Then I get a list of unassigned shards:

$ curl -s 'http://localhost:9200/_cat/shards?pretty' | grep UNASSIGNED
...
54 lines
...
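
To see why each shard is unassigned, rather than only that it is, the cat output can be narrowed to a few columns. The following is a minimal sketch, assuming the `h=` column names `index`, `shard`, `prirep`, `state` and `unassigned.reason` are available in this version of the `_cat/shards` API. `unassigned_only` and `list_unassigned` are hypothetical helper names introduced here for illustration.

```shell
# Hypothetical helper: keep only rows whose 4th column (state) is UNASSIGNED.
# Expects whitespace-separated columns: index shard prirep state unassigned.reason
unassigned_only() {
  awk '$4 == "UNASSIGNED"'
}

# Hypothetical wrapper around the cat API; nothing is sent until it is called.
list_unassigned() {
  curl -s 'http://localhost:9200/_cat/shards?h=index,shard,prirep,state,unassigned.reason' \
    | unassigned_only
}
```

Matching on the state column instead of grepping the whole line avoids false hits when an index name happens to contain the word UNASSIGNED.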

Next I take one of those shards and check the explanation API:

$ curl 'https://localhost:9200/_cluster/allocation/explain?pretty' -d '{
  "index": "foo",
  "shard": 0,
  "primary": true
}'

{
  "shard": {
    "index": "foo",
    "index_uuid": "idewF7BEQ9yVF1H9nfC6qg",
    "id": 0,
    "primary": true
  },
  "assigned": false,
  "shard_state_fetch_pending": false,
  "unassigned_info": {
    "reason": "ALLOCATION_FAILED",
    "at": "2017-03-29T20:05:35.309Z",
    "failed_attempts": 1,
    "delayed": false,
    "details": "master marked shard as active, but shard has not been created, mark shard as failed",
    "allocation_status": "no_valid_shard_copy"
  },
  "allocation_delay_in_millis": 60000,
  "remaining_delay_in_millis": 0,
  "nodes": {
    "CXxx0SNOTGeCXCrrD9XWvg": {
      "node_name": "foo",
      "node_attributes": {
      },
      "store": {
        "shard_copy": "STALE"
      },
      "final_decision": "NO",
      "final_explanation": "the copy of the shard is stale, allocation ids do not match",
      "weight": 8.099999,
      "decisions": []
    },
    "-5m_OHBRTva2Br3mZSAR7A": {
      "node_name": "bar",
      "node_attributes": {
      },
      "store": {
        "shard_copy": "NONE"
      },
      "final_decision": "NO",
      "final_explanation": "there is no copy of the shard available",
      "weight": 8.65,
      "decisions": []
    },
    "2aIJFALXS320jeBCPU36Dw": {
      "node_name": "baz",
      "node_attributes": {
      },
      "store": {
        "shard_copy": "AVAILABLE"
      },
      "final_decision": "YES",
      "final_explanation": "the shard can be assigned and the node contains a valid copy of the shard data",
      "weight": 8.65,
      "decisions": []
    }
  }
}

Note that "unassigned_info" reports a failure, while node "2aIJFALXS320jeBCPU36Dw" clearly says that it is ready and holds an available copy.

I've tried requesting a retry for allocations with:

curl -s -X POST 'https://localhost:9200/_cluster/reroute?retry_failed=true&pretty'

...
a lot of state
...
"acknowledged": true

I've also tried issuing an allocate_stale_primary command targeting the node with the good data. The request returned a positive "acknowledged", but the shard stayed unassigned.
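
A reroute request carrying an `allocate_stale_primary` command generally has the shape sketched below. The request that was actually sent is not shown in this post, so this is an illustration only. The index `foo`, shard `0` and node name `baz` are taken from the explain output above, and `stale_primary_body` and `force_stale_primary` are hypothetical helper names. The command requires `accept_data_loss` to be set explicitly, because promoting a stale copy can discard newer writes.

```shell
# Hypothetical helper: print the JSON body for an allocate_stale_primary command.
# Arguments: index name, shard number, target node name.
stale_primary_body() {
  printf '{"commands":[{"allocate_stale_primary":{"index":"%s","shard":%s,"node":"%s","accept_data_loss":true}}]}\n' \
    "$1" "$2" "$3"
}

# Hypothetical wrapper: send the command to the cluster; nothing is sent until it is called.
force_stale_primary() {
  curl -s -X POST 'https://localhost:9200/_cluster/reroute?pretty' \
    -d "$(stale_primary_body "$1" "$2" "$3")"
}
```

Calling `force_stale_primary foo 0 baz` would target node `baz`, the one reporting an `AVAILABLE` copy of the shard.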

I've also made sure the settings allow for full flexibility in allocations:

$ curl 'https://localhost:9200/_cluster/settings?pretty'
{
  "persistent" : { },
  "transient" : {
    "cluster" : {
      "routing" : {
        "allocation" : {
          "allow_rebalance" : "always",
          "enable" : "all"
        }
      }
    }
  }
}

EDIT: I've also made sure that each node is well below the disk usage threshold for allocation. Each is sitting at 12% disk usage.
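
One way to double-check the disk figures is to compare the `disk.percent` column of `_cat/allocation` against the low watermark. This is a rough sketch, assuming the default low watermark of 85% has not been overridden. `over_limit` and `check_disk` are hypothetical helper names.

```shell
# Hypothetical helper: print nodes whose disk usage (2nd column, percent)
# meets or exceeds the limit given as the first argument.
# Expects input lines of the form: <node> <disk.percent>
over_limit() {
  awk -v limit="$1" '$2 + 0 >= limit + 0 { print $1 }'
}

# Hypothetical wrapper around the cat API; nothing is sent until it is called.
# The limit defaults to 85, the assumed low watermark.
check_disk() {
  curl -s 'http://localhost:9200/_cat/allocation?h=node,disk.percent' \
    | over_limit "${1:-85}"
}
```

With every node at 12%, `check_disk` would print nothing, which is consistent with disk usage not being the blocker here.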

Thanks for any advice
