Why won't my Shards Assign?!


(G T) #1

I have a cluster with two Graylog nodes and three Elasticsearch nodes. Each index has 4 primary shards and 1 replica.

I have recently been testing resilience by running 'service elasticsearch stop' to stop one of the ES nodes. When I then check for unassigned shards using: curl -s 'http://localhost:9200/_cat/shards?pretty' | grep UNASSIGNED it reports a third of the shards as unassigned.
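The same check can be scripted to show which indices are affected and whether the missing shards are primaries or replicas. This is only a minimal sketch: the helper names (fetch_cat_shards, parse_cat_shards, unassigned_by_index) are made up for illustration, and it assumes the cluster is reachable without authentication at http://localhost:9200, as in the curl command above.

```python
import urllib.request
from collections import Counter


def fetch_cat_shards(base_url="http://localhost:9200"):
    """Fetch index, shard number, primary/replica flag and state for every shard."""
    # h= restricts the _cat output to the four columns the parser expects.
    url = base_url + "/_cat/shards?h=index,shard,prirep,state"
    with urllib.request.urlopen(url) as resp:
        return resp.read().decode("utf-8")


def parse_cat_shards(text):
    """Turn the whitespace-separated _cat/shards text into tuples."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        index, shard, prirep, state = fields[:4]
        rows.append((index, int(shard), prirep, state))
    return rows


def unassigned_by_index(rows):
    """Count UNASSIGNED shards per (index, 'p' or 'r')."""
    counts = Counter()
    for index, _shard, prirep, state in rows:
        if state == "UNASSIGNED":
            counts[(index, prirep)] += 1
    return counts


# Usage against a live cluster (not run here):
#   counts = unassigned_by_index(parse_cat_shards(fetch_cat_shards()))
#   for (index, prirep), n in sorted(counts.items()):
#       print(index, prirep, n)
```

An unassigned "p" entry is the worrying case: it means the primary itself is missing, not just a replica.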

I chose one at random and ran the curl command:

curl -s -XGET -H 'Content-Type: application/json' 'http://localhost:9200/_cluster/allocation/explain?include_yes_decisions=true' -d '{
  "index": "graylog_0",
  "shard": 3,
  "primary": true
}'

I got the following response:

{
  "index" : "graylog_0",
  "shard" : 3,
  "primary" : true,
  "current_state" : "unassigned",
  "unassigned_info" : {
    "reason" : "NODE_LEFT",
    "at" : "2018-11-23T13:52:36.979Z",
    "details" : "node_left[x9ohTd5uRyyK9kHhzIlIzg]",
    "last_allocation_status" : "no_valid_shard_copy"
  },
  "can_allocate" : "no_valid_shard_copy",
  "allocate_explanation" : "cannot allocate because a previous copy of the primary shard existed but can no longer be found on the nodes in the cluster",
  "node_allocation_decisions" : [
    {
      "node_id" : "D3wazOlhQ3-C36kepICG7w",
      "node_name" : "es-1",
      "transport_address" : "10.19.0.6:9300",
      "node_decision" : "no",
      "store" : {
        "found" : false
      }
    },
    {
      "node_id" : "LUbM7G9RQL2L-lGxfO7SIQ",
      "node_name" : "es-2",
      "transport_address" : "10.19.0.8:9300",
      "node_decision" : "no",
      "store" : {
        "found" : false
      }
    }
  ]
}
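To read a response like this at a glance, the relevant fields can be pulled out with a few lines of Python. This is a rough sketch, not an official client: summarize_explain is a hypothetical helper, and it relies only on the field names visible in the output above.

```python
def summarize_explain(explain):
    """Condense an allocation-explain response (already parsed from JSON)."""
    decisions = explain.get("node_allocation_decisions", [])
    # A node can only take over the primary if it holds a copy on disk.
    nodes_with_copy = [
        d["node_name"]
        for d in decisions
        if d.get("store", {}).get("found", False)
    ]
    return {
        "shard": "{}[{}]".format(explain["index"], explain["shard"]),
        "primary": explain["primary"],
        "reason": explain.get("unassigned_info", {}).get("reason"),
        "can_allocate": explain.get("can_allocate"),
        "nodes_with_copy": nodes_with_copy,
    }


# Usage: pass the dict obtained from json.loads() on the response body.
#   summary = summarize_explain(json.loads(response_text))
#   if summary["primary"] and not summary["nodes_with_copy"]:
#       print("no remaining node holds a copy of", summary["shard"])
```

For the response above, nodes_with_copy comes back empty: neither es-1 nor es-2 has a copy of graylog_0 shard 3 on disk.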

Surely if a node leaves, either its primary shards are moved onto another node or a replica of each affected shard is promoted to primary?

Any ideas?

Cheers,

G


(G T) #2

I found the issue. Half of the indices were created with replicas and half without, so when a node goes offline it takes with it some shards that have no replica, and that data is no longer available anywhere else in the cluster.
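For anyone wanting to check their own cluster for the same situation, here is a small sketch that lists the indices configured with zero replicas. The helper names are hypothetical, and it assumes an unauthenticated cluster at http://localhost:9200; the pri and rep columns come from the _cat/indices API.

```python
import urllib.request


def fetch_cat_indices(base_url="http://localhost:9200"):
    """Fetch index name, primary count and replica count for every index."""
    url = base_url + "/_cat/indices?h=index,pri,rep"
    with urllib.request.urlopen(url) as resp:
        return resp.read().decode("utf-8")


def parse_cat_indices(text):
    """Map index name -> (primaries, replicas) from the _cat/indices text."""
    result = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        index, pri, rep = fields
        result[index] = (int(pri), int(rep))
    return result


def indices_without_replicas(replica_info):
    """Return the sorted names of indices whose replica count is 0."""
    return sorted(name for name, (_pri, rep) in replica_info.items() if rep == 0)


# Usage against a live cluster (not run here):
#   print(indices_without_replicas(parse_cat_indices(fetch_cat_indices())))
```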

Does anyone know how to reindex all indices?

Cheers,

G


(David Turner) #3

If this is to solve the problem of having indices with no replicas, you can simply set the number of replicas to 1 on those indices. There is no need to reindex anything. To set every index to have one replica:

PUT /_settings
{
  "number_of_replicas": 1
}

To apply it only to a specific index, here called indexname:

PUT /indexname/_settings
{
  "number_of_replicas": 1
}
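If you would rather script this than use the console, the same request can be sent from Python with only the standard library. This is a sketch under assumptions: build_replica_request and set_replicas are made-up helper names, the cluster is assumed to be at http://localhost:9200 without authentication, and the body uses the nested "index" form of the setting.

```python
import json
import urllib.request


def build_replica_request(index, replicas=1, base_url="http://localhost:9200"):
    """Build (but do not send) a PUT /<index>/_settings request."""
    body = json.dumps({"index": {"number_of_replicas": replicas}}).encode("utf-8")
    return urllib.request.Request(
        "{}/{}/_settings".format(base_url, index),
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def set_replicas(index, replicas=1, base_url="http://localhost:9200"):
    """Send the settings update and return the parsed JSON response."""
    req = build_replica_request(index, replicas, base_url)
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Usage against a live cluster (not run here):
#   print(set_replicas("indexname"))
```

Keeping the request construction separate from the network call makes it easy to inspect exactly what would be sent before touching the cluster.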