Unhealthy cluster


#1

Hi, I've recently inherited an ELK stack that seems to be in an unhealthy condition.
There is only one node, and roughly half of the shards are unassigned. I'd be thankful if someone could tell me how to recover them.

Here's some info I've managed to get, but please ask if you need more:

# curl -XGET 'localhost:9200/_cat/shards?v'
index           shard prirep state        docs  store ip        node
logs-2018.01.25     2 p      STARTED     86985 32.4mb 127.0.0.1 adC2nk3
logs-2018.01.25     2 r      UNASSIGNED
logs-2018.01.25     4 p      STARTED     88173   33mb 127.0.0.1 adC2nk3
logs-2018.01.25     4 r      UNASSIGNED
logs-2018.01.25     3 p      STARTED     87965 32.8mb 127.0.0.1 adC2nk3
logs-2018.01.25     3 r      UNASSIGNED
logs-2018.01.25     1 p      STARTED     87669 32.8mb 127.0.0.1 adC2nk3
logs-2018.01.25     1 r      UNASSIGNED
logs-2018.01.25     0 p      STARTED     87080 32.7mb 127.0.0.1 adC2nk3
logs-2018.01.25     0 r      UNASSIGNED
logs-2017.12.23     2 p      STARTED     11941  5.3mb 127.0.0.1 adC2nk3
logs-2017.12.23     2 r      UNASSIGNED
logs-2017.12.23     4 p      STARTED     12264  5.6mb 127.0.0.1 adC2nk3
logs-2017.12.23     4 r      UNASSIGNED
logs-2017.12.23     3 p      STARTED     11956  5.4mb 127.0.0.1 adC2nk3
logs-2017.12.23     3 r      UNASSIGNED
logs-2017.12.23     1 p      STARTED     12091  5.3mb 127.0.0.1 adC2nk3
logs-2017.12.23     1 r      UNASSIGNED
logs-2017.12.23     0 p      STARTED     12147  5.5mb 127.0.0.1 adC2nk3
logs-2017.12.23     0 r      UNASSIGNED
       *Truncated*

# curl -XGET 'localhost:9200/_cluster/allocation/explain?pretty'
{
  "index" : "logs-2018.01.25",
  "shard" : 2,
  "primary" : false,
  "current_state" : "unassigned",
  "unassigned_info" : {
    "reason" : "CLUSTER_RECOVERED",
    "at" : "2018-05-15T08:28:39.545Z",
    "last_allocation_status" : "no_attempt"
  },
  "can_allocate" : "no",
  "allocate_explanation" : "cannot allocate because allocation is not permitted to any of the nodes",
  "node_allocation_decisions" : [
    {
      "node_id" : "adC2nk3KSPmL9c7zqiCiwA",
      "node_name" : "adC2nk3",
      "transport_address" : "127.0.0.1:9300",
      "node_decision" : "no",
      "deciders" : [
        {
          "decider" : "same_shard",
          "decision" : "NO",
          "explanation" : "the shard cannot be allocated to the same node on which a copy of the shard already exists [[logs-2018.01.25][2], node[abC2nk3KSPmL9c7zqiCiwA], [P], s[STARTED], a[id=dRx2RAW7RTaIaUxKym45Qg]]"
        }
      ]
    }
  ]
}


# curl -XGET 'localhost:9200/_cluster/health?pretty'
{
  "cluster_name" : "elasticsearch",
  "status" : "red",
  "timed_out" : false,
  "number_of_nodes" : 1,
  "number_of_data_nodes" : 1,
  "active_primary_shards" : 2142,
  "active_shards" : 2142,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 2143,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 49.98833138856476
}

# curl -XGET 'localhost:9200/_cat/health?v'
epoch      timestamp cluster       status node.total node.data shards  pri relo init unassign pending_tasks max_task_wait_time active_shards_percent
1526387475 14:31:15  elasticsearch red             1         1   2142 2142    0    0     2143             0                  -                 50.0%

# curl -s 'localhost:9200/_cat/allocation?v'
shards disk.indices disk.used disk.avail disk.total disk.percent host      ip        node
  2142       42.2gb    56.1gb     93.8gb      150gb           37 127.0.0.1 127.0.0.1 adC2nk3
  2143                                                                               UNASSIGNED

(Christian Dahlqvist) #2

The fact that replica shards are unassigned is not a problem in itself, as you only have a single node: Elasticsearch will never allocate a replica to the same node that holds the primary.
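One common way to make a single-node cluster stop reporting those replicas as unassigned (this is a sketch, not something from the original reply) is to set the replica count to zero on all existing indices via the update-settings API:

```shell
# Sketch, assuming Elasticsearch is listening on localhost:9200.
# Setting number_of_replicas to 0 for all indices removes the
# unassigned replica copies, since a lone node can never host them.
curl -XPUT 'localhost:9200/_all/_settings' \
  -H 'Content-Type: application/json' \
  -d '{ "index": { "number_of_replicas": 0 } }'
```

If you later add a second node, you can raise `number_of_replicas` back to 1 with the same API and the replicas will be allocated there.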

You do, however, have a lot of very small shards, which is inefficient. Read this blog post for some practical guidance on sharding.
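For the daily indices above, one way to avoid accumulating more small shards (a sketch; the template name `logs-single-shard` is made up for illustration, and the `index_patterns` field assumes Elasticsearch 6.x) is an index template that gives new `logs-*` indices a single primary shard instead of the default five:

```shell
# Sketch, assuming Elasticsearch 6.x on localhost:9200; the template
# name "logs-single-shard" is hypothetical. Future logs-* indices will
# be created with one primary shard and no replicas.
curl -XPUT 'localhost:9200/_template/logs-single-shard' \
  -H 'Content-Type: application/json' \
  -d '{
    "index_patterns": ["logs-*"],
    "settings": {
      "index.number_of_shards": 1,
      "index.number_of_replicas": 0
    }
  }'
```

This only affects indices created after the template is installed; existing indices keep their shard count unless reindexed or shrunk.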


(system) #4

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.