Unassigned shards on ES 2.4.4

I'm using ES 2.4.4.

Currently, I'm getting this response from _cluster/health?level=nodes:

{
   "cluster_name":"logging-es",
   "status":"red",
   "timed_out":false,
   "number_of_nodes":1,
   "number_of_data_nodes":1,
   "active_primary_shards":62,
   "active_shards":62,
   "relocating_shards":0,
   "initializing_shards":0,
   "unassigned_shards":2,
   "delayed_unassigned_shards":0,
   "number_of_pending_tasks":0,
   "number_of_in_flight_fetch":0,
   "task_max_waiting_in_queue_millis":0,
   "active_shards_percent_as_number":96.875
}

As you can see, the cluster has two unassigned shards: "unassigned_shards":2.

Related indices are:

$ _cat/indices -s | grep red
red   open project.istio-project-two.242ef609-8f... 1 0                           
red   open project.3scale-amp-2.08baf3f...      1 0

These are the unassigned shards:

$ _cat/shards | grep UNASSIGNED
project.istio-project-two.242ef609-8f... 0 p UNASSIGNED                                                            
project.3scale-amp-2.08baf3f5-c3...      0 p UNASSIGNED

I don't know why ES is running into this issue.

Every day I delete these indices, and they get created again. However, I'm not comfortable having to apply this workaround every morning.
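For reference, the daily workaround amounts to something like this (a sketch only; the host and index patterns are placeholders, since the real index names are truncated above):

```shell
# Delete the red indices so they are recreated on the next write.
# Host and index patterns are placeholders for illustration.
curl -XDELETE 'http://localhost:9200/project.istio-project-two.*,project.3scale-amp-2.*'
```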

The number of nodes:

$ _cat/nodes  
10.128.2.21 10.128.2.21 67 99 0.41 d * logging-es-data-master-yl7ddqw7

Index settings:

$ GET /project.istio-project-two.242ef609-8f/_settings?pretty
{
  "project.istio-project-two.242ef609-8f..." : {
    "settings" : {
      "index" : {
        "creation_date" : "1532649602432",
        "refresh_interval" : "5s",
        "number_of_shards" : "1",
        "number_of_replicas" : "0",
        "uuid" : "XY-tFCkhQVOLlXmndoKpSQ",
        "version" : {
          "created" : "2040499"
        }
      }
    }
  }
}

Any ideas?

This could be caused by dangling indices. Check your logs on the nodes to see if there are any warnings about dangling indices.
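A quick way to check, assuming logs are in the default location (the path is an assumption; adjust it for your install):

```shell
# Look for dangling-index warnings in the Elasticsearch node logs.
grep -i "dangl" /var/log/elasticsearch/*.log
```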

I've taken a look, but I haven't been able to find anything related to "dangling".

Can you provide the routing table for that particular index with the unassigned shards?

GET /_cluster/state/routing_table/your_index_name_that_has_unassigned_shards

Note that in ES 5.x and above we have the allocation explain API that provides a lot of details on why a shard is unassigned. So maybe that's also a good motivation to upgrade your cluster? :slight_smile:
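A minimal invocation, assuming a local cluster on the default port (the API exists from ES 5.0 onward; called with no body, it explains the first unassigned shard it finds):

```shell
# Ask the cluster why a shard is unassigned (ES 5.x+ only).
curl -XGET 'http://localhost:9200/_cluster/allocation/explain?pretty'
```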

Here it is. It seems to provide more relevant information:

$ curl --key /etc/elasticsearch/secret/admin-key --cert /etc/elasticsearch/secret/admin-cert --cacert ... /_cluster/state/routing_table/project.istio-project.0427311d-8f11-11e8-9139-0050569fe304.2018.07.30?pretty -s
{
  "cluster_name" : "logging-es",
  "routing_table" : {
    "indices" : {
      "project.istio-project.0427311d-8f11-11e8-9139-0050569fe304.2018.07.30" : {
        "shards" : {
          "0" : [ {
            "state" : "UNASSIGNED",
            "primary" : true,
            "node" : null,
            "relocating_node" : null,
            "shard" : 0,
            "index" : "project.istio-project.0427311d-8f11-11e8-9139-0050569fe304.2018.07.30",
            "version" : 3,
            "unassigned_info" : {
              "reason" : "ALLOCATION_FAILED",
              "at" : "2018-07-30T02:02:29.633Z",
              "details" : "engine failure, reason [refresh failed], failure CorruptIndexException[codec footer mismatch (file truncated?): actual footer=135071757 vs expected footer=-1071082520 (resource=NIOFSIndexInput(path=\"/elasticsearch/persistent/logging-es/data/logging-es/nodes/0/indices/project.istio-project.0427311d-8f11-11e8-9139-0050569fe304.2018.07.30/0/index/_12v.cfs\") [slice=_12v_Lucene50_0.pos])]"
            }
          } ]
        }
      }
    }
  }
}

This looks like disk corruption. If you have backups, now's a good time to restore them.
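If snapshots are available, the restore would look roughly like this (the repository and snapshot names are placeholders; the index name is taken from the routing table above):

```shell
# Delete (or close) the corrupted index first, then restore it from a snapshot.
# "my_backup" and "snapshot_1" are placeholder names.
curl -XPOST 'http://localhost:9200/_snapshot/my_backup/snapshot_1/_restore' -d '{
  "indices": "project.istio-project.0427311d-8f11-11e8-9139-0050569fe304.2018.07.30"
}'
```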

Could you suggest any way to dig into this issue?

I think it's an issue related to GlusterFS. We are mounting the ES data path on a GlusterFS volume.

I would like to dig a bit more into this issue.

Could you suggest any strategy to solve that?

Any ideas are welcome.

I would recommend avoiding GlusterFS as it is not supported and there are known issues. See this thread for further details.