After restarting Elasticsearch, many shards are unassigned


#1

Elasticsearch 2.0.0 has been running for several months, creating one Logstash index per day. I wanted to update elasticsearch.yml, so I first restarted the node without making any changes, and that is when the problem occurred.

It assigned the first few hundred shards, then suddenly stopped at around 855, and nothing has changed since.

[ec2-user@ip-xx-xx-xx-xx ~]$ curl 'xx-xx-xx-xx:9200/_cat/health?v'
epoch      timestamp cluster       status node.total node.data shards pri relo init unassign pending_tasks max_task_wait_time active_shards_percent
1455684626 12:50:26  elasticsearch red             1         1    855 855    0    4     3693             6            315.2ms                 18.8%
[ec2-user@ip-xx-xx-xx-xx ~]$ curl 'xx-xx-xx-xx:9200/_cat/allocation?v'
shards disk.used disk.avail disk.total disk.percent host        ip          node
   859     7.1gb      2.6gb      9.8gb           72 xx-xx-xx-xx xx-xx-xx-xx node1
  3693                                                                      UNASSIGNED
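
The allocation output gives only the total (3693 unassigned, against 855 active and 4 initializing, i.e. 855 of 4552 shards). A minimal sketch of how that total could be broken down per index, assuming the `_cat/shards` endpoint with its `h` column selector is available on this node; the helper name `count_unassigned` is made up for illustration and is not part of Elasticsearch:

```shell
# count_unassigned (hypothetical helper): reads `_cat/shards` rows on stdin,
# expecting the columns index, shard, prirep, state in that order,
# and prints "<index> <number of UNASSIGNED shards>" sorted by index name.
count_unassigned() {
  awk '$4 == "UNASSIGNED" { n[$1]++ } END { for (i in n) print i, n[i] }' | sort
}

# Intended usage against the node from the post (left commented out so that
# defining the function does not require a reachable cluster):
#   curl -s 'xx-xx-xx-xx:9200/_cat/shards?h=index,shard,prirep,state' | count_unassigned
```

Indices that appear in this list with every shard unassigned would be the ones to check against the recovery errors in the log.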

I searched the logs and found errors like this one:
[2016-02-17 14:21:41,743][WARN ][cluster.action.shard ] [node1] [logstash_vcm_apilog-2016.02.16][4] received shard failed for [logstash_vcm_apilog-2016.02.16][4], node[NKZ9SbU9S8Cf5jXy5GVMgw], [P], v[20], s[INITIALIZING], a[id=a-1W4A7rTl-W35S4UsOohw], unassigned_info[[reason=ALLOCATION_FAILED], at[2016-02-17T06:21:38.627Z], details[failed recovery, failure IndexShardRecoveryException[failed to recovery from gateway]; nested: EngineCreationFailureException[failed to create engine]; nested: FileAlreadyExistsException[/var/lib/elasticsearch/elasticsearch/nodes/0/indices/logstash_vcm_apilog-2016.02.16/4/translog/translog-65.ckp]; ]], indexUUID [szBbNKdQQaej-_kpts4eMg], message [failed recovery], failure [IndexShardRecoveryException[failed to recovery from gateway]; nested: EngineCreationFailureException[failed to create engine]; nested: FileAlreadyExistsException[/var/lib/elasticsearch/elasticsearch/nodes/0/indices/logstash_vcm_apilog-2016.02.16/4/translog/translog-65.ckp]; ]

Any idea how to recover Elasticsearch?

