failed to restore snapshot - IndexShardRestoreFailedException file not found


(Arun Prakash) #1

Hi,
I am running a 3-node cluster on Elasticsearch 1.3.1 with 17 indices, each holding between 2 lakh (200,000) and 14 lakh (1.4 million) documents. I would like to try the snapshot and restore process on this cluster, and I used the following REST calls:

To create a repository:
curl -XPUT 'http://host.name:9200/_snapshot/es_snapshot_repo' -d '{
"type": "fs",
"settings": {
"location": "/data/es_snapshot_bkup_repo/es_snapshot_repo"
}
}'

Verified the repository:
curl -XGET 'http://host.name:9200/_snapshot/es_snapshot_repo?pretty'

The response is:
{
"es_snapshot_repo" : {
"type" : "fs",
"settings" : {
"location" : "/data/es_snapshot_bkup_repo/es_snapshot_repo"
}
}
}

Took the snapshot using:
curl -XPUT "http://host.name:9200/_snapshot/es_snapshot_repo/snap_001" -d '{"indices": "index_01","ignore_unavailable": "true","include_global_state": false,"wait_for_completion": true}'

The response is {"accepted":true}
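A note on the request above: in the Elasticsearch snapshot API, wait_for_completion is documented as a URL query parameter rather than a body field, and the {"accepted":true} response suggests the call returned before the snapshot had finished. A minimal sketch of the same request with the parameter moved to the query string follows; the helper names snapshot_url and create_snapshot are hypothetical, and the host, repository, and snapshot names are the ones from this post.

```shell
#!/bin/sh
# Hypothetical helper: build the snapshot URL with wait_for_completion
# passed as a query parameter instead of a body field.
# $1 = base URL, $2 = repository name, $3 = snapshot name
snapshot_url() {
  printf '%s/_snapshot/%s/%s?wait_for_completion=true' "$1" "$2" "$3"
}

# Hypothetical helper: create the snapshot and block until it completes.
# The body is the one from the post, minus wait_for_completion.
create_snapshot() {
  curl -XPUT "$(snapshot_url "$1" "$2" "$3")" -d '{
    "indices": "index_01",
    "ignore_unavailable": true,
    "include_global_state": false
  }'
}

# Usage (not run here):
# create_snapshot 'http://host.name:9200' es_snapshot_repo snap_001
```

With the parameter in the query string, the call should return the snapshot details once the snapshot is done, rather than an immediate acknowledgement.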

Then I tried to restore the snapshot with this request:
curl -XPOST "http://host.name:9200/_snapshot/es_snapshot_repo/snap_001/_restore" -d '{
"indices": "index_01",
"ignore_unavailable": "true",
"include_global_state": false,
"rename_pattern": "index_01",
"rename_replacement": "index_01_bk",
"include_aliases": false
}'

ISSUE:
As mentioned, I have 3 nodes. The index I am trying to snapshot and restore has 6 shards and 2 replicas.

Most of the shards restored properly, but 1 or 2 shards do not restore: they stay in the INITIALIZING state. I left the cluster alone for more than an hour, but those shards never moved to the correct node. I found the following exception in my node's log:

[2014-08-27 07:10:35,492][DEBUG][cluster.service ] [node_01] processing [shard-failed ([snap_001][4], node[r4UoA7vJREmQfh6lz634NA], [P], restoring[es_snapshot_repo:snap_001], s[INITIALIZING]), reason [Failed to start shard, message [IndexShardGatewayRecoveryException[[snap_001][4] failed recovery]; nested: IndexShardRestoreFailedException[[snap_001][4] restore failed]; nested: IndexShardRestoreFailedException[[snap_001][4] failed to restore snapshot [snap_001]]; nested: IndexShardRestoreFailedException[[snap_001][4] failed to read shard snapshot file]; nested: FileNotFoundException[/data/es_snapshot_bkup_repo/es_snapshot_repo/indices/index_01/4/snapshot-snap_001 (No such file or directory)]; ]]]: done applying updated cluster_state (version: 56391)
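To see which shards are still stuck, the _cat/shards output can be filtered by its state column. This is only a sketch: the helper name shards_in_state is hypothetical, and it assumes the default column order of _cat/shards (index, shard, prirep, state, ...).

```shell
#!/bin/sh
# Hypothetical helper: read _cat/shards output on stdin and keep only the
# rows whose 4th column (the shard state) equals the requested state.
# $1 = state to keep, e.g. INITIALIZING
shards_in_state() {
  awk -v state="$1" '$4 == state'
}

# Usage (not run here), with the restored index name from this post:
# curl -s 'http://host.name:9200/_cat/shards/index_01_bk' | shards_in_state INITIALIZING
```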

Could anyone help me overcome this issue? Please correct me if I made any mistake in this process.

FYI, I am sending the curl requests to the master node.

Thanks in advance.


(Arun Prakash) #2

The issue is fixed.

I made a mistake with the file system location: the repository needs to point to a shared file system, but I had given a location that was local to each node. I changed the location to a shared mount folder [accessible by all nodes], and the issue is fixed.

Thanks
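For anyone hitting the same FileNotFoundException: with an "fs" repository, every node reads and writes shard files at the configured location, so a path that is local to each node leaves the snapshot files scattered across machines. A minimal sketch of a manual check that the location is really shared is shown below. The helper names write_marker and check_marker and the marker file name are hypothetical; the repository path is the one from this thread.

```shell
#!/bin/sh
# Hypothetical helpers for checking that a repository path is shared.

# Run on ONE node: create an empty marker file in the repository location.
# $1 = repository location, $2 = marker file name
write_marker() {
  : > "$1/$2"
}

# Run on EVERY OTHER node: succeeds only if that node sees the same marker,
# which it will not if the path is a separate local directory per node.
# $1 = repository location, $2 = marker file name
check_marker() {
  [ -f "$1/$2" ]
}

# Usage (not run here):
# node 1:      write_marker /data/es_snapshot_bkup_repo/es_snapshot_repo shared_check
# nodes 2, 3:  check_marker /data/es_snapshot_bkup_repo/es_snapshot_repo shared_check && echo shared
```

If check_marker fails on any node, that node is not looking at the same directory, and a restore assigned to it would not find the snapshot files.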

