Hi there,
Before posting, I searched the forum but didn't find anything relevant to the issue I've encountered. If anyone knows of a similar post, please point me to it; otherwise, please read on.
Unfortunately, I couldn't find a solution to this issue, and frankly this is my last option.
This behavior has been observed on various indices stored in the clusters and has no apparent connection to the age or the size of the indices. I've seen failing indices from 2 days ago that are 350KB in size and indices older than 20 days that are 14GB in size.
If you need any details besides those provided below, please let me know.
Issue: Elasticsearch fails to snapshot certain indices.
Cause: Unknown
Details:
ES Clusters:
A) 1 Hot node, 1 Warm node
B) 3 Hot nodes, 5 Warm nodes
Both clusters are running 1.6.2
Snapshot storage:
NFS mount accessible from all the nodes. Path: /backups/*name-of-the-cluster
Available disk space on the backup server 5.2TB
Available inodes on the backup server 10485688991
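For anyone who wants to double-check the "accessible from all the nodes" part, here is a minimal sketch of a per-node check. It is not an Elasticsearch tool; check_repo_path is a hypothetical helper, and it assumes it is run on each node as the same OS user that runs Elasticsearch, against the repo path from the config (here /backups/staging201).

```python
import os
import tempfile


def check_repo_path(repo_path):
    """Report whether this host can see, list, and write inside repo_path.

    Hypothetical helper: run it on every node, as the user that runs
    Elasticsearch, and compare the results between nodes.
    """
    result = {
        "exists": os.path.isdir(repo_path),
        "listable": False,
        "writable": False,
    }
    if not result["exists"]:
        return result
    try:
        os.listdir(repo_path)
        result["listable"] = True
    except OSError:
        pass
    try:
        # Create and immediately remove a scratch file to prove write access.
        with tempfile.NamedTemporaryFile(dir=repo_path, prefix=".repo-check-"):
            pass
        result["writable"] = True
    except OSError:
        pass
    return result


if __name__ == "__main__":
    # Path taken from the config in this post; adjust per cluster.
    print(check_repo_path("/backups/staging201"))
```

If one node reports something different from the others (for example, it can list but not write), that would point at the mount or permissions on that node rather than at Elasticsearch itself.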
Elasticsearch config:
path:
  data: /usr/share/elasticsearch/data/staging201
  repo: /backups/staging201
Command executed:
curl -XPUT http://localhost:9200/_snapshot/staging201/solrsearch-2016.09.26?wait_for_completion=true -d '{"indices":"solrsearch-2016.09.26", "ignore_unavailable": "true", "include_global_state": false}'
Output on the console:
{"snapshot":{"snapshot":"solrsearch-2016.09.26","indices":["solrsearch-2016.09.26"],"state":"PARTIAL","start_time":"2016-10-18T11:03:50.867Z","start_time_in_millis":1476788630867,"end_time":"2016-10-18T11:04:00.477Z","end_time_in_millis":1476788640477,"duration_in_millis":9610,"failures":[{"node_id":"ytzwGkYJQqWqvpJ_fkv3kQ","index":"solrsearch-2016.09.26","reason":"IndexShardSnapshotFailedException[[solrsearch-2016.09.26][2] failed to list blobs]; nested: NoSuchFileException[/backups/staging201/indices/solrsearch-2016.09.26/2]; ","shard_id":2,"status":"INTERNAL_SERVER_ERROR"},{"node_id":"ytzwGkYJQqWqvpJ_fkv3kQ","index":"solrsearch-2016.09.26","reason":"IndexShardSnapshotFailedException[[solrsearch-2016.09.26][1] failed to list blobs]; nested: NoSuchFileException[/backups/staging201/indices/solrsearch-2016.09.26/1]; ","shard_id":1,"status":"INTERNAL_SERVER_ERROR"},{"node_id":"ytzwGkYJQqWqvpJ_fkv3kQ","index":"solrsearch-2016.09.26","reason":"IndexShardSnapshotFailedException[[solrsearch-2016.09.26][3] failed to list blobs]; nested: NoSuchFileException[/backups/staging201/indices/solrsearch-2016.09.26/3]; ","shard_id":3,"status":"INTERNAL_SERVER_ERROR"}],"shards":{"total":5,"failed":3,"successful":2}}}
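Since the response is hard to read on one line, here is a small sketch that pulls out the failing node, shard, and missing path from it. The function name summarize_failures and the regular expression are my own and only assume the "reason" strings look like the ones in the output above; the sample data is copied from that output with the timing fields left out.

```python
import json
import re

# Response from this post, with the timing fields omitted for brevity.
SAMPLE = """
{"snapshot":{"snapshot":"solrsearch-2016.09.26",
 "indices":["solrsearch-2016.09.26"],"state":"PARTIAL",
 "failures":[
  {"node_id":"ytzwGkYJQqWqvpJ_fkv3kQ","index":"solrsearch-2016.09.26",
   "reason":"IndexShardSnapshotFailedException[[solrsearch-2016.09.26][2] failed to list blobs]; nested: NoSuchFileException[/backups/staging201/indices/solrsearch-2016.09.26/2]; ",
   "shard_id":2,"status":"INTERNAL_SERVER_ERROR"},
  {"node_id":"ytzwGkYJQqWqvpJ_fkv3kQ","index":"solrsearch-2016.09.26",
   "reason":"IndexShardSnapshotFailedException[[solrsearch-2016.09.26][1] failed to list blobs]; nested: NoSuchFileException[/backups/staging201/indices/solrsearch-2016.09.26/1]; ",
   "shard_id":1,"status":"INTERNAL_SERVER_ERROR"},
  {"node_id":"ytzwGkYJQqWqvpJ_fkv3kQ","index":"solrsearch-2016.09.26",
   "reason":"IndexShardSnapshotFailedException[[solrsearch-2016.09.26][3] failed to list blobs]; nested: NoSuchFileException[/backups/staging201/indices/solrsearch-2016.09.26/3]; ",
   "shard_id":3,"status":"INTERNAL_SERVER_ERROR"}],
 "shards":{"total":5,"failed":3,"successful":2}}}
"""

# Assumed shape of the nested error, based only on the output above.
MISSING_PATH = re.compile(r"NoSuchFileException\[([^\]]+)\]")


def summarize_failures(response_text):
    """Return a (node_id, shard_id, missing_path) tuple per failed shard."""
    snapshot = json.loads(response_text)["snapshot"]
    rows = []
    for failure in snapshot.get("failures", []):
        match = MISSING_PATH.search(failure.get("reason", ""))
        path = match.group(1) if match else None
        rows.append((failure["node_id"], failure["shard_id"], path))
    return rows


if __name__ == "__main__":
    for node_id, shard_id, path in summarize_failures(SAMPLE):
        print(node_id, shard_id, path)
```

Run against the output above, this shows that all three failed shards (1, 2 and 3) were reported by the same node_id, and each one complains about a shard directory missing under /backups/staging201/indices/solrsearch-2016.09.26.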
Messages in log file:
SEE NEXT MESSAGE