Elasticsearch snapshot partially fails due to "index shard snapshot failed exception"

Futerkowiec · November 6, 2019, 12:21pm

Kibana version: 7.4.2

Elasticsearch version: 7.4.2

Original install method (e.g. download page, yum, deb, from source, etc.) and version: all components were downloaded from the official Elastic page.

Fresh install or upgraded from other version? Upgraded from 7.3.2, and before that from 7.0.0.

Is there anything special in your setup? 1x master node, 2x data nodes, 1x coordinating node. The coordinating Elasticsearch instance is installed on the same server as Kibana; the rest are on separate machines.

Description of the problem including expected versus actual behavior. Please include screenshots (if relevant): My goal is to create a backup of our system indices, which contain all of the visualizations, dashboards and other objects we created. We decided on snapshots as the solution, especially since snapshot lifecycle management was introduced in 7.4. I created the repository with this:

PUT /_snapshot/elk_backup
{
  "type": "fs",
  "settings": {
    "location": "nightly_snapshot"
  }
}

Then I created a policy to take daily snapshots of the indices:

PUT /_slm/policy/nightly-snapshots
{
  "schedule": "0 0 20 ? * MON-FRI", 
  "name": "<nightly-snap-{now/d}>", 
  "repository": "elk_backup", 
  "config": { 
    "indices": [".kibana*"] 
  }
}

The snapshot was created overnight, but I got a partial failure on one of the five indices. The indices I want to snapshot are:
.kibana_1
.kibana_2
.kibana_3
.kibana_task_manager_1
.kibana_task_manager_2
The error I got was:

INTERNAL_SERVER_ERROR: IndexShardSnapshotFailedException[ElasticsearchException[failed to create blob container]; nested: AccessDeniedException[/elastic_backup/nightly_snapshot/indices/t2UIMX5kQaS2_B7nWGl8Kg/0];];
nested: ElasticsearchException[failed to create blob container];
nested: AccessDeniedException[/elastic_backup/nightly_snapshot/indices/t2UIMX5kQaS2_B7nWGl8Kg/0];

To check whether it was a one-time problem, I ran the policy manually and got the same error, but this time for four indices. I checked the permissions on the directory and they are the same for every directory. I did not find any errors in the Elasticsearch log. I do not understand why some indices/shards failed and some did not. Why do some of these indices give me access denied when taking a snapshot?

Could you help me solve this problem?

If anything else is needed, let me know.
Thank you in advance!

Armin_Braun · November 7, 2019, 12:06pm

Hi @Futerkowiec,

Could you give us a little more insight into what kind of shared file system you have mounted at your snapshot location (/elastic_backup/nightly_snapshot/), please?
Is it an SMB share by any chance?

Thanks!

Futerkowiec · November 7, 2019, 12:48pm

Hi @Armin_Braun,

Thanks for the response. We are using NFS4.

If more information is needed, let me know.

Armin_Braun · November 7, 2019, 1:08pm

@Futerkowiec

This looks like it may be slowness and insufficient timeouts in your NFS configuration (just guessing) or some other transient issue with the NFS setup (those often bubble up as AccessDeniedException in Java). Could you check the dmesg output from the data nodes around the time of the failing snapshot for NFS-related errors (or paste it here if unsure)?
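
To know which data nodes to look at, it can help to group the failed shards by the node that reported them. The following is only a sketch: it assumes you have saved the JSON returned by the get-snapshot API (`GET /_snapshot/elk_backup/<snapshot name>`) and that each entry under `failures` carries `index`, `shard_id` and `node_id` fields. The function name is made up for illustration.

```python
from collections import defaultdict


def failures_by_node(snapshot_response):
    """Group (index, shard_id) pairs of failed shards by the reporting node ID.

    `snapshot_response` is the parsed JSON of a get-snapshot call. Field
    names are assumptions about that response; missing fields fall back
    to None or "unknown" instead of raising.
    """
    grouped = defaultdict(list)
    for snapshot in snapshot_response.get("snapshots", []):
        for failure in snapshot.get("failures", []):
            node_id = failure.get("node_id", "unknown")
            grouped[node_id].append(
                (failure.get("index"), failure.get("shard_id"))
            )
    return dict(grouped)
```

If every failure maps to the same node ID, the problem is more likely specific to that machine (its mount or its local user account) than to the NFS server as a whole.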

Futerkowiec · November 12, 2019, 11:36am

It looks like different machines have different UIDs, so if one node creates a file, the other nodes might not have permission to access it.
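
For anyone who wants to confirm the same thing, here is a rough sketch that lists everything under the repository directory that is not owned by a given numeric UID. The helper name is hypothetical, and the path in the usage note is the one from the error message above. It assumes you run it on each master and data node as the account that runs Elasticsearch.

```python
import os


def paths_not_owned_by(root, expected_uid):
    """Return every path under `root` whose numeric owner is not `expected_uid`.

    Directories that cannot be listed are silently skipped by os.walk,
    so an empty result does not prove that everything is readable.
    """
    mismatched = []
    for dirpath, _dirnames, filenames in os.walk(root):
        # os.walk yields each directory as dirpath, so checking dirpath
        # plus its files covers the whole tree.
        candidates = [dirpath] + [os.path.join(dirpath, f) for f in filenames]
        for path in candidates:
            if os.lstat(path).st_uid != expected_uid:
                mismatched.append(path)
    return mismatched
```

Calling `paths_not_owned_by("/elastic_backup/nightly_snapshot", os.getuid())` on each node should return an empty list when all nodes use the same UID. Paths that show up on one node but not on another point to the mismatch. Over NFS the owner a node sees can also depend on ID mapping, so treat the output as a hint rather than proof.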

system · December 10, 2019, 11:36am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
