Unable to create a new snapshot

We are running Elasticsearch 5.6.14 with 3 master nodes and 8 data nodes. For some reason I am not able to create a new snapshot. I see this in one of the master node logs:
[2019-03-29T20:13:51,535][WARN ][o.e.s.SnapshotsService ] [hostname] [s3_prod_repository][snapshot-03-29] failed to create snapshot
org.elasticsearch.snapshots.ConcurrentSnapshotExecutionException: [s3_prod_repository:snapshot-03-29] a snapshot is already running
at org.elasticsearch.snapshots.SnapshotsService$1.execute(SnapshotsService.java:266) ~[elasticsearch-5.6.14.jar:5.6.14]
at org.elasticsearch.cluster.ClusterStateUpdateTask.execute(ClusterStateUpdateTask.java:45) ~[elasticsearch-5.6.14.jar:5.6.14]
at org.elasticsearch.cluster.service.ClusterService.executeTasks(ClusterService.java:634) ~[elasticsearch-5.6.14.jar:5.6.14]
at org.elasticsearch.cluster.service.ClusterService.calculateTaskOutputs(ClusterService.java:612) ~[elasticsearch-5.6.14.jar:5.6.14]
at org.elasticsearch.cluster.service.ClusterService.runTasks(ClusterService.java:571) [elasticsearch-5.6.14.jar:5.6.14]
at org.elasticsearch.cluster.service.ClusterService$ClusterServiceTaskBatcher.run(ClusterService.java:263) [elasticsearch-5.6.14.jar:5.6.14]
at org.elasticsearch.cluster.service.TaskBatcher.runIfNotProcessed(TaskBatcher.java:150) [elasticsearch-5.6.14.jar:5.6.14]
at org.elasticsearch.cluster.service.TaskBatcher$BatchedTask.run(TaskBatcher.java:188) [elasticsearch-5.6.14.jar:5.6.14]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:576) [elasticsearch-5.6.14.jar:5.6.14]
at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:247) [elasticsearch-5.6.14.jar:5.6.14]
at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:210) [elasticsearch-5.6.14.jar:5.6.14]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_66]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_66]
at java.lang.Thread.run(Thread.java:745) [?:1.8.0_66]

I am trying to write that snapshot to an S3 repository. No other snapshot is running. When I call GET /_snapshot/_status, it reports that my snapshot has STARTED, and the snapshot does finish eventually. I have restarted the entire cluster a few times, but still no luck. What could be wrong here?

One more thing we found: it only happens when we specify wait_for_completion=true.
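
For comparison, here is a rough sketch of the alternative to wait_for_completion=true: check GET /_snapshot/_status first, start the snapshot with wait_for_completion=false, then poll the snapshot's own _status endpoint. This is not what we currently run. The address in ES_URL, the helper names, and the choice of which states count as "still running" are assumptions for illustration only.

```python
import json
import time
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: placeholder cluster address

# Assumption: these snapshot states are treated as "still in progress".
ACTIVE_STATES = {"INIT", "STARTED"}


def running_snapshots(status_body):
    """Return (repository, snapshot) pairs that a _status response reports as active."""
    return [
        (s.get("repository"), s.get("snapshot"))
        for s in status_body.get("snapshots", [])
        if s.get("state") in ACTIVE_STATES
    ]


def get_json(url, method="GET"):
    """Send a body-less request and decode the JSON response."""
    req = urllib.request.Request(url, method=method)
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


def create_and_poll(repo, name, fetch=get_json, interval=10.0, sleep=time.sleep):
    """Start a snapshot without blocking, then poll until it is no longer active.

    Returns the list of already-running snapshots if one was found up front
    (nothing is started in that case), otherwise an empty list once done.
    """
    busy = running_snapshots(fetch(f"{ES_URL}/_snapshot/_status"))
    if busy:
        return busy
    # Fire the create request and return immediately instead of holding the
    # HTTP connection open for the whole snapshot.
    fetch(f"{ES_URL}/_snapshot/{repo}/{name}?wait_for_completion=false", "PUT")
    while running_snapshots(fetch(f"{ES_URL}/_snapshot/{repo}/{name}/_status")):
        sleep(interval)
    return []
```

Called as create_and_poll("s3_prod_repository", "snapshot-03-29"), this sends exactly one create request per run, so it may help show whether the "a snapshot is already running" warning comes from a second create request arriving while the first is still in progress.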
