ELK Snapshots Fail

Hi elastic community,

I’m having some new problems with snapshots failing.

A little background: we are required to archive our log data and keep it for a long time. What we do is create a snapshot policy per index and take the snapshot. Once the snapshot is “archived” on our repository server, we delete the index from our production ELK stack. This is how we manage our small cluster and keep its data storage from filling up.
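For reference, here is roughly what that setup looks like, using the standard snapshot repository and SLM APIs in Kibana Dev Tools syntax. The repository name, share path, and index names below are placeholders rather than our real values, and the shared filesystem location has to be listed under path.repo on every node:

PUT _snapshot/archive_repo
{
  "type": "fs",
  "settings": {
    "location": "/mnt/sharedstorage/elk-archive"
  }
}

PUT _slm/policy/archive-logs-2024.01
{
  "schedule": "0 30 1 * * ?",
  "name": "<archive-logs-2024.01-{now/d}>",
  "repository": "archive_repo",
  "config": {
    "indices": ["logs-2024.01"],
    "include_global_state": false
  }
}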

I have over 120 snapshots, each of a single index. It’s been a few months since I last went through and “cleaned out” older indices from our cluster in this manner.

Currently, when I create a new policy for a target index and run the policy to take a snapshot, it fails with a “Partial failure”. In the “Failed indices” tab in Kibana, the failure states:

INTERNAL_SERVER_ERROR: UncategorizedExecutionException[Failed execution]; nested: ExecutionException[java.nio.file.FileSystemException: \SERVER\SHAREDSTORAGE\indices%filename%\0%filename%: The process cannot access the file because it is being used by another process.]; nested: FileSystemException[\SERVER\SHAREDSTORAGE\indices%filename%\0%filename%: The process cannot access the file because it is being used by another process.]
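In case it helps anyone reproduce this: I run the policy on demand and then pull the snapshot details, which is where the per-shard failure above shows up (the policy and snapshot names here are placeholders):

POST _slm/policy/archive-logs-2024.01/_execute

# the execute call returns the generated snapshot name; the snapshot details
# then show "state": "PARTIAL" and the error above in the "failures" array
GET _snapshot/archive_repo/archive-logs-2024.01-2024.05.02-abc123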

The repository server is up, and when I verify the repository it shows as connected and lists our ELK stack nodes.
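This is the verification call I am using; it returns the list of nodes that successfully connected to the repository (repository name again a placeholder):

POST _snapshot/archive_repo/_verify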

I’ve done some searching about this but am not finding anything useful or related to my specific issue.

Any ideas on the problem here?

Well, to report/update on this...

I have been trying for a week now to get these snapshots to work, and yesterday afternoon one just completed. Nothing changed on the system. I don't understand why they are now completing with no errors, but they are. This topic can be closed.

Is this an NFS repository?

Maybe something else on the repository server was accessing the file, like antivirus or endpoint detection software?

It is an NFS server. Nothing was accessing the files. It seems more like something to do with the ELK stack in general and how it accesses older indices. We are using snapshots to "archive" old logs, so these snapshots are of old indices that have not been touched in months. Once we have a good snapshot, we delete the indices from the cluster, recovering disk space.
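For completeness, the "archive then purge" step looks roughly like this, with placeholder names; we only delete an index once its snapshot reports SUCCESS:

# confirm the snapshot completed cleanly before deleting anything
GET _snapshot/archive_repo/archive-logs-2024.01-2024.05.02-abc123

# then recover the disk space by deleting the archived index
DELETE /logs-2024.01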

It's working now, and I have documented this for future reference, as we won't be purging old indices again for another year once I get through everything now.
