Hi, we have an intermittent issue with snapshots via fs on NFS4.
Once a day we close the repository for a backup to tape. After this backup has been taken we delete the data and create a new repository and an initial snapshot.
We then take snapshots every 15 minutes. A small number of these snapshots report a PARTIAL status, with 1-3 indices affected each time.
E.g.
IndexShardSnapshotFailedException[Failed to write commit point]; nested: FileSystemException[/backuprepo/indices/kEmnbUZJSBKVcNFMND95Hw/0/snap-CbacWLlcSw2xagCBh6GxQQ.dat: Input/output error]
or
IndexShardSnapshotFailedException[java.nio.file.DirectoryIteratorException: java.nio.file.FileSystemException: /data/backup-folder/smg-pp-b_backuprepo/indices/pK8Fc_wfS1-yVde-ccwrXw/0: Input/output error]; nested: DirectoryIteratorException[java.nio.file.FileSystemException: /data/backup-folder/smg-pp-b_backuprepo/indices/pK8Fc_wfS1-yVde-ccwrXw/0: Input/output error]; nested: FileSystemException[/backuprepo/indices/pK8Fc_wfS1-yVde-ccwrXw/0: Input/output error];
We also see other error messages.
The snapshot taken right after a PARTIAL one always reports SUCCESS.
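To make the pattern concrete, here is a minimal sketch of how the PARTIAL snapshots and their affected indices could be picked out of the parsed JSON body returned by GET _snapshot/<repo>/_all. It assumes the usual response shape (a "snapshots" list whose entries carry "snapshot", "state" and a "failures" list with an "index" field). The function name and the sample snapshot and index names are made up for illustration and are not taken from our cluster.

```python
def partial_snapshots(response):
    """Map each PARTIAL snapshot name to the sorted indices that had shard failures.

    `response` is assumed to be the parsed JSON body of
    GET _snapshot/<repo>/_all.
    """
    affected = {}
    for snap in response.get("snapshots", []):
        if snap.get("state") != "PARTIAL":
            continue
        # Collect every index that has at least one failed shard.
        indices = {f["index"] for f in snap.get("failures", []) if "index" in f}
        affected[snap["snapshot"]] = sorted(indices)
    return affected


# Hypothetical sample data, shaped like the assumed API response.
sample = {
    "snapshots": [
        {"snapshot": "snap-0900", "state": "SUCCESS", "failures": []},
        {
            "snapshot": "snap-0915",
            "state": "PARTIAL",
            "failures": [
                {"index": "logs-a", "shard_id": 0, "reason": "Input/output error"},
                {"index": "logs-b", "shard_id": 0, "reason": "Input/output error"},
                {"index": "logs-a", "shard_id": 1, "reason": "Input/output error"},
            ],
        },
        {"snapshot": "snap-0930", "state": "SUCCESS", "failures": []},
    ]
}

print(partial_snapshots(sample))
```

With a listing like this, the snapshot that follows each PARTIAL one can be checked to confirm it reports SUCCESS and includes the previously affected indices.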
Since I suspect a rare NFS4 edge-case bug is the root cause, my question is:
If a snapshot reports PARTIAL status and the following ones report SUCCESS, do these successful snapshots heal the affected repository? To put it another way: if an index is not snapshotted successfully, does the next successful snapshot fix the issue, so that the backed-up data is valid and consistent?
Thanks in advance