Snapshot policy keeps failing with shard errors during backup

Version: Elasticsearch 7.16.2
S3 storage classes: new files are stored in S3 Standard and transitioned to S3 Glacier after one month

We regularly back up the indices behind the same alias to S3 every day.
Recently, the following error suddenly started appearing. Could the error be because my S3 bucket is configured to transition objects to the Glacier storage class after one month?

Snapshot Failed indices:

INTERNAL_SERVER_ERROR: AmazonS3Exception[The operation is not valid for the object's storage class (Service: Amazon S3; Status Code: 403; Error Code: InvalidObjectState; Request ID: 9T81R0DTBFXKRXHK; S3 Extended Request ID: Yhrwi1M188qhZRToZBwCYuBSY+GB4SuCERlErbrWKtGRGacqIFysdioYf83Aq8I0lnQSeKJbiz4=)]

Policy History:


{
  "type": "snapshot_exception",
  "reason": "[XXXX:XXXX-2023.07.06-6ohv3ltdtpmmf2gimegkxa] failed to create snapshot successfully, 28 out of 28 total shards failed",
  "stack_trace": "SnapshotException[[XXXX:XXXX-2023.07.06-6ohv3ltdtpmmf2gimegkxa] failed to create snapshot successfully, 28 out of 28 total shards failed]\n\tat org.elasticsearch.xpack.slm.SnapshotLifecycleTask$1.onResponse(SnapshotLifecycleTask.java:135)\n\tat org.elasticsearch.xpack.slm.SnapshotLifecycleTask$1.onResponse(SnapshotLifecycleTask.java:109)\n\tat org.elasticsearch.action.support.ContextPreservingActionListener.onResponse(ContextPreservingActionListener.java:31)\n\tat org.elasticsearch.action.support.TransportAction$1.onResponse(TransportAction.java:88)\n\tat org.elasticsearch.action.support.TransportAction$1.onResponse(TransportAction.java:82)\n\tat org.elasticsearch.action.support.ContextPreservingActionListener.onResponse(ContextPreservingActionListener.java:31)\n\tat org.elasticsearch.xpack.security.action.filter.SecurityActionFilter.lambda$applyInternal$2(SecurityActionFilter.java:192)\n\tat org.elasticsearch.action.ActionListener$DelegatingFailureActionListener.onResponse(ActionListener.java:219)\n\tat org.elasticsearch.action.ActionListener$DelegatingActionListener.onResponse(ActionListener.java:186)\n\tat org.elasticsearch.action.ActionListener$MappedActionListener.onResponse(ActionListener.java:101)\n\tat org.elasticsearch.action.ActionListener.onResponse(ActionListener.java:293)\n\tat org.elasticsearch.snapshots.SnapshotsService.completeListenersIgnoringException(SnapshotsService.java:3413)\n\tat org.elasticsearch.snapshots.SnapshotsService.lambda$finalizeSnapshotEntry$31(SnapshotsService.java:2052)\n\tat org.elasticsearch.action.ActionListener$1.onResponse(ActionListener.java:136)...

I would say this is probably the issue. This is from the documentation, in the section about `storage_class`:

You may use an S3 Lifecycle Policy to adjust the storage class of existing objects in your repository, but you must not transition objects to Glacier classes and you must not expire objects. If you use Glacier storage classes or object expiry then you may permanently lose access to your repository contents.
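For reference, the kind of S3 Lifecycle rule that triggers this failure looks roughly like the following (the rule ID is a placeholder). A rule whose `Transitions` entry targets `GLACIER` or `DEEP_ARCHIVE` will cause snapshot reads to return `InvalidObjectState` once objects age past the transition threshold:

```
{
  "Rules": [
    {
      "ID": "transition-to-glacier-after-30-days",
      "Status": "Enabled",
      "Filter": { "Prefix": "" },
      "Transitions": [
        { "Days": 30, "StorageClass": "GLACIER" }
      ]
    }
  ]
}
```

You can dump the active configuration with `aws s3api get-bucket-lifecycle-configuration --bucket <bucket>` and remove or retarget any Glacier transitions.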


Thank you for your reply.

I had no problems with backups for several months, and then this error suddenly started appearing in recent backups. Is it because I configured storage to transition to Glacier classes after one month?

Thanks!

This was already answered in the previous reply: you must not transition objects to Glacier classes, and you may lose access to them if you do.

I'm not sure if you can recover them, as the documentation says you should not do this and warns about losing access.
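If you do want to attempt a recovery, S3 can temporarily restore Glacier objects into a readable state. A rough sketch, not an Elasticsearch-endorsed procedure (the bucket name, prefix, and retention days are placeholders, and every object under the repository prefix would need restoring before Elasticsearch can read the repository again):

```shell
# Request a temporary restore for every object under the repository prefix.
# Glacier restores take hours to complete and incur retrieval charges;
# the restored copies here stay readable for 7 days.
BUCKET=my-snapshot-bucket          # placeholder
PREFIX=elasticsearch/snapshots/    # placeholder

aws s3api list-objects-v2 --bucket "$BUCKET" --prefix "$PREFIX" \
    --query 'Contents[].Key' --output text | tr '\t' '\n' |
while read -r key; do
  aws s3api restore-object --bucket "$BUCKET" --key "$key" \
      --restore-request '{"Days": 7, "GlacierJobParameters": {"Tier": "Standard"}}'
done
```

Note that a restore only produces a temporary readable copy; to keep the repository working long term you would also need to move the objects back to a non-Glacier class (for example a self-copy with `aws s3 cp ... --storage-class STANDARD`) and delete the lifecycle rule so they are not archived again.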

Ok, thank you for your reply.

We are planning to use S3 Intelligent-Tiering for storage. If backup files transition to the Deep Archive Access tier, can they be retrieved and used normally?


I'm not sure; I do not use S3 for snapshots. You will have to wait for someone from Elastic to answer, or open an issue on GitHub about it, because the documentation does not mention this storage class.
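One detail from the S3 side that is likely relevant here (not Elasticsearch-specific advice): Intelligent-Tiering only moves objects into the Archive Access and Deep Archive Access tiers if you explicitly opt in with an archive configuration on the bucket, and objects in those tiers require a restore before they can be read, just like Glacier. So an opt-in configuration like the following (the ID and day counts are placeholders) would most likely reintroduce the same `InvalidObjectState` failures:

```
{
  "Id": "archive-old-objects",
  "Status": "Enabled",
  "Tierings": [
    { "Days": 90,  "AccessTier": "ARCHIVE_ACCESS" },
    { "Days": 180, "AccessTier": "DEEP_ARCHIVE_ACCESS" }
  ]
}
```

Without any archive configuration, Intelligent-Tiering keeps objects in the synchronously readable tiers, which should behave like Standard and Standard-IA for reads. But as noted above, the Elasticsearch documentation does not address this storage class, so treat that as unverified.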

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.