S3 input plugin fails to stream from Glacier

Hello there,

We just set up ELK to stream events/logs from S3 to Logstash using the S3 input plugin (all components version 6.3.2).
We can stream all standard files, but we have a lifecycle rule enabled that moves all files older than 90 days to Glacier. From what I read online, it seems the S3 input plugin explicitly forbids the GLACIER storage class... and I don't see any workaround.
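For reference, a minimal pipeline input along the lines described above might look like the following sketch. The bucket name, region, and prefix are placeholders, and the options used are documented settings of the s3 input plugin:

```
input {
  s3 {
    bucket => "my-cloudtrail-bucket"   # placeholder bucket name
    region => "us-east-1"
    prefix => "AWSLogs/"               # only scan keys under this prefix
    codec  => "gzip_lines"             # CloudTrail log files are gzipped
  }
}
```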

Here is one of the entries from the Logstash log (there are many like it):

S3 input: Unable to download remote file {:remote_key=>"AWSLogs/xxx/CloudTrail/us-east-1/xxxx/xx/xx/xxxxxx_CloudTrail_us-east-1_20171105T0430Z_bbLZTyIdIkYfp2Su.json.gz", :message=>"The operation is not valid for the object's storage class"}

Is there any workaround for this? We cannot change the lifecycle rules...

Thank you very much

Data that is archived from S3 to Glacier cannot be read directly by any of Amazon's APIs; the object must first be restored to S3 (at a cost that varies depending on how quickly you need it restored).

From Amazon's Documentation:

Objects archived to Amazon Glacier are not accessible in real-time. You must first initiate a restore request and then wait until a temporary copy of the object is available for the duration (number of days) that you specify in the request. The time it takes restore jobs to complete depends on which retrieval option you specify: Standard, Expedited, or Bulk.

-- Amazon S3 Docs -- Restoring Objects
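To make the distinction concrete, here is a small sketch (not plugin code) of deciding which objects are directly readable. The dicts mirror the shape of the `Contents` entries an S3 list call returns; anything in the `GLACIER` storage class needs a restore request before its body can be fetched:

```python
# Objects in the GLACIER storage class cannot be downloaded until a
# restore request has completed; everything else is directly readable.
ARCHIVED_CLASSES = {"GLACIER"}

def is_directly_readable(obj):
    """True if the object's body can be downloaded without a restore."""
    # Listings may omit StorageClass; treat a missing value as STANDARD.
    return obj.get("StorageClass", "STANDARD") not in ARCHIVED_CLASSES

objects = [
    {"Key": "logs/a.json.gz", "StorageClass": "STANDARD"},
    {"Key": "logs/b.json.gz", "StorageClass": "GLACIER"},
]
readable = [o["Key"] for o in objects if is_directly_readable(o)]
```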

The S3 Input Plugin does not currently provide support for restoring archived objects, and merely skips archived objects that it encounters. If this feature were implemented in the future, significant care would need to be taken to ensure that a poorly-configured pipeline couldn't go rogue and repeatedly restore the same objects, since each restoration of an object from Glacier costs real money.
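The safeguard described above might be sketched like this (a hypothetical helper, not part of the plugin). The `restore_status` argument mimics S3's `Restore` response header, e.g. `'ongoing-request="true"'` while a restore is in flight; `None` means no restore has been requested:

```python
# Guard to keep a pipeline from repeatedly issuing (and paying for)
# restore requests for the same archived object.
def should_request_restore(storage_class, restore_status):
    if storage_class != "GLACIER":
        return False  # object is directly readable; nothing to restore
    if restore_status is None:
        return True   # archived, and no restore has been requested yet
    return False      # a restore is already in progress or complete
```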


Thank you for your response... you are right, the Logstash S3 plugin cannot read directly from Glacier, and we don't want to read data from Glacier either.
Could you please let me know how to avoid or skip reading data/files stored in Glacier with the Logstash S3 input plugin? I tried setting Storage_class => Standard, but it doesn't seem to work... we use Logstash S3 input plugin v3.3.7.

Thanks a lot

I've opened up a PR on the plugin to add support for skipping glacier-archived objects: https://github.com/logstash-plugins/logstash-input-s3/pull/160

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.