We just set up ELK to stream events/logs from S3 to Logstash using the S3 input plugin (everything at version 6.3.2).
We can stream all standard files, but we have a lifecycle rule enabled that moves all files older than 90 days to Glacier. From what I have read online, the S3 input plugin explicitly forbids the GLACIER storage class, and I don't see any workaround.
Here is one of many such entries in the Logstash log:
S3 input: Unable to download remote file {:remote_key=>"AWSLogs/xxx/CloudTrail/us-east-1/xxxx/xx/xx/xxxxxx_CloudTrail_us-east-1_20171105T0430Z_bbLZTyIdIkYfp2Su.json.gz", :message=>"The operation is not valid for the object's storage class"}
Is there any workaround for this? We cannot change the lifecycle rules.
Data that is archived from S3 to Glacier cannot be read directly by any of Amazon's APIs; the resource must first be restored to S3 (which will cost you differing amounts of money depending on how quickly you need it restored).
From Amazon's Documentation:
Objects archived to Amazon Glacier are not accessible in real-time. You must first initiate a restore request and then wait until a temporary copy of the object is available for the duration (number of days) that you specify in the request. The time it takes restore jobs to complete depends on which retrieval option you specify: Standard, Expedited, or Bulk.
The S3 Input Plugin does not currently provide support for restoring archived objects, and merely skips archived objects that it encounters. If this feature were implemented in the future, significant care would need to be taken to ensure that a poorly-configured pipeline couldn't go rogue and repeatedly restore the same objects, since each restoration of an object from Glacier costs real money.
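For anyone who does need an archived object back in S3, the restore request described above can be sketched roughly as follows. This is only an illustration done outside of Logstash, not something the plugin does. It assumes a boto3-style S3 client is passed in; the helper names `build_restore_request` and `request_restore`, and the default of 7 days, are made up for this example.

```python
# Sketch only: issue a restore request for one archived object.
# Assumes `s3_client` behaves like boto3.client("s3"); helper names are hypothetical.

VALID_TIERS = ("Standard", "Expedited", "Bulk")


def build_restore_request(days, tier="Bulk"):
    """Build the RestoreRequest payload: how long the temporary copy
    should stay available, and which retrieval option to use."""
    if tier not in VALID_TIERS:
        raise ValueError("tier must be one of %s" % (VALID_TIERS,))
    if days < 1:
        raise ValueError("days must be at least 1")
    return {"Days": days, "GlacierJobParameters": {"Tier": tier}}


def request_restore(s3_client, bucket, key, days=7, tier="Bulk"):
    """Ask S3 to start restoring `key`. The call only initiates the job;
    the object stays unreadable until the restore has completed."""
    return s3_client.restore_object(
        Bucket=bucket,
        Key=key,
        RestoreRequest=build_restore_request(days, tier),
    )
```

With boto3 installed, a call would look something like `request_restore(boto3.client("s3"), "my-bucket", "some/key.json.gz")`, where the bucket and key are placeholders. Since every restore is billed, it is worth keeping track of which keys have already been requested so the same object is not restored repeatedly.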
Thank you for your response; you are right. The Logstash S3 plugin cannot read directly from Glacier, and we don't want to read data from Glacier either.
Could you please let me know how to avoid or skip reading files in Glacier with the Logstash S3 input plugin? I tried setting Storage_class => Standard, but it does not seem to work. We use Logstash S3 input plugin v3.3.7.
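As a side note on telling archived files apart from readable ones: S3 object listings report a storage class for each object, so the split can at least be checked outside of Logstash. The sketch below is not a plugin setting and does not change what the S3 input does. It assumes entries shaped like the `Contents` items of a boto3 `list_objects_v2` response; the function name `split_by_storage_class` and the fallback to STANDARD when the field is missing are assumptions made for this example.

```python
# Sketch only: separate directly readable keys from archived ones.
# Input entries are assumed to look like list_objects_v2 "Contents" items.

ARCHIVED_CLASSES = {"GLACIER", "DEEP_ARCHIVE"}


def split_by_storage_class(contents):
    """Return (readable_keys, archived_keys) for a list of object entries."""
    readable, archived = [], []
    for obj in contents:
        # Assumption: treat a missing StorageClass as STANDARD.
        storage_class = obj.get("StorageClass", "STANDARD")
        if storage_class in ARCHIVED_CLASSES:
            archived.append(obj["Key"])
        else:
            readable.append(obj["Key"])
    return readable, archived
```

The archived list gives an idea of which keys would trigger the "not valid for the object's storage class" message if something tried to download them.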