We have noticed that the S3 input plugin does not recognise the Glacier Flexible Retrieval storage class. As a result, Logstash takes an extremely long time to process our files. We would appreciate some input on how to address this, or a fix for the problem. Ideally there would be a way to filter for only those objects that use the S3 Standard storage class. Many thanks.
Please see the code snippet here: logstash-input-s3/s3.rb at main · logstash-plugins/logstash-input-s3 · GitHub
As shown below, the code only checks for the hard-coded value "GLACIER". AWS now distinguishes "Glacier Flexible Retrieval" (formerly "Glacier"), which is not identified in the code.
    elsif log.last_modified > (current_time - CUTOFF_SECOND).utc # file modified within last two seconds will be processed in next cycle
      @logger.debug('Object Modified After Cutoff Time', :key => log.key)
    elsif (log.storage_class == 'GLACIER' || log.storage_class == 'DEEP_ARCHIVE') && !file_restored?(log.object)
      @logger.debug('Object Archived to Glacier', :key => log.key)
    else
      objects << log
      @logger.debug("Added to objects", :key => log.key, :length => objects.length)
    end
    end
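To illustrate the kind of filter we have in mind, here is a rough sketch. It is not taken from the plugin: `ObjectStub`, `ALLOWED_STORAGE_CLASSES`, `standard_storage?` and `select_standard` are hypothetical names used only for this example, and it assumes that objects in the S3 Standard class report a `storage_class` of "STANDARD".

```ruby
# Hypothetical sketch, not part of logstash-input-s3.
# ObjectStub stands in for the object summaries the plugin iterates over;
# only the two fields used below are modelled.
ObjectStub = Struct.new(:key, :storage_class)

# Assumption: S3 Standard objects report the storage class "STANDARD".
ALLOWED_STORAGE_CLASSES = ['STANDARD'].freeze

# True when the object's storage class is on the allow-list.
def standard_storage?(log)
  ALLOWED_STORAGE_CLASSES.include?(log.storage_class)
end

# Keep only the objects whose storage class is allowed.
def select_standard(logs)
  logs.select { |log| standard_storage?(log) }
end
```

An allow-list like this would skip any archive class, including ones added later, without having to name each of them as the current check does.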
S3 input plugin version: latest (see main branch above)