S3 input plugin does not recognise Glacier Flexible Retrieval

Hi all,
We have noticed that the S3 input plugin does not recognise the Glacier Flexible Retrieval storage class. As a result, Logstash takes a very long time to process our files. We would appreciate some input on how to work around this, or a fix for the problem. Ideally, there would be a way to filter for only those objects in the S3 Standard storage class. Many thanks.

Please see the code snippet here: logstash-input-s3/s3.rb at main · logstash-plugins/logstash-input-s3 · GitHub

As shown below, the code checks for the hard-coded value "GLACIER", but AWS now distinguishes "Glacier Flexible Retrieval" (formerly "Glacier"), which is not identified in the code.

        elsif log.last_modified > (current_time - CUTOFF_SECOND).utc # file modified within last two seconds will be processed in next cycle
          @logger.debug('Object Modified After Cutoff Time', :key => log.key)
        elsif (log.storage_class == 'GLACIER' || log.storage_class == 'DEEP_ARCHIVE') && !file_restored?(log.object)
          @logger.debug('Object Archived to Glacier', :key => log.key)
        else
          objects << log
          @logger.debug("Added to objects[]", :key => log.key, :length => objects.length)
        end
      end

S3 input plugin version: latest (see main branch above)

I believe "GLACIER" is still the storage class value at the API level.

You might want to check which version of the S3 input plugin you are using by running bin/logstash-plugin list --verbose on the command line of your Logstash container. It needs to be >3.5.0.
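
For the "Standard only" filter asked about in the original post, here is a minimal sketch of the idea in plain Ruby. It is not an existing plugin option. S3Object, ALLOWED_STORAGE_CLASSES and select_standard are hypothetical names made up for illustration, and the sketch assumes the bucket listing reports "STANDARD" for standard-class objects.

```ruby
# Hypothetical stand-in for the object summaries the plugin iterates over;
# only the two fields used below are modelled.
S3Object = Struct.new(:key, :storage_class)

# Assumption: the listing reports "STANDARD" for standard-class objects.
ALLOWED_STORAGE_CLASSES = ['STANDARD'].freeze

# Keep only objects whose storage class is in the allow-list,
# instead of excluding a hard-coded set of archive classes.
def select_standard(objects, allowed = ALLOWED_STORAGE_CLASSES)
  objects.select { |obj| allowed.include?(obj.storage_class) }
end

listing = [
  S3Object.new('logs/a.gz', 'STANDARD'),
  S3Object.new('logs/b.gz', 'GLACIER'),
  S3Object.new('logs/c.gz', 'DEEP_ARCHIVE')
]

puts select_standard(listing).map(&:key).inspect
```

An allow-list like this skips every archived object without a restore check, whereas the quoted plugin code still processes archived objects that have been restored. That may or may not be what you want.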
