Performance split for Logstash and S3

Is there a way to split the load across multiple Logstash instances pulling data from S3?
For example, I would like:
Logstash instance 1 to grab only files ending in 1 and 2,
Logstash instance 2 to grab files ending in 3 and 4,
and so on.
The S3 bucket looks like:
bucket/2020/04/26/20/file001.gz
bucket/2020/04/26/20/file002.gz
bucket/2020/04/26/21/file003.gz
bucket/2020/04/27/01/file004.gz
bucket/2020/04/27/03/file005.gz
bucket/2020/04/28/18/file006.gz
bucket/2020/04/28/19/file007.gz
bucket/2020/04/28/20/file008.gz
....

Is there a specific configuration option in Logstash that allows for this?
Or maybe a different plugin or solution?

I do not think that is possible. You could use different prefixes (bucket/2020/04/2, bucket/2020/04/1, etc.), but you cannot use a regexp.
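
A minimal sketch of what that could look like, assuming a bucket literally named `bucket` in `us-east-1` and one day-level prefix per instance (the bucket name, region, and prefixes below are placeholders taken from the example layout; AWS credentials are assumed to come from the usual provider chain):

```
# Instance 1: reads only objects under the 2020/04/26/ day prefix
input {
  s3 {
    bucket => "bucket"        # placeholder bucket name from the example above
    prefix => "2020/04/26/"   # plain string prefix, matched literally (no regexp)
    region => "us-east-1"     # assumption: set to your bucket's region
  }
}

# Instance 2: reads only objects under the 2020/04/27/ day prefix
input {
  s3 {
    bucket => "bucket"
    prefix => "2020/04/27/"
    region => "us-east-1"
  }
}
```

Note this partitions by key prefix (date/hour), not by file suffix, so how evenly the load splits depends on how evenly objects are spread across the prefixes.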

Hello @sc5283

It is not possible to filter by the suffix of the files, but as @Badger says, it is possible to partition the inputs using the bucket name and a prefix within the bucket.

All the options are shown in our documentation for the s3 input plugin.
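
If you partition this way, it may also help to give each instance its own sincedb file so each one tracks its read progress independently. A sketch, assuming a writable per-instance path (the path below is a placeholder):

```
input {
  s3 {
    bucket       => "bucket"                        # placeholder bucket name
    prefix       => "2020/04/27/"                   # this instance's partition
    sincedb_path => "/var/lib/logstash/sincedb_s3"  # assumption: per-instance state file
  }
}
```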


Alternatives might be:
