Performance split for Logstash and S3

Is there a way to split the load across multiple Logstash instances pulling data from S3?
For example, I would like:
Logstash instance 1 to grab only files ending in 1 and 2,
Logstash instance 2 to grab files ending in 3 and 4,
and so on.
The S3 bucket looks like:
bucket/2020/04/26/20/file001.gz
bucket/2020/04/26/20/file002.gz
bucket/2020/04/26/21/file003.gz
bucket/2020/04/27/01/file004.gz
bucket/2020/04/27/03/file005.gz
bucket/2020/04/28/18/file006.gz
bucket/2020/04/28/19/file007.gz
bucket/2020/04/28/20/file008.gz
....

Is there a specific configuration option in Logstash that allows for this?
Or maybe a different plugin or solution?

I do not think that is possible. You could use different prefixes (bucket/2020/04/2, bucket/2020/04/1, etc.), but you cannot use a regexp.
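
A minimal sketch of what that could look like, assuming a bucket literally named `bucket` in `us-east-1` and one day-level prefix per instance (the bucket name, region, and prefixes below are placeholders taken from the example layout; AWS credentials are assumed to come from the usual provider chain):

```
# Instance 1: reads only objects under the 2020/04/26/ day prefix
input {
  s3 {
    bucket => "bucket"        # placeholder bucket name from the example above
    prefix => "2020/04/26/"   # plain string prefix, matched literally (no regexp)
    region => "us-east-1"     # assumption: set to your bucket's region
  }
}

# Instance 2: reads only objects under the 2020/04/27/ day prefix
input {
  s3 {
    bucket => "bucket"
    prefix => "2020/04/27/"
    region => "us-east-1"
  }
}
```

Note this partitions by key prefix (date/hour), not by file suffix, so how evenly the load splits depends on how evenly objects are spread across the prefixes.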

Hello @sc5283

It is not possible to filter by the suffix of the files, but as @Badger says, it is possible to partition the inputs using the bucket name and a prefix within the bucket.

All the options are shown in our documentation for the s3 input plugin.
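
If you partition this way, it may also help to give each instance its own sincedb file so each one tracks its read progress independently. A sketch, assuming a writable per-instance path (the path below is a placeholder):

```
input {
  s3 {
    bucket       => "bucket"                        # placeholder bucket name
    prefix       => "2020/04/27/"                   # this instance's partition
    sincedb_path => "/var/lib/logstash/sincedb_s3"  # assumption: per-instance state file
  }
}
```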


Alternatives might be:
