I'm currently using the Logstash S3 input plugin to ingest flow log data from an S3 bucket, but it isn't pulling data in quickly enough and keeps falling behind. I've tried raising the batch size (pipeline.batch.size) to 20000 in logstash.yml and setting the S3 input's interval to 2 seconds, to no avail.
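For reference, this is roughly what my input looks like (bucket names and region are placeholders, not my real values):

```
input {
  s3 {
    bucket           => "my-flowlog-bucket"
    region           => "us-east-1"
    interval         => 2                       # poll every 2 seconds
    backup_to_bucket => "my-processed-bucket"   # archive processed objects
    delete           => true                    # remove from source after backup
  }
}
```

Even with the short interval, a single pipeline just can't keep up with the volume.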
I can't see a way to run multiple pipelines without the potential for duplication. I do have the input configured to move processed objects to another bucket, but a second pipeline could still read the same object before the first one has moved it.
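For concreteness, this is the kind of parallel setup I mean (pipeline IDs are made up). Both pipelines would poll the same bucket with the same config, so they can race on the same object:

```
# pipelines.yml
- pipeline.id: flowlogs-a
  path.config: "/etc/logstash/conf.d/flowlogs-s3.conf"
- pipeline.id: flowlogs-b
  path.config: "/etc/logstash/conf.d/flowlogs-s3.conf"
```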
Basically I'm just looking for advice on the best approach here, as I'm about to start writing something that pulls down multiple S3 objects in parallel and writes local log files on the Logstash server for it to ingest. Any suggestions would be greatly appreciated!
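This is a rough sketch of the puller I have in mind, not working code. The bucket name and output directory are placeholders, and boto3 is assumed for S3 access. The idea is to hash each key to exactly one worker so parallel workers never fetch the same object, then rename downloads into place atomically so the Logstash file input never sees a half-written file:

```python
import os
import zlib

def shard_keys(keys, n_workers):
    """Assign each S3 key to exactly one worker via a stable hash,
    so parallel workers never download the same object twice."""
    shards = [[] for _ in range(n_workers)]
    for key in keys:
        shards[zlib.crc32(key.encode("utf-8")) % n_workers].append(key)
    return shards

def pull_shard(bucket, keys, out_dir):
    """Download one worker's share of keys into local files for the
    Logstash file input to pick up."""
    import boto3  # assumed dependency: pip install boto3
    s3 = boto3.client("s3")
    os.makedirs(out_dir, exist_ok=True)
    for key in keys:
        local = os.path.join(out_dir, key.replace("/", "_"))
        # Download to a temp name, then rename atomically so the
        # file input never reads a partially written file.
        s3.download_file(bucket, key, local + ".tmp")
        os.rename(local + ".tmp", local)
```

Does this seem like a sane direction, or is there a better-supported way to scale the S3 input itself?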