Logstash s3 input plugin with dynamic prefix


We have a special requirement: we need to read data from an S3 bucket in which the folders are dynamic (e.g. s3bucket/2018/04/04).

Simply avoid setting a prefix, and instead use an exclude_pattern to exclude the files you don't want to include.

@yaauie thanks for the prompt reply, but my scenario is somewhat different.

I have a service that creates folders like bucketname/MM/DD/YY, so each day there will be a different folder. I would like the prefix option of the s3 input plugin to be set dynamically, something like prefix => "MM/DD/YY", where MM/DD/YY comes from the current date, so that each day's new folder is picked up.
That way I would be processing only the current day's files.

But if there is some other workaround, do let me know.

If you were to dynamically generate the prefix, things would get really weird around midnight.

Interestingly enough, S3 buckets don't have "folders" -- they allow forward-slashes in filenames, and their web UI can present them in a folder-ish way, but the APIs used by Logstash and others treat all documents as if they were in a single flat bucket.

When the Logstash S3 input points at a bucket, it "notices" any new file that shows up based on creation timestamp, regardless of its path; if a prefix is specified, it will skip any files that don't start with that prefix, and if an exclude_pattern is specified, it will skip any file whose name matches the pattern.

If you only put logs that you want Logstash to read in this bucket, configure the plugin without a prefix or an exclude_pattern, and it will simply discover all files as they are added to the bucket.
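As a minimal sketch of that simplest case, the input below sets neither a prefix nor an exclude_pattern. The bucket name and region are placeholders, not values from this thread, and credentials are assumed to come from the environment or an instance role:

```conf
input {
  s3 {
    bucket => "my-log-bucket"   # hypothetical bucket name
    region => "us-east-1"       # assumed region; use your bucket's region
    # no prefix and no exclude_pattern: every object in the bucket is a candidate
  }
}
```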

If the bucket also contains files that you don't want Logstash to read, you have two options:

  • if the files that you do want to read have a consistent, literal prefix, configure logstash-input-s3 with a prefix directive
  • use the exclude_pattern to explicitly exclude files
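The two options could look roughly like this. The bucket name, the "app-logs/" prefix, and the "tmp/" pattern are all made up for illustration. The exclude_pattern value is treated as a Ruby regular expression matched against the object key:

```conf
# Option 1: only read objects whose key starts with a fixed, literal prefix
input {
  s3 {
    bucket => "my-log-bucket"        # hypothetical bucket name
    prefix => "app-logs/"            # hypothetical literal prefix
  }
}

# Option 2 (alternative to the block above, not in addition to it):
# read everything except keys matching a pattern
input {
  s3 {
    bucket          => "my-log-bucket"   # hypothetical bucket name
    exclude_pattern => "^tmp/"           # hypothetical pattern: skip keys under tmp/
  }
}
```

Use one block or the other; with both in the same pipeline, objects matched by both inputs would be read twice. The two settings can also be combined in a single s3 block.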

As an aside, if you do want the date to be part of your filename/path, I would suggest an ISO-8601-compliant prefix (year first, then month, then day), because it is also naturally lexically ordered, which makes things a lot easier in the long run.
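For illustration only, assume object keys named like 2018-04-04/app.log (a hypothetical layout, not one taken from this thread). Because the year and month come first, a single literal prefix can then select a whole month, which an MM/DD/YY layout cannot do:

```conf
input {
  s3 {
    bucket => "my-log-bucket"   # hypothetical bucket name
    prefix => "2018-04"         # matches 2018-04-01/..., 2018-04-02/..., and so on
  }
}
```

This is still a fixed, literal prefix, so it does not reintroduce the midnight problem of a prefix computed from the current date.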

