Logstash s3 input plugin with dynamic prefix

vikas-prabhakar · April 3, 2018, 7:23pm

Hello,

We have a requirement to read data from an S3 bucket in which the folders are dynamic (e.g. s3bucket/2018/04/04).

yaauie · April 3, 2018, 11:01pm

Simply avoid setting a prefix, and instead use an exclude_pattern to exclude the files you don't want to read.

vikas-prabhakar · April 4, 2018, 9:21am

@yaauie thanks for the prompt reply, but my scenario is a bit different.

I have a service that creates folders like bucketname/MM/DD/YY, so each day gets a different folder. I would like the prefix setting of the s3 input plugin to take its value dynamically, something like prefix => "MM/DD/YY", where MM/DD/YY comes from the current date, so that each day's new folder is picked up.
That way I would be processing only the current day's files.

But if there is some other workaround, do let me know.

yaauie · April 4, 2018, 10:47pm

If you were to dynamically generate the prefix, things would get really weird around midnight.
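To make the midnight problem concrete, here is a minimal sketch in Python. It assumes a hypothetical dynamic prefix that is recomputed from the wall clock on every poll; the function name daily_prefix and the timestamps are made up for illustration and are not part of the plugin.

```python
from datetime import datetime, timedelta


def daily_prefix(now):
    """Hypothetical dynamic prefix in the MM/DD/YY layout from the question."""
    return now.strftime("%m/%d/%y")


# A file is written half a minute before midnight...
written_at = datetime(2018, 4, 4, 23, 59, 30)
# ...and the next poll happens one minute later, after the date has changed.
polled_at = written_at + timedelta(minutes=1)

object_key = daily_prefix(written_at) + "/app.log"
active_prefix = daily_prefix(polled_at)

print(object_key)     # 04/04/18/app.log
print(active_prefix)  # 04/05/18
# The late file no longer matches the prefix in effect at poll time.
print(object_key.startswith(active_prefix))  # False
```

Under these assumptions, anything written late in the day and not yet picked up would be skipped once the prefix rolls over.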

Interestingly enough, S3 buckets don't have "folders" -- they allow forward-slashes in filenames, and their web UI can present them in a folder-ish way, but the APIs used by Logstash and others treat all documents as if they were in a single flat bucket.

When the Logstash S3 input points at a bucket, it "notices" any new file that shows up based on creation timestamp, regardless of its path; if a prefix is specified, it will skip any files that don't start with that prefix, and if an exclude_pattern is specified, it will skip any file whose name matches the pattern.
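The selection rule described above can be modelled roughly as follows. This is a Python sketch of the behaviour as described in this post, not the plugin's actual code; should_process and the sample keys are hypothetical, and the exclude pattern is treated as a regular expression searched anywhere in the key.

```python
import re


def should_process(key, prefix=None, exclude_pattern=None):
    """Model of the described rule: keep keys that start with the prefix
    (if one is set) and do not match the exclude pattern (if one is set)."""
    if prefix is not None and not key.startswith(prefix):
        return False
    if exclude_pattern is not None and re.search(exclude_pattern, key):
        return False
    return True


# Keys live in one flat namespace; the slashes are just part of the name.
keys = [
    "2018/04/03/app.log",
    "2018/04/04/app.log",
    "2018/04/04/app.log.tmp",
    "reports/summary.csv",
]

selected = [
    k for k in keys
    if should_process(k, prefix="2018/", exclude_pattern=r"\.tmp$")
]
print(selected)  # ['2018/04/03/app.log', '2018/04/04/app.log']
```

With neither option set, every key in the list passes, which corresponds to pointing the input at a bucket that holds only the logs you want.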

If you only put logs that you want Logstash to read in this bucket, configure the plugin without a prefix or an exclude_pattern, and it will simply discover all files as they are added to the bucket.

If the bucket also contains files that you don't want Logstash to read, you have two options:

if the files that you do want to read have a consistent, literal prefix, configure logstash-input-s3 with a prefix directive
use the exclude_pattern to explicitly exclude files

As an aside, if you do want to have the date be part of your filename/path, I would suggest an ISO-8601-style prefix (year first, then month, then day), because it is also naturally lexically ordered, which makes things a lot easier in the long run:

YYYY/MM/DD
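A quick way to see the ordering difference is to sort a few date-based prefixes as plain strings. The dates below are arbitrary examples chosen to span a year boundary.

```python
from datetime import date

# Three days in chronological order, crossing a year boundary.
days = [date(2017, 12, 31), date(2018, 1, 1), date(2018, 4, 4)]

iso_keys = [d.strftime("%Y/%m/%d") for d in days]
us_keys = [d.strftime("%m/%d/%y") for d in days]

# Year-first prefixes sort as strings in the same order as the dates.
print(sorted(iso_keys) == iso_keys)  # True
# Month-first prefixes do not: December 2017 ends up after April 2018.
print(sorted(us_keys))  # ['01/01/18', '04/04/18', '12/31/17']
```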

system · May 2, 2018, 10:47pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
