Hello,
We have a special requirement: we need to read data from an S3 bucket in which the folders are dynamic (e.g. s3bucket/2018/04/04).
Simply avoid setting a `prefix` and instead use an `exclude_pattern` to exclude the files you don't want to include.
@yaauie thanks for the prompt reply, but my scenario is somewhat different.
I have a service that creates a folder like bucketname/MM/DD/YY, so each day there is a different folder. I want the prefix setting of the s3 input plugin to be dynamic, something like prefix => "MM/DD/YY", where MM/DD/YY comes from the current date, so that each day's new folder is picked up.
That way I would process only the current day's files.
If there is some other workaround, do let me know.
If you were to dynamically generate the prefix, things would get really weird around midnight.
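One way to picture the midnight problem is a minimal Python sketch. It is not plugin code: `daily_prefix`, the timestamps, and the file name are hypothetical, and it assumes the prefix would be recomputed from the clock at poll time.

```python
from datetime import datetime

def daily_prefix(now):
    # Hypothetical helper: build an MM/DD/YY prefix from the given time.
    return now.strftime("%m/%d/%y")

# A file is written two seconds before midnight...
written_at = datetime(2018, 4, 4, 23, 59, 58)
key = daily_prefix(written_at) + "/app.log"

# ...but the next poll happens just after midnight, with a new prefix.
polled_at = datetime(2018, 4, 5, 0, 0, 5)
seen = key.startswith(daily_prefix(polled_at))
```

Under these assumptions `seen` is false: the late file sits under yesterday's prefix and would never be picked up by a reader that only looks at today's.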
Interestingly enough, S3 buckets don't have "folders" -- they allow forward-slashes in filenames, and their web UI can present them in a folder-ish way, but the APIs used by Logstash and others treat all documents as if they were in a single flat bucket.
When the Logstash S3 input points at a bucket, it "notices" any new file that shows up based on creation timestamp, regardless of its path; if a `prefix` is specified, it will skip any files that don't start with that prefix, and if an `exclude_pattern` is specified, it will skip any file whose name matches the pattern.
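The filtering described above can be modelled in a few lines of Python. This is a simplified illustration, not the plugin's actual implementation: `select_keys` and the sample keys are hypothetical, and it assumes the exclude pattern is an unanchored regular-expression search over the full key.

```python
import re

def select_keys(keys, prefix=None, exclude_pattern=None):
    """Return the keys that a prefix / exclude_pattern pair would let through."""
    selected = []
    for key in keys:
        # Skip keys that do not start with the configured prefix.
        if prefix is not None and not key.startswith(prefix):
            continue
        # Skip keys that match the exclude pattern anywhere in the name.
        if exclude_pattern is not None and re.search(exclude_pattern, key):
            continue
        selected.append(key)
    return selected

# The bucket is flat: "folders" are just slashes inside the key names.
keys = [
    "2018/04/04/app.log",
    "2018/04/05/app.log",
    "2018/04/05/debug.tmp",
    "reports/summary.csv",
]

no_filter = select_keys(keys)
only_2018 = select_keys(keys, prefix="2018/")
no_tmp = select_keys(keys, exclude_pattern=r"\.tmp$")
```

With no filter every key is selected; the `2018/` prefix drops only `reports/summary.csv`; the exclude pattern drops only `2018/04/05/debug.tmp`.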
If you only put logs that you want Logstash to read in this bucket, configure the plugin without a `prefix` or an `exclude_pattern`, and it will simply discover all files as they are added to the bucket.
If the bucket also contains files that you don't want Logstash to read, you have two options:
- `logstash-input-s3` with a `prefix` directive
- `exclude_pattern` to explicitly exclude files

As an aside, if you do want the date to be part of your filename/path, I would suggest an ISO-8601-compliant prefix, because it is also naturally lexically ordered, which makes things a lot easier in the long run:
YYYY/MM/DD
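To see what "naturally lexically ordered" buys you, here is a small Python sketch comparing the two layouts. The sample dates are made up for illustration.

```python
from datetime import date

# Three days in chronological order, spanning a year boundary.
days = [date(2017, 12, 31), date(2018, 1, 1), date(2018, 4, 4)]

iso_keys = [d.strftime("%Y/%m/%d") for d in days]  # YYYY/MM/DD
us_keys = [d.strftime("%m/%d/%y") for d in days]   # MM/DD/YY

# Plain string sorting, which is how key listings are typically ordered.
iso_sorted = sorted(iso_keys)
us_sorted = sorted(us_keys)
```

Sorting the `YYYY/MM/DD` keys as plain strings keeps them in date order, while the `MM/DD/YY` keys sort with 31 December 2017 after both 2018 dates.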