Dynamic Bucket Names or Directories in AWS S3 Output

Hi,

I am doing some tests on storing data in AWS S3. I have read the documentation and couldn't find a dynamic bucket name or directory option like the one the Elasticsearch output provides for the index name. Is there a way to do this?
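
For reference, this is the kind of per-event reference the Elasticsearch output accepts for its index name; the pattern below is just an illustration of what I'd like to be able to do with the S3 bucket or prefix:

output {
  elasticsearch {
    # index name is built per event from a field reference and a date pattern
    index => "logs-%{type}-%{+YYYY.MM.dd}"
  }
}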

Thank you

You can use exactly the same method for that output.

Can you provide an example? I tried it several ways and it doesn't work. I am using Logstash 1.5.0.

Providing what you have tried would be useful :slight_smile:

I've been trying to get this to work as well. My dev setup works just fine with this config file:

input {
  file {
    path => "/srv/log/app/server/*.log"
  }
}

filter {
  grok {
    match => ["path","%{GREEDYDATA:folder}/%{GREEDYDATA:filename}\.log"]
  }
}

output {
  s3 {
    bucket => "test"
    prefix => "test/"
    size_file => 2048
    time_file => 5
    canned_acl => "private"
    codec => rubydebug
  }
  stdout { codec => rubydebug }
}

I'd like to be able to use a wildcard in the prefix, something like prefix => "test/%{folder}", which the docs made me think might work, but it doesn't seem to.
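
In other words, something along these lines is what I was hoping for (the folder field comes from the grok filter above; this is just the attempted config, and it does not actually work):

output {
  s3 {
    bucket => "test"
    # hoped-for behaviour: a separate key prefix per source folder
    prefix => "test/%{folder}/"
    size_file => 2048
    time_file => 5
    canned_acl => "private"
  }
}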

Sorry, that won't work, since %{varname} interpolation doesn't take place for the prefix parameter's value: in the plugin source, the raw parameter value @prefix is used instead of event.sprintf(@prefix).

However, fixing this probably isn't entirely simple, since it would mean the output file could change between every single message the output receives.

Thanks. Now that I think about it, I see how that would be a huge problem.

How about the other way around? Is Logstash able to support dynamic bucket names or prefixes in the s3 input, similar to how the file input handles wildcard paths?

e.g.

input {
  s3 {
    bucket => "logbucket"
    prefix => "logs/*/2015/01/01/"
  }
}

I want to use a grok filter on the S3 prefix to add fields to my log entries:

grok {
  match => [ "prefix", "logs/%{GREEDYDATA:projectName}/2015/01/01/" ]
}

Since my set of prefixes was known, I worked around this with if conditions and hard-coded prefixes:

output {
  # Until s3 output supports variables in prefix
  if [fields][host] == "foohost" {
    s3 {
      access_key_id => "<your access>"
      secret_access_key => "<your secret>"
      bucket => "host-logs"
      time_file => 60
      prefix => "foohost"
    }
  }
  if [fields][host] == "barhost" {
    s3 {
      access_key_id => "<your access>"
      secret_access_key => "<your secret>"
      bucket => "host-logs"
      time_file => 60
      prefix => "barhost"
    }
  }
  ...
}

I am currently trying to do something similar with my s3 output. I have inputs coming from multiple file locations, and depending on the directory the logs come from I set the "type" field to a specific value. Is there no way to use a field value as the prefix or as part of the tags? I need to write to different S3 objects based on the type; otherwise they all end up in the same object in S3.

The work-around would be to use if conditions, but it would be much simpler to be able to use field references directly in the s3 output.
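
To be concrete, this is roughly what the work-around looks like for my case; the type values and bucket name below are just placeholders, and a single s3 block with a field reference in prefix would replace all of it:

output {
  # work-around: one hard-coded s3 block per known type value
  if [type] == "app" {
    s3 {
      bucket => "typed-logs"
      time_file => 60
      prefix => "app"
    }
  }
  if [type] == "server" {
    s3 {
      bucket => "typed-logs"
      time_file => 60
      prefix => "server"
    }
  }
}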