Hi,
I'm having a weird issue. When Logstash pulls messages down from S3 and passes them over to ES, not all of the messages get their S3 metadata key value. Some messages get the literal string %{[@metadata][s3][key]}
as their file field, while others get the actual file name (as I understand it, Logstash leaves a %{...} reference as literal text when the field it points to doesn't exist, so it looks like [@metadata][s3] is missing entirely on those events). The events that don't get the file field resolved also don't get the fields I add in the input (logtype and environment), unlike the working events, which get everything set correctly.
I tried moving the add_field for file
up into the input, but that failed too; I'm guessing the metadata isn't attached to the message until the input is done reading it? Running Logstash in debug mode doesn't yield any more information. Every file it pulls in has a key. I'm at a loss. Anyone seen anything like this?
[2019-07-18T14:42:34,838][DEBUG][logstash.inputs.s3 ] S3 input: Found key {:key=>"elasticmapreduce/clusterid/containers/application_11111111111111_0219/container_e02_11111111111111_0219_03_000022/stderr.gz"}
[2019-07-18T14:42:34,839][DEBUG][logstash.inputs.s3 ] S3 input: Adding to objects[] {:key=>"elasticmapreduce/clusterid/containers/application_11111111111111_0219/container_e02_11111111111111_0219_03_000022/stderr.gz"}
[2019-07-18T14:42:34,839][DEBUG][logstash.inputs.s3 ] objects[] length is: {:length=>21347}
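To check whether the metadata is actually attached to each event (the debug output above only shows the keys the input finds, not the events themselves), my next step is to tap the pipeline with a temporary stdout output; the rubydebug codec can print @metadata as well:
output {
  stdout {
    # prints each event including @metadata, so a missing [@metadata][s3][key] shows up immediately
    codec => rubydebug { metadata => true }
  }
}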
Here is the entirety of my config:
input {
  s3 {
    bucket => "*******"
    access_key_id => "*******"
    secret_access_key => "*******"
    region => "us-west-2"
    include_object_properties => true
    prefix => "elasticmapreduce/cluster-id/containers/"
    temporary_directory => "/tmp/logstash/clusterid"
    codec => multiline {
      pattern => "^%{GREEDYDATA}"
      negate => false
      what => previous
    }
    add_field => {
      "logtype" => "s3.emr"
      "environment" => "clusterid"
    }
  }
}
filter {
  mutate {
    add_field => { "file" => "%{[@metadata][s3][key]}" }
  }
}
output {
  elasticsearch {
    hosts => "*******"
    ssl => true
  }
}
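In the meantime, a conditional guard around the mutate should at least keep the unresolved %{[@metadata][s3][key]} string out of ES and tag the affected events so I can count them. A minimal sketch of what I have in mind (s3_key_missing is just a tag name I made up):
filter {
  if [@metadata][s3][key] {
    # metadata resolved; copy the object key into a regular field
    mutate { add_field => { "file" => "%{[@metadata][s3][key]}" } }
  } else {
    # metadata missing; tag the event instead of indexing the literal sprintf string
    mutate { add_tag => [ "s3_key_missing" ] }
  }
}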