S3 logstash conf grok filter

Hello.
I want to collect S3 access logs from an S3 bucket and send them through Logstash to Elasticsearch. The pipeline itself works, but the filter in my Logstash conf does not do what I want. It currently gives me too much information when I only want specific parts of the message.

So on kibana I'm getting logs like:
message:

{"Records":[{"eventVersion":"1.05","userIdentity":{"type":"AssumedRole","principalId":"AROAJKLFDKRWTVGOAWDHWH:i-0276d215093829d49","arn":"arn:aws:sts::204324406053:assumed-role/EC2forSSM-Scaling/i-0276d215093829d49","accountId":"204324406053","accessKeyId":"ACCESSKEYIDFJDN3246","sessionContext":{"sessionIssuer":{"type":"Role","principalId":"AROAJKLFDKRWTVGOAWDHWH","arn":"arn:aws:iam::205915406053:role/EC2forSSM-Scaling","accountId":"204324406053","userName":"EC2forSSM-Scaling"},"webIdFederationData":{},"attributes":

message:

{"Records":[{"eventVersion":"1.05","userIdentity":{"type":"AWSService","invokedBy":"autoscaling.amazonaws.com"},"eventTime":"2020-10-26T16:28:51Z","eventSource":"sts.amazonaws.com","eventName":"AssumeRole","awsRegion":"us-east-1","sourceIPAddress":"autoscaling.amazonaws.com","userAgent":"autoscaling.amazonaws.com","requestParameters":{"roleArn":"arn:aws:iam::204324406053:role/aws-service-

How do I create a grok filter that lets me filter by bucket, event name, source IP and username, but excludes other personal info like the ARN, account ID, etc.?

I only want what is useful for S3 access log activity and do not need much additional information.

PS: the values I have in here are not real. I changed them for the example.

Records is an array. Are you going to use a split filter to split that into multiple events?

If you are, you may be able to use a prune filter with the whitelist_names option to specify which fields to keep.

If you are not, you would have to use ruby code to iterate over the array and (in effect) implement the prune yourself.
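
For the second case (no split), a ruby filter could do the pruning in place. This is only a minimal sketch and has not been run against your data. The set of fields kept here is an assumption based on what the question asks for, and the sketch assumes [Records] has already been parsed into an array of hashes.

```
filter {
    ruby {
        code => '
            records = event.get("Records")
            if records.is_a?(Array)
                # Rebuild each record with only the wanted values; everything else is dropped.
                slim = records.select { |r| r.is_a?(Hash) }.map do |r|
                    {
                        "eventName"       => r["eventName"],
                        "eventSource"     => r["eventSource"],
                        "sourceIPAddress" => r["sourceIPAddress"],
                        # dig returns nil when any level of the path is missing
                        "userName"        => r.dig("userIdentity", "sessionContext", "sessionIssuer", "userName")
                    }
                end
                event.set("Records", slim)
            end
        '
    }
}
```

Events without a [Records] array pass through unchanged, and a value that is missing from a record ends up as nil rather than raising an error.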

Can you give a basic structure on how to split the events and insert the prune filter in this filter?

    filter {
        split {
            field => "Records"
        }
        prune {
            # etc.
            whitelist_names => [ "principalId", "arn", "accountId", "accessKeyId" ]
        }
    }

That split looks fine. However, prune is not going to work because it only operates on top level fields.

If you have a small number of fields you want to retain then you could do something like

    split { field => "Records" }
    mutate {
        add_field => {
            "[Record][userIdentity][sessionContext][sessionIssuer][arn]" => "%{[Records][userIdentity][sessionContext][sessionIssuer][arn]}"
            "[Record][userIdentity][sessionContext][sessionIssuer][userName]" => "%{[Records][userIdentity][sessionContext][sessionIssuer][userName]}"
            "[Record][userIdentity][accessKeyId]" => "%{[Records][userIdentity][accessKeyId]}"
        }
        remove_field => [ "Records" ]
    }

Obviously you do not have to use the same structure on the left that you have on the right. You could also do

    mutate {
        add_field => {
            "[arn]" => "%{[Records][userIdentity][sessionContext][sessionIssuer][arn]}"
            "[userName]" => "%{[Records][userIdentity][sessionContext][sessionIssuer][userName]}"
            "[accessKeyId]" => "%{[Records][userIdentity][accessKeyId]}"
        }
        remove_field => [ "Records" ]
    }

[2020-10-28T13:27:59,738][WARN ][logstash.filters.split ][main] Only String and Array types are splittable. field:Records is of type = NilClass
[2020-10-28T13:27:59,741][WARN ][logstash.filters.split ][main] Only String and Array types are splittable. field:Records is of type = NilClass
[2020-10-28T13:27:59,756][WARN ][logstash.filters.split ][main] Only String and Array types are splittable. field:Records is of type = NilClass
[2020-10-28T13:27:59,775][WARN ][logstash.filters.split ][main] Only String and Array types are splittable. field:Records is of type = NilClass
[2020-10-28T13:28:11,538][WARN ][logstash.filters.split ][main] Only String and Array types are splittable. field:Records is of type = NilClass
[2020-10-28T13:28:11,760][WARN ][logstash.filters.split ][main] Only String and Array types are splittable. field:Records is of type = NilClass

I tried both filters above but get the warnings shown above.

That is telling you that there are events that do not have a [Records] field. You could wrap the split and mutate in

    if [Records] {
    ...
    }
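
Putting the pieces together, the whole filter might look roughly like the following. Treat it as a sketch under assumptions rather than a tested config. The json filter is only needed if the input is not already decoding the JSON (in the Kibana output earlier in the thread the document still appears as a string inside message, which would also explain a missing [Records] field). The [requestParameters][bucketName] path is an assumption for S3 events, since the samples posted here are cut off before that point.

```
filter {
    # Assumption: the raw JSON is still a string in [message].
    # Drop this block if the input codec already parses it.
    json {
        source => "message"
        remove_field => [ "message" ]
    }

    # Only split and copy fields when the event actually has a Records array.
    if [Records] {
        split { field => "Records" }
        mutate {
            add_field => {
                "[eventName]" => "%{[Records][eventName]}"
                "[sourceIPAddress]" => "%{[Records][sourceIPAddress]}"
                "[userName]" => "%{[Records][userIdentity][sessionContext][sessionIssuer][userName]}"
                "[bucketName]" => "%{[Records][requestParameters][bucketName]}"
            }
            remove_field => [ "Records" ]
        }
    }
}
```

One thing to watch for: when a referenced field does not exist on a record, the new field is set to the literal %{...} text instead of being left out. The second sample record above (userIdentity of type AWSService) has no sessionContext, so its userName would come out that way unless each add_field is guarded with its own conditional.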
