Dropping / Mutating first part of log event

Hi there

I'm attempting to parse the following log entry:

A, [2016-01-27T11:49:29.702997 #90] ANY -- : 2016-01-27 11:49:29 +0000 severity=INFO, {"method":"GET","path":"/404","format":"/","controller":"errors","action":"routing","status":200,"duration":4.45,"view":2.34,"db":0.0,"current_user":null,"current_user_id":null,"current_user_something":null,"request_ip":"127.0.0.1","@timestamp":"2016-01-27T11:49:29.702Z","@version":"1","message":"[200] GET /404 (errors#routing)"}

so that I only use the part that includes

{"method": ......... routing)"}

(i.e. drop on the floor " A, [2016-01-27T11:49:29.702997 #90] ANY -- : 2016-01-27 11:49:29 +0000 severity=INFO, ")

which I believe will then be parsed correctly by the standard json filter. Can anyone help me break this out using grok (and/or mutate/kv filtering) so I can add it to my ES indices?

Thanks very much in advance!

You're on the right track: use a grok filter to remove the boring parts and a json filter to parse the remainder. Try http://grokconstructor.appspot.com/ to construct your grok expression; its incremental construction feature should be helpful.

Try this:

filter {
    grok {
        # Capture everything from the first "{" to the last "}" (the JSON payload).
        match => ["message", "(?<fields_to_keep>{%{GREEDYDATA}})"]
    }

    json {
        # Parse the captured string as JSON into top-level event fields.
        source => "fields_to_keep"
    }

    mutate {
        # Drop the intermediate field once it has been parsed.
        remove_field => [ "fields_to_keep" ]
    }
}
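
Just to set expectations (a sketch, not exact output): if the JSON parses cleanly, the keys from the payload should show up as top-level fields on the event, roughly like this in rubydebug form (only a few of the keys from your sample line shown):

{
      "method" => "GET",
        "path" => "/404",
      "status" => 200,
    "duration" => 4.45,
    ...
}

and those are the fields that will end up in your ES indices.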

Thanks very much for the assistance thus far!

How do I specify which fields to keep in my

    match => ["message", "(?<fields_to_keep>{%{GREEDYDATA}})"]

line? Until my colleague makes a modification to the logged output I can get the entries that I'm after by performing the following:

awk '{ print $11 $12 $13,$14 }' production.log > clean-production.log

though obviously I want that to happen on an ongoing basis at the Logstash ingestion stage.

I've used the grok constructor but I'm not sure I understand it correctly; I don't want to have to name the parts I'm throwing away with specific patterns (e.g. I don't care about [2016-01-27T11:49:29.702997 #90] being a [TOMCATDATESTAMP] etc.), I just want to discard everything aside from the content between the {}, braces included...

Thanks again for the help!

How do I specify which fields to keep in my ... line?

That expression just extracts everything between "{" and "}" (including the braces), which should work as long as the first "{" occurrence is the start of the JSON string you're interested in.
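
To make that concrete with your sample line: the match starts at the "{" just before "method" and runs through the closing "}" at the end, so fields_to_keep holds the complete JSON object and nothing from the prefix, which is exactly what the json filter then parses.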

I've used the grok constructor but I'm not sure I understand it correctly; I don't want to have to name the parts I'm throwing away with specific patterns (e.g. I don't care about [2016-01-27T11:49:29.702997 #90] being a [TOMCATDATESTAMP] etc.), I just want to discard everything aside from the content between the {}, braces included...

That's what @mick66's expression does. But again, if there are other curly braces in the string you might be in for a surprise.
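
If stray braces ever do turn up earlier in the line, one option (just a sketch, and it assumes the "severity=INFO, " prefix always sits immediately before the JSON) is to anchor the capture on that prefix so anything before it is ignored:

filter {
    grok {
        # Sketch only: match the literal "severity=<level>, " prefix first,
        # then capture from the "{" that follows it through the last "}".
        match => ["message", "severity=%{WORD}, (?<fields_to_keep>{%{GREEDYDATA}})"]
    }
}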

Hi guys,

Thanks again for your help. @mick66 that's exactly what I was after, and @magnusbaeck thanks, I will make sure with my colleagues that there will be no unexpected braces anywhere!!