Recursively Grok lines and streams


(Eric) #1

I'm looking to use grok to parse through lines and streams of data. I'll explain how.

Let's say I have a line of data like this:

221.37.88.36.bc.googleusercontent.com,63.88.73.122,63.88.73.0,-,,-,Google Inc.,Mountain View,CA,US,Google Inc.,Mountain View,CA,US

We can see there is some notable information in there, such as IP addresses, hostnames, city, state, and country.

I'm trying to build a chain of grok filters that extracts data from this line incrementally: each grok filter removes what it parsed out and feeds the remainder into the next grok filter.

For example:

Let's take an input from a TCP port

input {
   tcp { port => "4382" }
}

And feed it through grok

filter {
    # GROK PARSER 01
    grok {
        match => { "message" => "%{HOSTNAME:Hostname}" }    # This should parse out all hostnames from the line
    }
    # GROK PARSER 02
    grok {
        match => { "message" => "%{IPV4:IP}" }    # This should parse out all IPv4 addresses from the line
    }
    # GROK PARSER 03
    grok {
        match => { "message" => "%{GREEDYDATA:data}" }    # This should capture the rest of the information
    }
}

After GROK PARSER 01, we'll have parsed out anything that is a hostname:

{
            "message" => [
        [0] "221.37.88.36.bc.googleusercontent.com,63.88.73.122,63.88.73.0,-,,-,Google Inc.,Mountain View,CA,US,Google Inc.,Mountain View,CA,US"
    ],
           "@version" => "1",
         "@timestamp" => "2015-07-15T19:33:46.261Z",
           "Hostname" => "221.37.88.36.bc.googleusercontent.com"
}

Then GROK PARSER 02 will parse out any IPv4 addresses:

{
            "message" => [
        [0] ",63.88.73.122,63.88.73.0,-,,-,Google Inc.,Mountain View,CA,US,Google Inc.,Mountain View,CA,US"
    ],
           "@version" => "1",
         "@timestamp" => "2015-07-15T19:33:46.261Z",
                 "IP" => [
        [0] "63.88.73.122",
        [1] "63.88.73.0"
    ]
}

And lastly, GROK PARSER 03 will hold what's left:

{
            "message" => [
        [0] ",,,-,,-,Google Inc.,Mountain View,CA,US,Google Inc.,Mountain View,CA,US"
    ],
           "@version" => "1",
         "@timestamp" => "2015-07-15T19:33:46.261Z",
               "data" => ",,,-,,-,Google Inc.,Mountain View,CA,US,Google Inc.,Mountain View,CA,US"
}
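
To pin down the behavior I'm after, here is a rough sketch in plain Python rather than Logstash. It only models the extract-then-remove steps described above. The two regular expressions are simplified stand-ins for grok's HOSTNAME and IPV4 patterns, and `extract_and_strip` is a made-up helper name, not something Logstash provides.

```python
import re

# Simplified stand-ins for grok's HOSTNAME and IPV4 patterns
# (assumptions for illustration, not the real grok definitions).
HOSTNAME = re.compile(r"\b(?:[A-Za-z0-9-]+\.)+[A-Za-z]{2,}\b")
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def extract_and_strip(pattern, text):
    """Return (list of all matches, text with those matches removed)."""
    return pattern.findall(text), pattern.sub("", text)


line = ("221.37.88.36.bc.googleusercontent.com,63.88.73.122,63.88.73.0,-,,-,"
        "Google Inc.,Mountain View,CA,US,Google Inc.,Mountain View,CA,US")

event = {"message": line}

# Stage 1: pull out hostnames, keep the remainder in "message".
event["Hostname"], event["message"] = extract_and_strip(HOSTNAME, event["message"])

# Stage 2: pull out IPv4 addresses from what is left.
event["IP"], event["message"] = extract_and_strip(IPV4, event["message"])

# Stage 3: whatever remains becomes "data".
event["data"] = event["message"]

print(event)
```

Order matters in this sketch: running the IPv4 step first would also pick up 221.37.88.36 from inside the hostname. Each extracted field is a list here, even when there is only one match.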

How can we make this happen?

