Match variants of logs


(Luke Whitworth) #1

Hey all,

I'm ingesting some logs where every line starts the same way (a timestamp) and then varies, with multiple fields appearing in different orders, e.g.

File 1:

Tue Jan 5 13:01:21 2016 Packet-Type = Access-Request NAS-Port-Id = "AA111/2" Calling-Station-Id = "FF-FF-FF-FF-FF-FF" Called-Station-Id = "AA-AA-AA-AA-AA-AA:wireless" Service-Type = Framed-User User-Name = "bob@bob.com"

Tue Jan 5 13:02:21 2016 Packet-Type = Access-Request NAS-Port-Id = "AA111/2" User-Name = "bob@bob.com" Calling-Station-Id = "FF-FF-FF-FF-FF-FF" Service-Type = Framed-User Called-Station-Id = "AA-AA-AA-AA-AA-AA:wireless"

Tue Jan 5 13:03:21 2016 Packet-Type = Access-Request NAS-Port-Id = "AA111/2" Calling-Station-Id = "FF-FF-FF-FF-FF-FF" Service-Type = Framed-User User-Name = "bob@bob.com"

Ideally, I'd like a match rule that lists every possible field as optional. That would match all lines IF the fields were always in the same order, but they are not, and that is where I'm stuck. Does anyone know a way to achieve this without resorting to a match that looks like (field_a|field_b|field_c) (field_a|field_b|field_c)? (field_a|field_b|field_c)? (field_a|field_b|field_c)?

Cheers


(Magnus Bäck) #2

The kv filter would be ideal for this, except that it doesn't support multi-character separators between key and value (and in your case you have a space on each side of the equal sign). But, perhaps you could use the mutate filter's gsub option to replace all occurrences of " = " with plain "=" and feed that to the kv filter?
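
A minimal sketch of that suggestion, assuming the sample lines from the first post arrive in the message field. It is untested and relies on the kv filter's default settings (fields separated by spaces, keys and values separated by "="):

```
filter {
  # Collapse " = " to "=" so kv sees a single-character key/value separator
  mutate {
    gsub => [ "message", " = ", "=" ]
  }
  # Defaults assumed: split fields on spaces, split key from value on "="
  kv {
    source => "message"
  }
}
```

Because kv only picks up tokens of the form key=value, the order of the fields should no longer matter. The leading timestamp contains no "=", so it still needs its own grok or date handling.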


(Luke Whitworth) #3

Cheers, I'll have a look into it and see how far I get :)


(Luke Whitworth) #4

Absolutely spot on, cheers. Got it all working with:

filter {
  # This removes empty lines from the logs
  if [type] == "radiusauth" and [message] =~ /^\s*$/ {
    drop {
    }
  }
  # Now we match using white space at the start of the line to signify it belongs to the line before, i.e. indented lines are continuations of the line before
  if [type] == "radiusauth" {
    multiline {
      pattern => "^\s"
      what => "previous"
    }
    # Sanitise the lines
    mutate {
      gsub => [
        "message", " = ", "=",
        "message", "\"", ""
      ]
    }
    grok {
      match => { "message" => "(?m)%{DAY:radiusauth_day} %{MONTH:radiusauth_month}%{SPACE}%{MONTHDAY:radiusauth_monthday} %{TIME:radiusauth_time} %{YEAR:radiusauth_year}" }
      add_field => [ "received_at", "%{@timestamp}" ]
      add_field => [ "received_from", "%{host}" ]
      add_field => [ "radiusauth_timestamp", "%{radiusauth_month} %{radiusauth_monthday} %{radiusauth_time}"]
      add_tag => [ "radiusauth_auth"]
    }
    kv {
      source => "message"
      field_split => "\s"
    }
    date {
      match => [ "radiusauth_timestamp", "MMM  d HH:mm:ss", "MMM dd HH:mm:ss" ]
    }
  }
}

Very much appreciated Sir


(Luke Whitworth) #5

After a little bit of testing, the following works better, as using \s splits up lines that contain legitimate double spaces (e.g. some date records):

kv {
  source => "message"
  field_split => "\r\n"
}
