Ideally, I'd like a single match rule that lists every possible field as optional. That would match every line if the fields always appeared in the same order, but they don't, and that is where I'm stuck. Does anyone know a way to achieve this without resorting to a match that looks like (field_a|field_b|field_c) (field_a|field_b|field_c)? (field_a|field_b|field_c)? (field_a|field_b|field_c)?.
The kv filter would be ideal for this, except that it doesn't support multi-character separators between key and value (and in your case you have a space on each side of the equals sign). Perhaps you could use the mutate filter's gsub option to replace all occurrences of " = " with a plain "=" and feed the result to the kv filter?
Absolutely spot on, cheers. I got it all working with:
filter {
  # Drop empty lines from the logs
  if [type] == "radiusauth" and [message] =~ /^\s*$/ {
    drop {
    }
  }
  # Lines that start with white space are continuations of the line before, so join each indented line to the previous one
  if [type] == "radiusauth" {
    multiline {
      pattern => "^\s"
      what => "previous"
    }
    # Sanitize the lines: turn " = " into "=" and strip double quotes
    mutate {
      gsub => [
        "message", " = ", "=",
        "message", "\"", ""
      ]
    }
    # Pull the timestamp parts out of the first line of the event
    grok {
      match => { "message" => "(?m)%{DAY:radiusauth_day} %{MONTH:radiusauth_month}%{SPACE}%{MONTHDAY:radiusauth_monthday} %{TIME:radiusauth_time} %{YEAR:radiusauth_year}" }
      add_field => [ "received_at", "%{@timestamp}" ]
      add_field => [ "received_from", "%{host}" ]
      add_field => [ "radiusauth_timestamp", "%{radiusauth_month} %{radiusauth_monthday} %{radiusauth_time}"]
      add_tag => [ "radiusauth_auth"]
    }
    # Split the remaining key=value pairs into fields, whatever order they arrive in
    kv {
      source => "message"
      field_split => "\s"
    }
    # Use the timestamp from the log line as the event timestamp
    date {
      match => [ "radiusauth_timestamp", "MMM d HH:mm:ss", "MMM dd HH:mm:ss" ]
    }
  }
}
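For anyone wondering why this makes the field order irrelevant, here is a rough Python sketch of what the gsub and kv steps do to one joined event. It is not Logstash itself: the helper names sanitize and parse_kv are hypothetical, the sample event is made up for illustration, and the real kv filter has more options than this stand-in.

```python
import re

# Hypothetical multiline event after the multiline step has joined the indented lines.
# The attribute names and values are invented for illustration only.
message = (
    'Mon Jan  6 10:15:42 2014\n'
    '\tUser-Name = "alice"\n'
    '\tNAS-IP-Address = 10.0.0.1\n'
    '\tAcct-Status-Type = Start'
)


def sanitize(text):
    """Mimic the mutate/gsub step: turn ' = ' into '=' and drop double quotes."""
    return text.replace(" = ", "=").replace('"', "")


def parse_kv(text):
    """Rough stand-in for the kv step: split on whitespace, keep tokens shaped like key=value."""
    fields = {}
    for token in re.split(r"\s+", text.strip()):
        key, sep, value = token.partition("=")
        if sep and key and value:
            fields[key] = value
    return fields


fields = parse_kv(sanitize(message))
print(fields)
```

Because each key=value token is picked up on its own, the same fields come out no matter what order the attributes appear in, and the timestamp tokens are simply skipped since they contain no "=". One caveat of this sketch: with the quotes removed, a value that contains a space is cut off at the first space.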