Logstash using grok filter to modify values


(Michael Green) #1

I haven't used the grok filter before and am having a little trouble getting the results I need. Ultimately I'm pulling data into Elasticsearch from a database query using Logstash and the JDBC input plugin. That part is mostly irrelevant; what I'm trying to accomplish here is to 'normalise' a version field for more meaningful reporting in Kibana.

For example, a product version could be any of the values on the left, and I want to normalise it to just the major and minor version, as shown on the right, so I can report at that level. The result should go in a different field (such as version_normalised).

Examples:
version -> version_normalised
"10" -> "10.0"
"10.2" -> "10.2"
"10.2(1)" -> "10.2"
"10.2(1)U1" -> "10.2"
"10.2.1.10032" -> "10.2"

So as you can see, versions can be expressed in different ways in the incoming data, which splits up the reporting, so we want to 'normalise' them. As best I can tell, this may be best accomplished in grok. I have been testing with a very simple config to break the problem up a bit, but I'm stuck even just matching multiple numbers.

For example, the following seems to work okay: when I test with a single number ("9" or "10", for example) it sets MAJOR as expected:

input {
        stdin { }
}

filter {
	grok {
		match => { "message" => "(?<MAJOR>^\D+)" }
	}
}

output {
        stdout {
                codec => rubydebug
        }
}

But if I try multiple patterns to match "11.22", for example, I just get a parse failure no matter what the input:

input {
	stdin { }
}

filter {
	grok {
		match => { "message" => [
			"(?<MAJOR>^\D+)",
			"(?<MAJOR>^\D+)\.(?<MINOR>\D+)" ] }
	}
}

output {
	stdout {
		codec => rubydebug
	}
}

I have tried multiple variations as well, but I didn't want to complicate the initial post too much.

Apologies, I'm new to this filter, so I may have some follow-up questions, but I can't seem to get past basic matching at this stage. I'd appreciate any guidance, thank you.


(Michael Green) #2

So I had some success with the following grok configuration, which uses \d (a digit) where my earlier attempts used \D (a non-digit):

filter {
        grok {
                match => { "message" => [
                        "(?<version_normalised>^\d+$)",
                        "(?<version_normalised>^\d+\.\d+)" ] }
        }
}

For example, testing with the following values I get the following results for the version_normalised field:

message		version_normalised
9			9
10			10
12 34		parsefailed
12.34		12.34
12.34(1)	12.34
12.34(1)U1	12.34

These are all the results I expected, so it's good news that the matching is working.

The outstanding question is: in the cases where there is just a single number, is there a way to append ".0" so that the following inputs all end up equal?

input -> output
10 -> 10.0
10.0 -> 10.0

Thanks for any guidance in advance.


(Michael Green) #3

I seem to have resolved this now, but as always I'll post the solution here on the off chance it helps someone else who comes along...

input {
        stdin { }
}

filter {
        grok {
                match => { "message" => [
                        "(?<version_normalised>^\d+$)",
                        "(?<version_normalised>^\d+\.\d+)" ]
                }
        }
        if [version_normalised] =~ /^\d+$/ {
                mutate {
                        update => { "version_normalised" => "%{version_normalised}.0" }
                }
        }
}

output {
        stdout {
                codec => rubydebug
        }
}
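
For the real pipeline, the same filter would presumably read from the database column rather than from "message". A minimal sketch, assuming the JDBC input delivers the column as a field named version (as in the examples at the top); the tag name _version_parse_failure is made up for illustration and I haven't run this against the actual data:

```
filter {
        grok {
                # Assumption: the version string arrives in a field called "version".
                # Patterns are tried in order and grok stops at the first match.
                match => { "version" => [
                        "(?<version_normalised>^\d+$)",
                        "(?<version_normalised>^\d+\.\d+)" ]
                }
                # Hypothetical tag name; replaces the default _grokparsefailure tag.
                tag_on_failure => [ "_version_parse_failure" ]
        }
        # Major-only versions such as "10" become "10.0".
        if [version_normalised] =~ /^\d+$/ {
                mutate {
                        update => { "version_normalised" => "%{version_normalised}.0" }
                }
        }
}
```

With this, events whose version matches neither pattern should get no version_normalised field and carry the custom tag instead, which makes them easy to filter for in Kibana.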
