Kv split everything after ";"

#1

So I am trying to split the following data format examples:

'process_count'=259;300;400;
'cpu'=3.8%;;;
'available'=27.29GiB;50;57;

I essentially want each key/value pair to be the name and the first value, e.g. "process_count" => "259", "available" => "27.29"

My code is

	kv {    	  
	  source => "SERVICEPERFDATA"
	  trim_key => "'"
	  remove_char_value => "GiB,%,;"
	  trim_value => ";"
	}

The trim_value doesn't seem to work and doesn't remove any of the ";" characters, but I also want to delete everything after the first ";". If I add ";" to remove_char_value, it does remove it, but keeps the values after/in between it.

Can anyone suggest how I can achieve this?
Thanks
Kyle

#2

How about

grok { match => { "message" => "^'(?<key>[^']+)'=(?<value>[0-9\.]+).*" } }

I often see folks trying to use grok in use-cases where I think another filter is a better fit. This is a case where I think grok is a great fit :smiley:
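To see what that pattern captures, here is a minimal sketch in plain Ruby, outside Logstash. It assumes Ruby's regex engine treats this particular pattern the same way grok does, and the variable names are only for illustration. The sample lines are the three from the question.

```ruby
# The pattern from the grok filter above, written as a Ruby regex.
perf_pattern = /^'(?<key>[^']+)'=(?<value>[0-9\.]+).*/

# Sample lines copied from the question.
samples = [
  "'process_count'=259;300;400;",
  "'cpu'=3.8%;;;",
  "'available'=27.29GiB;50;57;"
]

# Collect the named captures for each line; nil if a line does not match.
captures = samples.map do |line|
  m = perf_pattern.match(line)
  m && [m[:key], m[:value]]
end

captures.each { |pair| p pair }
# ["process_count", "259"]
# ["cpu", "3.8"]
# ["available", "27.29"]
```

The pattern is anchored with ^, so it only ever picks up the first key and value in a string.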

#3

I might be using this wrong, as I don't know grok at all. Should I be putting this into a loop? It only retrieves the first value from the field I give it. It also outputs the data into fields named "key" and "value", which I can't replace with predefined names; I need these to be derived from the data.

grok { match => { "SERVICEPERFDATA" => "^'(?<key>[^']+)'=(?<value>[0-9\.]+).*" } }

"SERVICEPERFDATA" => "'available'=5.51GiB;6;6; 'total'=7.00GiB;6;6; 'free'=5.51GiB;6;6; 'used'=1.49GiB;6;6;"
SERVICEPERFDATA:'process_count'=259;300;400; 'cpu'=3.8%;;; 'memory'=18.1%;;; 'memory_vms'=36.44GB;;; 'memory_rss'=1.21GB;;;

Some examples of the data. I just want the value between the ' ' as my keyname, and the first value after.

I've been trying to use a regex for ";*" in the kv filter, but it doesn't seem to work?

	kv {
	  source => "SERVICEPERFDATA"
	  trim_key => "'"
	  field_split => " "
	  remove_char_value => "GiB,%,;*"
	  #trim_value => "\;*"
	}
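
A possible explanation, as far as I understand the kv filter documentation: remove_char_value is treated as a set of characters (a regex character class) rather than as a pattern, and trim_value only strips characters from the start and end of a value. The sketch below imitates both in plain Ruby; it is an approximation under that assumption, not the plugin's own code.

```ruby
value = "5.51GiB;6;6;"

# Approximation of remove_char_value => "GiB,%,;*": every listed character
# is deleted wherever it occurs, so the later numbers end up glued on.
chars_removed = value.gsub(/[GiB,%;*]/, "")

# Approximation of trim_value => ";": only leading and trailing ";" go.
ends_trimmed = value.gsub(/\A;+|;+\z/, "")

# What the question asks for: the text before the first ";", keeping only
# the leading number so the unit suffix is dropped as well.
first_number = value.split(";").first[/\A[0-9.]+/]

puts chars_removed  # 5.5166
puts ends_trimmed   # 5.51GiB;6;6
puts first_number   # 5.51
```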
#4

Well, the sample data you included in your question only has a single key and value on each line. For that example of SERVICEPERFDATA the following would work

kv { source => "SERVICEPERFDATA" trim_key => "'" target => "[@metadata][spd]" }
ruby {
    code => '
        event.get("[@metadata][spd]").each { |k, v|
            m = /^[0-9\.]+/.match(v)
            event.set(k, m[0])
        }
    '
}

Error handling is left as an exercise for the reader.
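
One way that exercise might go, sketched in plain Ruby: a hash stands in for what kv writes to [@metadata][spd], and the "status" entry is a made-up value with no leading number. The guards cover a missing hash and a value that does not start with a number, which would otherwise make m[0] fail on nil.

```ruby
# Stand-in for the hash that kv writes to [@metadata][spd]; the "status"
# entry is hypothetical and has no leading number.
spd = {
  "available" => "5.51GiB;6;6;",
  "total"     => "7.00GiB;6;6;",
  "status"    => "n/a;;;"
}

# Keep the leading number of each value and skip values that have none.
leading_numbers = lambda do |pairs|
  (pairs || {}).each_with_object({}) do |(k, v), out|
    m = /^[0-9\.]+/.match(v.to_s)
    out[k] = m[0] if m
  end
end

result = leading_numbers.call(spd)
p result  # {"available"=>"5.51", "total"=>"7.00"}
```

Inside the ruby filter the same two guards would sit around event.get and event.set.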

#5

This worked perfect, thanks Badger!

#6

I'm trying to enhance this to put the values into an array, named from another field. It works for the first element, but doesn't do the rest. Am I doing something stupid on a Friday afternoon?

	ruby {
		code => '
			event.get("[@metadata][spd]").each { |k, v|
			m = /^[0-9\.]+/.match(v)
			event.set((event.get("SERVICEDESC")),[Hash[k, m[0]]])				
			}
		'
	}

#7

That would overwrite that field once for each item in spd. This

input { generator { count => 1 lines => [ '' ] } }
filter {
    mutate {
        add_field => {
            "SERVICEDESC" => "joe"
            "SERVICEPERFDATA" => "'available'=5.51GiB;6;6; 'total'=7.00GiB;6;6; 'free'=5.51GiB;6;6; 'used'=1.49GiB;6;6;"
        }
    }
    kv { source => "SERVICEPERFDATA" trim_key => "'" target => "[@metadata][spd]" }
    ruby {
        code => '
            a = []
            event.get("[@metadata][spd]").each { |k, v|
                m = /^[0-9\.]+/.match(v)
                a << Hash[k, m[0]]
            }
            event.set(event.get("SERVICEDESC"), a)
        '
    }
}

will get you

            "joe" => [
    [0] {
        "available" => "5.51"
    },
    [1] {
        "free" => "5.51"
    },
    [2] {
        "total" => "7.00"
    },
    [3] {
        "used" => "1.49"
    }
],

If that's not quite what you want (and an array of hashes does seem an unlikely requirement) perhaps it will help you get there.
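
The overwrite versus accumulate difference can be seen in isolation with a small sketch in plain Ruby, where ordinary hashes stand in for the Logstash event and for [@metadata][spd]. In this stand-in the entry that survives the overwrite is the last one iterated; which one that is in Logstash depends on the order the kv hash is walked in.

```ruby
# Plain hashes stand in for the Logstash event and for [@metadata][spd].
spd = { "available" => "5.51GiB;6;6;", "total" => "7.00GiB;6;6;" }

# Setting the field inside the loop: each pass replaces the previous value.
overwritten = {}
spd.each do |k, v|
  m = /^[0-9\.]+/.match(v)
  overwritten["joe"] = [{ k => m[0] }]
end

# Building the array first and setting the field once keeps every entry.
accumulated = {}
a = []
spd.each do |k, v|
  m = /^[0-9\.]+/.match(v)
  a << { k => m[0] }
end
accumulated["joe"] = a

p overwritten  # {"joe"=>[{"total"=>"7.00"}]}
p accumulated  # {"joe"=>[{"available"=>"5.51"}, {"total"=>"7.00"}]}
```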

#8

This does work for me, but I was doing a conversion afterwards to change the values to floats. Trying this with the hash, and subsequent googling, told me this isn't possible. Maybe if I explain the situation, that might help.

I'm parsing Nagios flat-file, tab-delimited log files for performance metrics. The problem is that some of the metric names (such as 'available', 'free') are used across several different metric types (SERVICEDESC). So in Kibana, I was having issues using these for visualizations etc.

My intended solution was to nest these in an array of the metric type (SERVICEDESC). So they could then be referenced by metric_type.metric_name.

Below are examples of the data:

SERVICEDESC:Memory Usage
SERVICEPERFDATA:'available'=27.29GiB;50;57; 'total'=62.89GiB;50;57; 'free'=1.09GiB;50;57; 'used'=34.46GiB;50;57;

SERVICEDESC:Swap Usage
SERVICEPERFDATA:'total'=16.62GiB;13;15; 'used'=3.07GiB;13;15; 'free'=13.55GiB;13;15;

SERVICEDESC:Disk Usage
SERVICEPERFDATA:'used'=3.58GiB;24;27; 'free'=25.91GiB;24;27; 'total'=29.49GiB;24;27;

#9

I think it is more likely you will want to use

filter {
    mutate {
        add_field => {
            "SERVICEDESC" => "joe"
            "SERVICEPERFDATA" => "'available'=5.51GiB;6;6; 'total'=7.00GiB;6;6; 'free'=5.51GiB;6;6; 'used'=1.49GiB;6;6;"
        }
    }
    kv { source => "SERVICEPERFDATA" trim_key => "'" target => "[@metadata][spd]" }
    ruby {
        code => '
            h = {}
            event.get("[@metadata][spd]").each { |k, v|
                m = /^[0-9\.]+/.match(v)
                h[k] = m[0].to_f
            }
            event.set(event.get("SERVICEDESC"), h)
        '
    }
}

Then you would be able to refer to [Memory Usage][available] or [Memory Usage][total] rather than [Memory Usage][0][available] and [Memory Usage][1][total].
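
As a rough illustration of why the nested hash helps with the name clashes, here is a sketch in plain Ruby using two of the sample events from the thread. The parse_perfdata lambda is a hypothetical stand-in for the kv filter with trim_key => "'" (split on spaces, then on "="), not the plugin's implementation.

```ruby
# Hypothetical stand-in for kv with trim_key => "'": split on spaces, then
# on "=", keep the leading number as a float. Pairs without one are skipped.
parse_perfdata = lambda do |text|
  text.split(" ").each_with_object({}) do |pair, out|
    key, value = pair.split("=", 2)
    m = /^[0-9\.]+/.match(value.to_s)
    out[key.delete("'")] = m[0].to_f if m
  end
end

# Sample events copied from the thread, keyed by SERVICEDESC.
events = {
  "Memory Usage" => "'available'=27.29GiB;50;57; 'total'=62.89GiB;50;57; 'free'=1.09GiB;50;57; 'used'=34.46GiB;50;57;",
  "Swap Usage"   => "'total'=16.62GiB;13;15; 'used'=3.07GiB;13;15; 'free'=13.55GiB;13;15;"
}

# One hash per SERVICEDESC, so the same metric name can live under each.
nested = events.transform_values { |text| parse_perfdata.call(text) }

p nested["Memory Usage"]["free"]  # 1.09
p nested["Swap Usage"]["free"]    # 13.55
```

Because the values are already floats when the hash is built, no separate conversion step is needed afterwards.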

#10

That seems to work perfectly! Thanks badger