How to write values from a dynamic kv filter into a CSV file

Hi, I am new to Logstash/ELK and am trying to parse a complex epm log format. Each record has a fixed part with a fixed number of fields and a dynamic part with a variable number of field=value pairs. A value can consist of multiple words separated by spaces.

I am able to split the fixed part into named fields and capture the rest in a GREEDYDATA field. Now I need to write the fixed part and the variable number of fields into a CSV file. I am struggling to find a way to extract the key/value pairs and write them into the CSV file. I don't want to name each field from the kv pairs explicitly.

I am trying to use Ruby code (I am not good at Ruby either) to split "kvpairs_raw" into k, v fields and build a new hash called "dmsg" in which each key name is prefixed with "msg_" (that is, "msg_" + k).
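For illustration, the split described above could be sketched in plain Ruby, outside Logstash. This is only a rough sketch under assumptions: the helper name parse_kv_pairs is hypothetical, and it assumes every key is a run of characters without spaces or "=" that is immediately followed by "=". A value that itself contains something shaped like " word=" would be split incorrectly.

```ruby
# Hypothetical helper (not part of Logstash): split a string such as
# "k1=v1 k2=multi word value k3=v3" into a hash, keeping spaces inside
# values and prefixing every key with "msg_".
def parse_kv_pairs(raw, prefix = 'msg_')
  pairs = {}
  # Key: one or more characters that are neither whitespace nor '='.
  # Value: matched lazily up to the next " key=" or the end of the string.
  raw.to_s.scan(/([^\s=]+)=(.*?)(?=\s+[^\s=]+=|\z)/m) do |key, value|
    pairs[prefix + key] = value.strip
  end
  pairs
end

# Shortened sample taken from the dynamic part of the log record.
sample = 'app=HTTP catdt=Web Cache cat=Access Log cnt=25'
parse_kv_pairs(sample).each { |k, v| puts "#{k} -> #{v}" }
```

Inside a Logstash ruby filter the same idea would read the string from the event and store the resulting hash back on it; the exact event accessors depend on the Logstash version, so that part is left out here.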

Any help is greatly appreciated.

My log record:
Jan 4 04:55:01 20.1.1.56 CEF: 0|McAfee|Web Gateway|v1|200|CONNECT|Low| eventId=3716502633 type=1 start=1483531182000 app=HTTP categorySignificance=/Normal categoryBehavior=/Communicate/Query categoryDeviceGroup=/Application catdt=Web Cache categoryOutcome=/Success categoryObject=/Host/Application/Service art=148331320693 cat=Access Log deviceSeverity=200 rt=1483531187000 shost=C-LVILLAGA1.corp.epm.com.co src=10.4.68.80 sourceZoneURI=/All Zones/ArcSight System/Private Address Space Zones/RFC1918: 10.0.0.0-10.255.255.255 suser=LVILLAGA request:443=encrypted-tbn2.gstatic.com requestMethod=CONNECT requestClientApplication=Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko cnt=25 cs3=1.1 cs5=Allowed cs6=Content Server cn1=0 cs1Label=Virus Name cs2Label=Elapsed Time cs3Label=HTTP Version cs5Label=Block Reason cs6Label=Categories cn1Label=Block Reason ID c6a4Label=Agent IPv6 Address ahost=EPM-AIT75.corp.epm.com.co agt=10.1.1.125 agentZoneURI=/All Zones/ArcSight System/Private Address Space Zones/RFC1918: 10.0.0.0-10.255.255.255 av=7.1.7.7600.0 atz=America/Bogota aid=31UJ0aVIBABCAB0osfW8fqg== at=mcafee_webgateway_file dtz=America/Bogota requestProtocol=encrypted-tbn.gstatic.com _cefVer=0.1

My config:

filter {
  grok {
    # The literal "|" separators of the CEF header are escaped; an unescaped "|" is regex alternation.
    match => ["message", "\A%{WORD:month} %{NUMBER:day} %{TIME:time} %{IP:ipaddr} %{WORD:cef}: %{NUMBER:ver}\|%{WORD:dvendor}\|%{DATA:dproduct}\|(%{WORD:dversion})?\|%{WORD:deventclassid}\|%{WORD:dname}\|%{WORD:dseverity}\| %{GREEDYDATA:kvpairs_raw}"]
  }

  kv {
    #trimkey => "\s"
    field_split => " "
    value_split => "="
    source => "kvpairs_raw"
    #target => "kvpairs"
    #remove_field => "kvpairs_raw"
  }

  ruby {
    code => "
      # Attempt: split kvpairs_raw into key and value and store the value under 'msg_' + key in dmsg
      k, v = event['kvpairs_raw'].split('=')
      dmsg['msg_' + k] = v
    "
  }
}

output {
  stdout { codec => "rubydebug" }

  csv {
    fields => ["month", "day", "time", "ipaddr", "cef", "ver", "dvendor", "dproduct", "dversion", "deventclassid", "dname", "dseverity", "dmsg"]
    path => "/home/ubuntu/giri/data/epmlog1.csv"
  }
}
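
To make the desired output concrete, here is a small plain-Ruby sketch of one CSV line that holds the fixed fields followed by the dynamic pairs, without naming any dynamic field in advance. It uses Ruby's standard csv library rather than the Logstash csv output, the helper name to_csv_line is hypothetical, and writing each dynamic pair as a single key=value cell is just one possible layout, assumed here for illustration.

```ruby
require 'csv'

# Hypothetical helper: build one CSV line from the fixed values plus the
# dynamic pairs. Assumption: each dynamic pair becomes one "key=value"
# cell appended after the fixed columns, so rows may differ in length.
def to_csv_line(fixed_values, dynamic_pairs)
  dynamic_cells = dynamic_pairs.map { |key, value| "#{key}=#{value}" }
  CSV.generate_line(fixed_values + dynamic_cells)
end

# Fixed fields from the sample record, in the order used by the csv output.
fixed = ['Jan', '4', '04:55:01', '20.1.1.56', 'CEF', '0',
         'McAfee', 'Web Gateway', 'v1', '200', 'CONNECT', 'Low']
# A few dynamic pairs with the "msg_" prefix already applied.
dynamic = { 'msg_app' => 'HTTP', 'msg_catdt' => 'Web Cache' }

puts to_csv_line(fixed, dynamic)
```

CSV.generate_line quotes a cell only when it contains a comma, a quote, or a line break, so multi-word values with spaces are written as they are.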
