Mutate and Gsub advice

Hi All,

I have event logs streaming from my SIEM to ES. The format of these logs is LEEF, with "|" and "=" signs delimiting the fields. What I'd like to do is use a filter to find and replace the "|" and "=" signs with commas, essentially converting the event stream into a CSV format that the rest of the config file can deal with.

Will the following work, and can I wildcard the field? Is there a better way within Logstash to do this? (Keeping in mind that I can't change the format of the event stream at the SIEM.)

filter {
    mutate {
        gsub => [
            # replace all pipes or equals signs with commas
            "*", "[|=]", ","    # can I wildcard the field like this?
        ]
    }
}

+++++++++++++++
Sample event data
+++++++++++++++

Event as it comes into Logstash:

<14>Apr 3 11:11:08 USHERC001-1-PA-M100-1 LEEF:1.0|Palo Alto Networks|PAN-OS Syslog Integration|8.0.7|allow|cat=TRAFFIC|DeviceName=HKHUB1001-1-PA5050-1|ReceiveTime=2018/04/03 11:11:08|SerialNumber=007801004070|Type=TRAFFIC|subtype=end|devTime=Apr 03 2018 18:11:08 GMT|src=217.237.150.145|dst=218.213.81.158|srcPostNAT=0.0.0.0|dstPostNAT=0.0.0.0|RuleName=INTERNET-ADVDNS|usrName=|SourceUser=|DestinationUser=|Application=dns|SourceZone=INTERNET|DestinationZone=DMZ-VLAN55|IngressInterface=ethernet1/3|EgressInterface=vlan.55|LogForwardingProfile=Forward_to_Panorama|srcPort=52890|dstPort=53|srcPostNATPort=0|dstPostNATPort=0|proto=udp|action=allow|totalBytes=263|dstBytes=162|srcBytes=101|totalPackets=2|StartTime=2018/04/03 11:10:33|ElapsedTime=30|URLCategory=any|SourceLocation=Germany|DestinationLocation=Hong Kong|dstPackets=1|srcPackets=1|SessionEndReason=aged-out

Event after Logstash processes it and passes it to ES:

<14>Apr 3 11:11:08 USHERC001-1-PA-M100-1 LEEF:1.0,Palo Alto Networks,PAN-OS Syslog Integration,8.0.7,allow,cat,TRAFFIC,DeviceName,HKHUB1001-1-PA5050-1,ReceiveTime,2018/04/03 11:11:08,SerialNumber,007801004070,Type,TRAFFIC,subtype,end,devTime,Apr 03 2018 18:11:08 GMT,src,217.237.150.145,dst,218.213.81.158,srcPostNAT,0.0.0.0,dstPostNAT,0.0.0.0,RuleName,INTERNET-ADVDNS,usrName,,SourceUser,,DestinationUser,,Application,dns,SourceZone,INTERNET,DestinationZone,DMZ-VLAN55,IngressInterface,ethernet1/3,EgressInterface,vlan.55,LogForwardingProfile,Forward_to_Panorama,srcPort,52890,dstPort,53,srcPostNATPort,0,dstPostNATPort,0,proto,udp,action,allow,totalBytes,263,dstBytes,162,srcBytes,101,totalPackets,2,StartTime,2018/04/03 11:10:33,ElapsedTime,30,URLCategory,any,SourceLocation,Germany,DestinationLocation,Hong Kong,dstPackets,1,srcPackets,1,SessionEndReason,aged-out

Regards
TimW

This does not look like a CSV file at all (and in fact CEF/LEEF are not constructed that way). It's actually a fixed header plus a kv body, with | as the field delimiter and = as the value delimiter.

A combination of grok (or dissect) and kv should suffice for the above, for example:

filter {
    grok {
        match => {"message" => "<%{INT}>%{MONTH} %{MONTHDAY} %{TIME} %{NOTSPACE:serial} LEEF:%{NOTSPACE}\|%{DATA:vendor}\|%{DATA:product}\|%{DATA:version}\|%{WORD:action}\|%{GREEDYDATA:info}"}
    }

    kv {
        source => "info"
        field_split => "|"
    }
}
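Since the header layout is fixed, dissect is a lighter-weight alternative to grok for the first step. A rough sketch (the field names like pri, ts, and info are illustrative, and the %{ts->} padding modifier is there because syslog pads single-digit days with an extra space):

filter {
    dissect {
        # split the fixed LEEF header; everything after the action lands in "info"
        mapping => {
            "message" => "<%{pri}>%{ts->} %{+ts} %{+ts} %{host} LEEF:%{leef_version}|%{vendor}|%{product}|%{version}|%{action}|%{info}"
        }
    }

    kv {
        # split the remaining key=value pairs on the pipe character
        source => "info"
        field_split => "|"
    }
}

Dissect does positional splitting rather than regex matching, so it tends to be faster and fails more predictably when the header format is constant, as it is here.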

There's also a community LEEF codec available.
