Grok to parse CEF extension fields


(Leandro Maciel) #1

Hello,

I'm trying to create a grok pattern to parse the extension fields in a CEF message from an antivirus server.

My problem right now is that the same field can hold different types of data: sometimes it is an integer, other times a word, and other times a message or a version with major and minor numbers.

Also, sometimes I do not have all the fields, but I can use ( )? to make a field optional.

Something like this:

cs2=KES cs2Label=ProductName cs3=10.2.4.0 cs3Label=ProductVersion cs5=Install update - Service Pack 1 MR 2 cs5Label=TaskName cs4=159 cs4Label=TaskId cn2=4 cn2Label=TaskNewState cn1=1 cn1Label=TaskOldState
 cs2=1093 cs2Label=ProductName cs3=1.0.0.0 cs3Label=ProductVersion

If, for example, I use (cs2=%{WORD:cs2.id})? it matches the cs2 field in the first line but not in the second. If I use INT instead of WORD it matches cs2 in the second line. If I use DATA nothing is matched, and if I use GREEDYDATA the whole rest of the message ends up in the first field that appears in the message, in this case cs2.

Does anyone have any idea how to solve this parsing problem?

From what I saw in the logs, the values in the fields can be an integer, a word, a software version, a message with spaces, or a filename. Trying GREEDYDATA does not work, since it swallows all the other fields that come after the match.

I'm trying the following pattern for this part of the message, but it is not working:
(it is all on one line; the line breaks are only for readability)
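The DATA and GREEDYDATA behaviour described above can be reproduced with plain regular expressions. The sketch below assumes the stock grok definitions (DATA is `.*?`, GREEDYDATA is `.*`); the `capture` helper and the `BOUNDARY` lookahead are hypothetical names made up for illustration, not part of grok. It suggests that a lazy match only becomes useful once something after it forces it to expand, here a lookahead for the next `key=` or the end of the line.

```ruby
# Regex equivalents of two grok patterns (assumption: stock definitions).
DATA       = '.*?'
GREEDYDATA = '.*'

# Hypothetical boundary: stop right before the next " key=" or at end of line.
BOUNDARY = '(?=\s+\w+=|\s*$)'

LINE1 = 'cs2=KES cs2Label=ProductName cs3=10.2.4.0 cs3Label=ProductVersion'
LINE2 = 'cs2=1093 cs2Label=ProductName cs3=1.0.0.0 cs3Label=ProductVersion'
LINE3 = 'cs5=Install update - Service Pack 1 MR 2 cs5Label=TaskName cs4=159'

# Hypothetical helper: capture the value of +key+ using +pattern+,
# optionally followed by +tail+ (e.g. a lookahead).
def capture(line, key, pattern, tail = '')
  m = line.match(/\b#{Regexp.escape(key)}=(?<value>#{pattern})#{tail}/)
  m && m[:value]
end

p capture(LINE1, 'cs2', DATA)            # lazy, nothing forces it to grow: ""
p capture(LINE1, 'cs2', GREEDYDATA)      # swallows the rest of the line
p capture(LINE1, 'cs2', DATA, BOUNDARY)  # "KES"
p capture(LINE2, 'cs2', DATA, BOUNDARY)  # "1093"
p capture(LINE3, 'cs5', DATA, BOUNDARY)  # "Install update - Service Pack 1 MR 2"
```

This sketch does not handle values that themselves contain an unescaped `word=` sequence.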

(%{SPACE})?(cs1=%{INT:cs1.id})?%{SPACE}(cs1Label=%{WORD:cs1.label})?
(%{SPACE})?(cs2=%{INT:cs2.id})?%{SPACE}(cs2Label=%{WORD:cs2.label})?
(%{SPACE})?(cs3=%{INT:cs3.id})?%{SPACE}(cs3Label=%{WORD:cs3.label})?
(%{SPACE})?(cs4=%{INT:cs4.id})?%{SPACE}(cs4Label=%{WORD:cs4.label})?
(%{SPACE})?(cs5=%{INT:cs5.id})?%{SPACE}(cs5Label=%{WORD:cs5.label})?
(%{SPACE})?(cs6=%{INT:cs6.id})?%{SPACE}(cs6Label=%{WORD:cs6.label})?
(%{SPACE})?(cn1=%{INT:cn1.id})?%{SPACE}(cn1Label=%{WORD:cn1.label})?
(%{SPACE})?(cn2=%{INT:cn2.id})?%{SPACE}(cn2Label=%{WORD:cn2.label})?
(%{SPACE})?(cn3=%{INT:cn3.id})?%{SPACE}(cn3Label=%{WORD:cn3.label})?
(%{SPACE})?(cn4=%{INT:cn4.id})?%{SPACE}(cn4Label=%{WORD:cn4.label})?
(%{SPACE})?(cn5=%{INT:cn5.id})?%{SPACE}(cn5Label=%{WORD:cn5.label})?
(%{SPACE})?(cn6=%{INT:cn6.id})?%{SPACE}(cn6Label=%{WORD:cn6.label})?


(Magnus Bäck) #2

Use a kv filter, not grok.


(Leandro Maciel) #3

Oh, thanks!

kv helped a lot. I'm using grok to parse the beginning of the message and kv for the rest, but I'm still having some problems.

How can I keep the spaces in a value since space is also the field separator?

For example:

cs5=Install update - Service Pack 2 cs5Label=TaskName cs4=102 cs4Label=TaskId cn2=1 cn2Label=TaskNewState cn1=0 cn1Label=TaskOldState 

Using kv gives me only 'Install' as the value for the cs5 key, but I need the full message, which should be 'Install update - Service Pack 2'.

Is there any way to do this using kv? Or will I need to go back to grok and write a pattern for each kind of message?


#4

I think you would have to resort to ruby code to parse arbitrary CEF extensions. Basically you would need to step through the extension string one (non-escaped) = at a time. Within the text between two =, work backwards from the end to find the last space, which separates the value from the next keyword.
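A minimal Ruby sketch of that idea follows. It is only an illustration: the function name `parse_cef_extension` is made up here, and it assumes that keys contain no spaces, that a literal = inside a value is escaped as \=, and that a backslash directly before = always means "escaped" (an escaped backslash followed by = is not handled).

```ruby
# Parse a CEF extension string into a Hash of key => value (sketch only).
def parse_cef_extension(ext)
  # Collect the position of every "=" that is not preceded by a backslash.
  eq_positions = []
  ext.scan(/(?<!\\)=/) { eq_positions << Regexp.last_match.begin(0) }

  fields = {}
  key_start = 0
  eq_positions.each_with_index do |eq, i|
    key = ext[key_start...eq].strip

    # By default the value runs to the end of the string (last field).
    value_end = ext.length
    if (next_eq = eq_positions[i + 1])
      # Work backwards from the next "=" to the last space before it:
      # that space separates this value from the next key.
      last_space = ext.rindex(' ', next_eq)
      # No space after this "=": treat the value as empty.
      value_end = last_space && last_space > eq ? last_space : eq + 1
    end

    # Trim surrounding whitespace and unescape "\=".
    fields[key] = ext[(eq + 1)...value_end].strip.gsub('\\=', '=')
    key_start = value_end
  end
  fields
end

ext = 'cs5=Install update - Service Pack 2 cs5Label=TaskName cs4=102 cs4Label=TaskId'
p parse_cef_extension(ext)
```

Inside Logstash, something like this would presumably live in a ruby filter and copy the resulting pairs onto the event; that wiring is left out here.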


(Leandro Maciel) #5

Hello,

I solved the problem using a combination of grok and kv.

Since only one of the keys has spaces in its value, I handled that key separately with grok.

The messages are something like the one below:

Jan  5 11:26:21 server.hostname CEF: 0|KasperskyLab|SecurityCenter|10.4.343|KLPRCI_TaskState|Running|1|rt=1515158609 dhost=a-hostname dst=an-ip-address cs2=KES cs2Label=ProductName cs3=10.3.0.0 cs3Label=ProductVersion cs5=Install update - Service Pack 2 cs5Label=TaskName cs4=102 cs4Label=TaskId cn2=1 cn2Label=TaskNewState cn1=0 cn1Label=TaskOldState

I use grok to parse the first part of the message:

Jan  5 11:26:21 server.hostname CEF: 0|KasperskyLab|SecurityCenter|10.4.343|KLPRCI_TaskState|Running|1|

Then I use GREEDYDATA to capture the rest of the message into a field, which I use as the source for kv and also as the source for another grok that matches only the field with spaces in its value.
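The three steps above can be emulated in plain Ruby to show how they fit together. This is not the actual Logstash configuration, which was not posted in the thread; `parse_line` and the regular expressions are stand-ins invented for illustration, and the sketch assumes that the header contains no escaped | characters and that cs5 is always followed by cs5Label.

```ruby
LINE = 'Jan  5 11:26:21 server.hostname CEF: 0|KasperskyLab|SecurityCenter|' \
       '10.4.343|KLPRCI_TaskState|Running|1|rt=1515158609 dhost=a-hostname ' \
       'dst=an-ip-address cs2=KES cs2Label=ProductName cs3=10.3.0.0 ' \
       'cs3Label=ProductVersion cs5=Install update - Service Pack 2 ' \
       'cs5Label=TaskName cs4=102 cs4Label=TaskId cn2=1 ' \
       'cn2Label=TaskNewState cn1=0 cn1Label=TaskOldState'

# Hypothetical stand-in for the grok + kv + grok pipeline.
def parse_line(line)
  # Step 1 (first grok): seven "|"-terminated header parts, then the rest.
  parts = line.split('|', 8)
  return nil if parts.length < 8
  *header, extension = parts

  # Step 2 (kv): naive space-separated key=value pairs.
  # A value with spaces is cut at the first space, e.g. cs5 => "Install".
  fields = extension.scan(/(\w+)=(\S+)/).to_h

  # Step 3 (second grok): re-read only the key whose value has spaces.
  if (m = extension.match(/\bcs5=(?<task>.*?) cs5Label=/))
    fields['cs5'] = m[:task]
  end

  { header: header, fields: fields }
end

result = parse_line(LINE)
p result[:header][1]      # "KasperskyLab"
p result[:fields]['cs5']  # "Install update - Service Pack 2"
```

The trade-off is that every key whose value may contain spaces needs its own extra pattern, which is manageable here because only cs5 is affected.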

I will look into ruby to see if it is something simple, but right now it is working this way.


#6

Can you share the contents of your conf? I am looking to do the same.

Thanks,
Glenn

