Dynamic Parsing with Grok


#1

Hi, I have the following log file sample:

#Version: 1.0
#Fields: s-dns date time x-duration c-ip c-port c-vx-zone
ac1.lg9ams1d1.cdn 2019-01-14 05:00:01 0.001 172.30.116.146 46684 cdn
ac1.lg9ams1d1.cdn 2019-01-14 05:00:01 0.001 172.30.116.146 59064 cdn

At the moment I am parsing it statically with the csv filter, using tab as the separator.

Problem:

The log file structure changes depending on the software version running on the server. The structure is given by the #Fields line at the beginning of each log file (see the sample above).
My idea is to read the #Fields line of each log file and do the mapping accordingly, but I am not sure how to do that with grok. Can anyone help?

Thanks in advance.

PS: I am not very experienced with ELK.


(Tek Chand) #2

@HugoSousa,

Please try the below filter for your above log pattern:

(?<s_dns>[\w\.\d]+)\s(?<date>[\d\-]+)\s(?<time>[\d\:]+)\s(?<duration>[\d\.]+)\s(?<c_ip>[\d\.]+)\s(?<port>\d+)\s(?<vx_zone>\w+)

You can write other patterns as well. Please read up on regex; that will give you a basic idea of how to write a pattern for a given log type.
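To see what a fixed pattern like this does and does not match, here is a minimal Python sketch. It is only an illustration outside Logstash: Python's re module spells named groups as (?P<name>...) and does not accept hyphens in group names, so underscores are used. The names LINE_RE and parse_static are made up for this example.

```python
import re

# Python spelling of the static pattern: (?P<name>...) named groups,
# with underscores because Python group names cannot contain hyphens.
LINE_RE = re.compile(
    r"(?P<s_dns>[\w.]+)\s(?P<date>[\d-]+)\s(?P<time>[\d:]+)\s"
    r"(?P<duration>[\d.]+)\s(?P<c_ip>[\d.]+)\s(?P<port>\d+)\s(?P<vx_zone>\w+)"
)


def parse_static(line):
    """Return a dict of the named captures, or None if the line does not fit."""
    match = LINE_RE.fullmatch(line.strip())
    return match.groupdict() if match else None


sample = "ac1.lg9ams1d1.cdn 2019-01-14 05:00:01 0.001 172.30.116.146 46684 cdn"
fields = parse_static(sample)

# Header lines and lines with a different column layout are not matched.
header = parse_static("#Version: 1.0")
```

Because the column order is baked into the pattern, any line with a different layout simply fails to match.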

Kindly let me know if you have any other questions.

Thanks.


#3

Hi @Tek_Chand, thanks for the quick response.

Just a few questions:

  • How does this find the #Fields line in the log and take the field names from there?
  • If the first log file has #Fields: s-dns date time x-duration c-ip c-port c-vx-zone and the second log file has #Fields: s-dns new-field-1 date time x-duration c-ip c-port c-vx-zone new-field-2, so there are 2 extra fields, the same regex will not work, right?

For instance, I would like the mapping to happen dynamically, like this:

-log file 1
#Fields: s-dns date time x-duration c-ip c-port c-vx-zone
ac1.lg9ams1d1.cdn 2019-01-14 05:00:01 0.001 172.30.116.146 46684 cdn

so,
s-dns: ac1.lg9ams1d1.cdn
date: 2019-01-14
time: 05:00:01
x-duration: 0.001
c-ip: 172.30.116.146
c-port: 46684
c-vx-zone: cdn

-log file 2
#Fields: s-dns new-field-1 date time x-duration c-ip c-port c-vx-zone new-field-2
ac1.lg9ams1d1.cdn blabla1 2019-01-14 05:00:01 0.001 172.30.116.146 46684 cdn blabla2

so,
s-dns: ac1.lg9ams1d1.cdn
new-field-1: blabla1
date: 2019-01-14
time: 05:00:01
x-duration: 0.001
c-ip: 172.30.116.146
c-port: 46684
c-vx-zone: cdn
new-field-2: blabla2

As you can see in the example, the number of fields changed from the first log file to the second one.
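The mapping described here can be sketched in a few lines of Python. This is not a Logstash configuration, just an illustration of the idea of pairing the names from the #Fields line with the values of a data line. The function name map_line is hypothetical, and the sketch assumes fields are whitespace-separated and never empty.

```python
def map_line(fields_header, data_line):
    """Pair the names from a '#Fields:' header with the values of one data line."""
    prefix = "#Fields:"
    if not fields_header.startswith(prefix):
        raise ValueError("expected a line starting with '#Fields:'")
    names = fields_header[len(prefix):].split()
    values = data_line.split()
    if len(names) != len(values):
        raise ValueError("data line does not have one value per declared field")
    return dict(zip(names, values))


# Log file 1: seven fields
record1 = map_line(
    "#Fields: s-dns date time x-duration c-ip c-port c-vx-zone",
    "ac1.lg9ams1d1.cdn 2019-01-14 05:00:01 0.001 172.30.116.146 46684 cdn",
)

# Log file 2: nine fields, same code
record2 = map_line(
    "#Fields: s-dns new-field-1 date time x-duration c-ip c-port c-vx-zone new-field-2",
    "ac1.lg9ams1d1.cdn blabla1 2019-01-14 05:00:01 0.001 172.30.116.146 46684 cdn blabla2",
)
```

The same function handles both layouts because the column names come from the file itself rather than from a fixed list.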

Thanks

---my filter now is a very basic thing---

if [source] =~ /httpaccess/ {
  csv {
    separator => " "
    columns => ["s_dns", "req_Date", "time", "x_duration", "c_ip", "c_port", "c_vx_zone"]
  }
}


(Tek Chand) #4

@HugoSousa,

When Filebeat sends the logs to Logstash, Logstash reads each line and applies your regex. Lines that match the pattern are split into the named fields, and those fields then appear in Kibana.

Yes, the same regex will not work for your second log type. You need to write another regex for the second log type that includes the new fields.

Currently you are using the csv filter to parse the logs, but I gave you a regex. With a regex you can define your own custom fields.

I hope the points above answer your questions.

Kindly let me know if you still have any other questions.

Thanks.


#5

@Tek_Chand,

The thing is that I don't know how many fields each log file will contain, so I have to read the #Fields line (at the beginning of each log file) and do the mapping accordingly. Ideally I am looking for some "magic" that can do that, rather than having to add a separate regex for each case. Besides, this machine receives logs with the same naming convention from servers running different software versions.
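One way to picture that requirement is a small stateful mapper that remembers the most recent #Fields line per log file, so files from servers with different software versions can be processed side by side. The Python sketch below is only an illustration: the class name FieldMapper and the file names are hypothetical, and fields are assumed to be whitespace-separated and never empty. Whether similar state can be kept reliably inside a Logstash pipeline (for example in a ruby filter) depends on events arriving in order, which would need to be verified.

```python
class FieldMapper:
    """Remember the latest '#Fields:' header per source and map data lines with it."""

    def __init__(self):
        self.columns_by_source = {}  # source name -> list of column names

    def handle(self, source, line):
        """Return a dict for a data line, or None for headers and unmappable lines."""
        line = line.strip()
        if line.startswith("#Fields:"):
            self.columns_by_source[source] = line[len("#Fields:"):].split()
            return None
        if not line or line.startswith("#"):
            return None  # blank line or another directive such as #Version
        columns = self.columns_by_source.get(source)
        if columns is None:
            return None  # no header seen yet for this source
        values = line.split()
        if len(values) != len(columns):
            return None  # layout does not match the declared fields
        return dict(zip(columns, values))


mapper = FieldMapper()
mapper.handle("old-server.log", "#Version: 1.0")
mapper.handle("old-server.log", "#Fields: s-dns date time x-duration c-ip c-port c-vx-zone")
mapper.handle("new-server.log", "#Fields: s-dns new-field-1 date time x-duration c-ip c-port c-vx-zone new-field-2")

old_record = mapper.handle(
    "old-server.log",
    "ac1.lg9ams1d1.cdn 2019-01-14 05:00:01 0.001 172.30.116.146 46684 cdn",
)
new_record = mapper.handle(
    "new-server.log",
    "ac1.lg9ams1d1.cdn blabla1 2019-01-14 05:00:01 0.001 172.30.116.146 46684 cdn blabla2",
)
```

Keying the state by source keeps a header from one file from being applied to lines of another.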

Do you have any hint on how this could be done?

Thanks


(Tek Chand) #6

@HugoSousa,

I don't know of any such magic that automatically reads your log and adds fields accordingly.

Thanks.

