Empty fields in input - grok failure


(Nikhil Pawar) #1

Hi,

I am trying to use grok on input lines like these:

INPUT DATA

[2017-05-15 00:00:07,751] :|: INFO :|: dubprdsndlbfe33.dub.jabodo.com :|: 26baee2d30164c2083e380b40b2a3b15 :|: [BT:CHROME, BV:55, BL:en, CC:CZ] :|: 81.2.254.161 :|: http://free.internetspeedtracker.com/index.jhtml?partner=^BBQ^xpt184&s1=1212154&s2=306903392384 :|: c.m.w.d.m.UnifiedLoggerWrapper :|: [ET: DLPInfo, IP: 81.2.24.161]
[2017-05-15 00:00:07,877] :|: INFO :|: dubprdsndlbfe33.dub.jabodo.com :|: 22ac17317b93484c89a3da74ff72845b :|: [BT:CHROME, BV:55, BL:en, CC:CZ] :|: 81.2.254.161 :|: http://free.internetspeedtracker.com/index.jhtml?partner=^BBQ^xpt184&s1=1212151&s2=322766181292 :|: c.m.w.d.m.UnifiedLoggerWrapper :|: [ET: DLPInfo, IP: 81.2.54.161]
[2017-05-15 00:00:07,884] :|: INFO :|: dubprdsndlbfe33.dub.jabodo.com :|: 2dbcaad4295249cc87e6e5793c9fa909 :|: [BT:CHROME, BV:55, BL:en, CC:CZ] :|: 81.2.254.161 :|: http://free.internetspeedtracker.com/index.jhtml?partner=^BBQ^xpt184&s1=1212155&s2=306903392824 :|: c.m.w.d.m.UnifiedLoggerWrapper :|: [ET: DLPInfo, IP: 81.2.25.161]
[2017-05-15 00:00:07,965] :|: INFO :|: dubprdsndlbfe33.dub.jabodo.com :|: 5941264e1dfb4711862f5106e9f28087 :|: [BT:CHROME, BV:55, BL:en, CC:CZ] :|: 81.2.254.161 :|: http://free.internetspeedtracker.com/index.jhtml?partner=^BBQ^xpt184&s1=1212147&s2=306903393464 :|: c.m.w.d.m.UnifiedLoggerWrapper :|: [ET: DLPInfo, IP: 81.2.258.161]
[2017-05-15 00:00:07,988] :|: INFO :|: dubprdsndlbfe33.dub.jabodo.com :|: 26baee2d30164c2083e380b40b2a3b15 :|: [BT:CHROME, BV:55, BL:en, CC:CZ] :|: 81.2.254.161 :|: http://free.internetspeedtracker.com/index.jhtml?partner=^BBQ^xpt184&s1=1212154&s2=306903392384 :|: c.m.w.d.m.UnifiedLoggerWrapper :|: [ET: SplashPageServed, IP: 81.2.254.161]

Grok works fine with the following configuration:
input {
  beats {
    port => 5044
  }
}
filter {
  grok {
    match => { "message" => "(?[[^]]])%{SPACE}:|:%{SPACE}%{WORD:level}%{SPACE}:|:%{SPACE}%{USERNAME:hostname}%{SPACE}:|:%{SPACE}%{GREEDYDATA:coidkey}%{SPACE}:|:%{SPACE}%{GREEDYDATA:clientinfo}%{SPACE}:|:%{SPACE}%{IP:clientIP}%{SPACE}:|:%{SPACE}%{GREEDYDATA:Url}%{SPACE}:|:%{SPACE}%{JAVACLASS:class}%{SPACE}:|:%{SPACE}(?[[^]]])" }
  }
}
output {
  stdout { codec => rubydebug }
}

But if any field is empty in the input, as in these lines:

[2017-05-11 10:13:33,203] :|: INFO :|: dfprdsndlbfe24.df.jabodo.com :|: :|: [BT:CHROME, BV:58, BL:en, CC:IN] :|: 117.221.64.97 :|: :|: c.m.w.d.m.UnifiedLoggerWrapper :|: - [ET: PageView, IP: 117.221.64.97]
[2017-05-11 10:13:33,308] :|: INFO :|: dfprdsndlbfe24.df.jabodo.com :|: :|: [BT:CHROME, BV:58, BL:en, CC:IN] :|: 117.221.64.97 :|: :|: c.m.w.d.m.UnifiedLoggerWrapper :|: - [ET: PageView, IP: 117.221.64.97]
[2017-05-11 10:05:00,000] :|: INFO :|: dfprdsndlbfe1.df.jabodo.com :|: :|: :|: :|: :|: c.m.w.d.s.InMemoryDataRefreshService :|: - Starting testAllocation refresh
[2017-05-11 10:05:00,006] :|: INFO :|: dfprdsndlbfe1.df.jabodo.com :|: :|: :|: :|: :|: c.m.w.d.s.InMemoryDataRefreshService :|: - Done refreshing testAllocation. It took 6 milliseconds
[2017-05-11 10:05:29,408] :|: INFO :|: dfprdsndlbfe1.df.jabodo.com :|: :|: :|: :|: :|: c.m.s.c.h.HazelcastCacheService :|: - Cache client is connected!
[2017-05-11 10:05:29,651] :|: INFO :|: dfprdsndlbfe1.df.jabodo.com :|: :|: :|: :|: :|: c.m.s.c.h.HazelcastCacheService :|: - Cache client is connected!
[2017-05-11 10:05:33,432] :|: INFO :|: dfprdsndlbfe1.df.jabodo.com :|: :|: :|: :|: :|: c.m.c.PeriodicProductDataRefresher :|: - Beginning Product data refresh
[2017-05-11 10:05:34,492] :|: INFO :|: dfprdsndlbfe1.df.jabodo.com :|: :|: :|: :|: :|: c.m.c.PeriodicProductDataRefresher :|: - Product data is up to date, no refresh was conducted
[2017-05-11 10:10:29,408] :|: INFO :|: dfprdsndlbfe1.df.jabodo.com :|: :|: :|: :|: :|: c.m.s.c.h.HazelcastCacheService :|: - Cache client is connected!

Grok does not match these lines. Can someone help with this issue?

Thanks,
Nikhil


(Nikhil Pawar) #2

I tried making the field optional with (?:%{IP:clientip})? or (?:\s+%{IP:ip}). The pattern then matches, but the field is left out of the event when it has no value. I would like the field name to be displayed even when it is empty (i.e. there is no value).
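
To illustrate why the field disappears, here is a minimal sketch using Python's re module. It is not grok, which uses a different regex engine, so the parallel is an assumption. When the capture is wrapped in an optional group, the named group takes no part in the match if the value is absent. When the group itself is allowed to match an empty string, the name is kept with an empty value. The IP sub-pattern is a simplified stand-in for %{IP}.

```python
import re

# Simplified stand-in for grok's %{IP} pattern (assumption, IPv4 only).
IPV4 = r"\d{1,3}(?:\.\d{1,3}){3}"

# Optional wrapper around the capture: if no IP is present, the named
# group does not participate in the match and its value is None.
optional = re.compile(r":\|:\s*(?:(?P<clientip>" + IPV4 + r"))?\s*:\|:")

# Capture that may itself be empty: the group always participates,
# so a missing IP yields an empty string instead of None.
empty_ok = re.compile(r":\|:\s*(?P<clientip>(?:" + IPV4 + r")?)\s*:\|:")

with_ip = ":|: 117.221.64.97 :|:"
without_ip = ":|: :|:"

print(repr(optional.search(with_ip).group("clientip")))     # '117.221.64.97'
print(repr(optional.search(without_ip).group("clientip")))  # None
print(repr(empty_ok.search(without_ip).group("clientip")))  # ''
```

The grok filter also documents a keep_empty_captures option (default false), which may be needed for empty captures to show up as event fields. That combination is untested here.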


(Christian Dahlqvist) #3

This seems to be a csv style format with :|: as a separator. Have you tried using the csv filter instead of grok?
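
The separator idea can be illustrated outside Logstash. The sketch below is plain Python, not the csv filter itself. It splits a line on :|: and pairs the parts with column names, so empty columns come through as empty strings rather than being dropped. The field names follow the grok pattern in this thread, except "timestamp", which is an assumed name for the first column.

```python
import re

# Column names taken from the grok pattern in this thread;
# "timestamp" is an assumed name for the first column.
FIELDS = ["timestamp", "level", "hostname", "coidkey", "clientinfo",
          "clientip", "Url", "class", "msg"]

# The separator, with any surrounding whitespace absorbed.
SEPARATOR = re.compile(r"\s*:\|:\s*")


def parse_line(line):
    """Split one log line on ':|:' and map the parts to FIELDS.

    Returns None when the line does not have the expected column count.
    """
    # maxsplit keeps any ':|:' inside the final message intact.
    parts = SEPARATOR.split(line.strip(), maxsplit=len(FIELDS) - 1)
    if len(parts) != len(FIELDS):
        return None
    return dict(zip(FIELDS, parts))


line = ("[2017-05-11 10:05:00,000] :|: INFO :|: dfprdsndlbfe1.df.jabodo.com "
        ":|: :|: :|: :|: :|: c.m.w.d.s.InMemoryDataRefreshService "
        ":|: - Starting testAllocation refresh")
event = parse_line(line)
print(event["clientip"] == "")  # True: the empty column is kept
print(event["msg"])             # - Starting testAllocation refresh
```

Because every column is always present in the result, downstream code can decide per field whether an empty value should be kept, replaced, or removed.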


(Nikhil Pawar) #4

I am not familiar with the csv filter.

My grok pattern is working fine; the only issues I have are the - near the end of the line and the varying last field.

The following grok pattern works fine for input without the dash before the last field:

INPUT :- [2017-05-15 00:00:07,751] :|: INFO :|: dubprdsndlbfe33.dub.jabodo.com :|: 26baee2d30164c2083e380b40b2a3b15 :|: [BT:CHROME, BV:55, BL:en, CC:CZ] :|: 81.2.254.161 :|: http://free.internetspeedtracker.com/index.jhtml?partner=^BBQ^xpt184&s1=1212154&s2=306903392384 :|: c.m.w.d.m.UnifiedLoggerWrapper :|: [ET: DLPInfo, IP: 81.2.24.161]

^(?[[^]]])%{SPACE}:|:%{SPACE}(?:\s+%{WORD:level})?%{SPACE}:|:%{SPACE}(?:\s+%{USERNAME:hostname})?%{SPACE}:|:%{SPACE}(?:\s+%{GREEDYDATA:coidkey})?%{SPACE}:|:%{SPACE}(?:\s+%{GREEDYDATA:clientinfo})?%{SPACE}:|:%{SPACE}(?:\s+%{IP:clientip})?%{SPACE}:|:%{SPACE}(?:\s+%{GREEDYDATA:Url})?%{SPACE}:|:%{SPACE}(?:\s+%{JAVACLASS:class})?%{SPACE}:|:%{SPACE}(?<msg>\[[^]]])$

But some lines have a - before the last field, and the content of the last field varies. Please see the following examples:

[2017-05-15 00:00:07,751] :|: INFO :|: dubprdsndlbfe33.dub.jabodo.com :|: 26baee2d30164c2083e380b40b2a3b15 :|: [BT:CHROME, BV:55, BL:en, CC:CZ] :|: 81.2.254.161 :|: http://free.internetspeedtracker.com/index.jhtml?partner=^BBQ^xpt184&s1=1212154&s2=306903392384 :|: c.m.w.d.m.UnifiedLoggerWrapper :|: - [ET: DLPInfo, IP: 81.2.24.161]

[2017-05-09 10:22:00,000] :|: INFO :|: dfprdsndlbfe1.df.jabodo.com :|: :|: :|: :|: :|: c.m.w.d.s.InMemoryDataRefreshService :|: - Starting messages refresh

Can someone please advise?
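
For reference, the behaviour I am after for the last field can be sketched in Python (not grok, so this is only an illustration under that assumption): an optional leading dash followed by a capture that accepts any text, bracketed or not. The helper name strip_dash is hypothetical.

```python
import re

# Optional "- " prefix, then capture whatever remains as the message.
# The capture is not restricted to "[...]" so free-text messages match too.
LAST_FIELD = re.compile(r"^(?:-\s+)?(?P<msg>.*)$")


def strip_dash(last_field):
    """Return the last field without an optional leading dash."""
    return LAST_FIELD.match(last_field.strip()).group("msg")


print(strip_dash("- [ET: DLPInfo, IP: 81.2.24.161]"))  # [ET: DLPInfo, IP: 81.2.24.161]
print(strip_dash("[ET: DLPInfo, IP: 81.2.24.161]"))    # [ET: DLPInfo, IP: 81.2.24.161]
print(strip_dash("- Starting messages refresh"))       # Starting messages refresh
```

A similar fragment, (?:-\s+)? followed by a greedy capture, could perhaps replace the bracket-only final group in the grok pattern, but I have not tested that.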

