Is there a way to get a more verbose response than just "_grokparsefailure"?

Hello,

I have been using grok filtering for a couple of months now and have noticed a pain point: the very vague grok parse failure tag. I am wondering if there is a way for it to give a little more detail on where exactly the match failed.

I found this doc: https://www.elastic.co/blog/do-you-grok-grok, which was very helpful since it works through a log very similar to the one I am trying to parse. I had never used the HTTPDATE grok pattern before, or the others mentioned in the link. I usually write my own regexes for the data, but have found that Logstash sometimes doesn't like those. I felt pretty confident since this pattern uses only stock grok patterns, but no luck.
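For context, HTTPDATE is one of the stock patterns shipped with Logstash; in the bundled grok-patterns file it is defined in terms of smaller patterns:

HTTPDATE %{MONTHDAY}/%{MONTH}/%{YEAR}:%{TIME} %{INT}

so it matches a timestamp like 30/Aug/2017:15:31:19 -0400 on its own, but not the surrounding square brackets; those have to be escaped and matched separately in your own pattern.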

I am trying to parse out this event:

Event:
10.100.200.00 - - [30/Aug/2017:15:31:19 -0400] "GET /some/path/some/path HTTP/1.1" 200 3093 0 9 "https://someurl/some/path/some/log" "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.101 Safari/537.36" "image/png" 2CE9B72CBD4D03C457A2893EF60128F6 10.100.200.200 11132

My Grok Pattern:
^%{IP:client_ip}%{SPACE}%{USER:client_port}%{SPACE}%{USER:ident}%{SPACE}\[%{HTTPDATE:event_ts}\]%{SPACE}"%{WORD:method}%{SPACE}%{DATA:uri_path}%{SPACE}HTTP/%{NUMBER:http_version}"%{SPACE}%{NUMBER:status}%{SPACE}(?:-|%{NUMBER:bytes})%{SPACE}%{NUMBER:request_time_in_secs}%{SPACE}%{NUMBER:keep_alives}%{SPACE}"%{GREEDYDATA:referer}"%{SPACE}"%{GREEDYDATA:user_agent}"%{SPACE}"%{GREEDYDATA:contenttype}"%{SPACE}%{DATA:jsession_id}%{SPACE}%{IP:x_clientip}%{SPACE}%{NUMBER:response_time_in_secs}$
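Since grok only ever reports pass/fail, one manual way to localize a failure is to cut the pattern down and grow it back one field at a time until the failure tag reappears, e.g. starting from something like this (rest is just a throwaway capture name):

^%{IP:client_ip}%{SPACE}%{USER:client_port}%{SPACE}%{GREEDYDATA:rest}

and then moving fields out of rest one by one.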

stdout output from a test:
"referer" => "https://someurl/docs/DOC-00000",
"method" => "POST",
"response_time_in_secs" => "97849",
"ident" => "-",
"http_version" => "1.1",
"message" => "10.100.200.00 - - [30/Aug/2017:16:20:47 -0400] "POST /some/path/here HTTP/1.1" 200 109 0 0 "https://someurl/docs/DOC-00000" "Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko" "application/json" 2CE9B72CBD4D03C457A2893EF60128F6 10.100.200.200 97849",
"keep_alives" => "0",
"x_clientip" => "10.100.200.200",
"contenttype" => "application/json",
"path" => "/apps/logstash-5.5.0/bin/test.txt",
"client_port" => "-",
"jsession_id" => "3F966A293216B6A0315DF0D1FFCE7AEB",
"@timestamp" => 2017-08-30T20:20:47.000Z,
"uri_path" => "some/path/here",
"bytes" => "109",
"event_ts" => "30/Aug/2017:16:20:47 -0400",
"@version" => "1",
"host" => "myserver",
"request_time_in_secs" => "0",
"client_ip" => "10.100.200.200",
"user_agent" => "Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko",
"status" => "200"

My config:
else if ([service] == "myservice") {
  if ([attributes][file_path] == "some/path/access.log") {
    grok {
      patterns_dir => [ "/apps/logstash-patterns" ]
      match => [ "body", "%{Access}" ]
    }
    date {
      match => [ "event_ts", "dd/MMM/yyyy:HH:mm:ss Z" ]
      target => "@timestamp"
      timezone => "America/New_York"
    }
  }
}
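For reference, %{Access} resolves through the patterns_dir option: grok loads every file in /apps/logstash-patterns and expects one NAME PATTERN definition per line. A sketch of what that file would contain (the filename is assumed; the pattern is the one shown above, on a single line):

# /apps/logstash-patterns/access-patterns (filename assumed)
Access ^%{IP:client_ip}%{SPACE}%{USER:client_port}%{SPACE}%{USER:ident}%{SPACE}\[%{HTTPDATE:event_ts}\]%{SPACE}"%{WORD:method}%{SPACE}%{DATA:uri_path}%{SPACE}HTTP/%{NUMBER:http_version}"%{SPACE}%{NUMBER:status}%{SPACE}(?:-|%{NUMBER:bytes})%{SPACE}%{NUMBER:request_time_in_secs}%{SPACE}%{NUMBER:keep_alives}%{SPACE}"%{GREEDYDATA:referer}"%{SPACE}"%{GREEDYDATA:user_agent}"%{SPACE}"%{GREEDYDATA:contenttype}"%{SPACE}%{DATA:jsession_id}%{SPACE}%{IP:x_clientip}%{SPACE}%{NUMBER:response_time_in_secs}$

Since a patterns file holds raw regex (no config-string quoting), the double quotes stay unescaped there and only the brackets need backslashes.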

Thanks

I am wondering if there is a way for it to give a little more detail on where exactly the match failed.

No, there isn't.
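The closest you can get is the grok filter's tag_on_failure option, which replaces the generic tag with one you choose per filter. It won't say where in the pattern the match broke, but if you have several grok filters it at least tells you which one failed. A minimal sketch against your config:

grok {
  patterns_dir => [ "/apps/logstash-patterns" ]
  match => [ "body", "%{Access}" ]
  tag_on_failure => [ "_grokparsefailure_access" ]
}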

stdout output from a test:

So... what's the problem? If that's what you get, things seem to be working.

That's what I thought as well, but I am still getting a _grokparsefailure. The data parses out perfectly in the grok debugger, but not in my environment. We have it going from Rocana -> Kafka -> Logstash -> Elasticsearch -> Kibana.

What do you mean by "stdout output from a test"? It looks like output from Logstash.
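If the pattern passes against a test file but fails in the real pipeline, it is worth dumping the failing events exactly as Logstash sees them. A temporary output along these lines (a sketch, added next to your existing outputs) would show what the body field actually contains when it arrives from Kafka:

output {
  if "_grokparsefailure" in [tags] {
    stdout {
      codec => rubydebug
    }
  }
}

You can then compare that against the line that passes in the grok debugger; stray leading whitespace or an extra wrapper around the message would explain a mismatch.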

I created a test.conf on one of my Logstash servers and had it read from a test.txt containing the events I am trying to parse, just to verify that my grok pattern is actually working. The stdout shows that it parses correctly, which is what I pasted up top.

test.conf:

input {
  file {
    path => "/apps/logstash-5.5.0/bin/test.txt"
    start_position => "beginning"
  }
}

filter {
  grok {
    patterns_dir => [ "/apps/grok-patterns" ]
    match => [ "message", "^%{IP:client_ip}%{SPACE}%{USER:client_port}%{SPACE}%{USER:ident}%{SPACE}\[%{HTTPDATE:event_ts}\]%{SPACE}\"%{WORD:method}%{SPACE}%{DATA:uri_path}%{SPACE}HTTP/%{NUMBER:http_version}\"%{SPACE}%{NUMBER:status}%{SPACE}(?:-|%{NUMBER:bytes})%{SPACE}%{NUMBER:request_time_in_secs}%{SPACE}%{NUMBER:keep_alives}%{SPACE}\"%{GREEDYDATA:referer}\"%{SPACE}\"%{GREEDYDATA:user_agent}\"%{SPACE}\"%{GREEDYDATA:contenttype}\"%{SPACE}%{DATA:jsession_id}%{SPACE}%{IP:x_clientip}%{SPACE}%{NUMBER:response_time_in_secs}$" ]
  }
  #kv {
  #  source => "kvpairs"
  #  field_split => ", "
  #  value_split => " = "
  #  remove_field => [ "kvpairs" ]
  #}
  date {
    match => [ "event_ts", "dd/MMM/yyyy:HH:mm:ss Z" ]
    target => "@timestamp"
    timezone => "America/New_York"
  }
}

output {
  stdout {
    codec => rubydebug {}
  }
}
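To reproduce the run (using the Logstash 5.5 install path from the config above):

/apps/logstash-5.5.0/bin/logstash -f test.conf

One gotcha with the file input: start_position => "beginning" only applies the first time Logstash sees a file. For repeated test runs it helps to also set sincedb_path => "/dev/null" in the file input so test.txt is re-read on every run.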

I don't know if this will help to narrow down where the grok failure is taking place...

I added a mutate filter to add a new field... that actually came through into Kibana, but I am still seeing the _grokparsefailure....
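For reference, the mutate was along these lines (the field name is just a marker, placed inside the same conditional block as the grok):

mutate {
  add_field => { "debug_marker" => "access_branch_reached" }
}

Since that field shows up in Kibana, the events are definitely reaching this filter block, so it is the grok match itself that is failing.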

Any thoughts?
