Timestamp is set to the current time on _grokparsefailure; want the log time instead


My log file looks like:

mydomain.com - - [04/Dec/2013:07:35:39 -0600] "GET /images/icon.png HTTP/1.1" 200 358 "https://mydomain.com/welcome" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; .NET CLR 1.1.7222; .NET CLR 2.0.45827; .NET CLR 3.0.4416.2152; .NET CLR 3.5.31789; InfoPath.2; .NET4.0C; .NET4.0E)" "image/png" 198B848D254D5B7FE2501CGBF29067HA.mydomain 1892

And the grok is:

grok {
  match => {
    "message" => '%{NOTSPACE:domainname} %{IPORHOST:clientip} %{NOTSPACE:username} %{NOTSPACE:auth} \[%{HTTPDATE:timestamp}\] "%{NOTSPACE:method} %{NOTSPACE:uri} HTTP/%{NOTSPACE:httpversion}" %{NOTSPACE:status} %{NUMBER:bytes:int} %{QS:referrer} %{QS:agent} %{QS:format} %{NOTSPACE:somedata} %{NUMBER:responsetime:int}'
  }
}

And the date filter is:

match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]

This parses the log fine. But if any row produces a "_grokparsefailure", the timestamp for that row is set to the current time instead of the time in the log.

How can I set the timestamp to the time in the log even when there is a _grokparsefailure?

Please suggest.

When grok fails, the timestamp field is never created, so the date filter has nothing to parse and the event keeps the time at which Logstash received it. Depending on which part of the expression is causing the grok parsing failure, you can either add multiple patterns to handle the alternative formats, or split your grok processing into two steps: the first step extracts the fields up to and including the timestamp and uses GREEDYDATA to capture the rest of the line into a field, and the second step processes that field using separate grok pattern(s). Either way should ensure that you always parse the timestamp field, so that the date filter works.
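A rough sketch of the two-step approach, reusing the patterns from the question. The field name "rest" is made up for illustration, and the date pattern uses Z to parse the offset rather than a literal -0600, which is an assumption about what you want. Adjust the first pattern if the leading fields are themselves what fails to match.

```conf
filter {
  # Step 1: extract only the leading fields through the timestamp.
  # GREEDYDATA keeps the remainder of the line in "rest" (hypothetical name).
  grok {
    match => {
      "message" => '%{NOTSPACE:domainname} %{IPORHOST:clientip} %{NOTSPACE:username} %{NOTSPACE:auth} \[%{HTTPDATE:timestamp}\] %{GREEDYDATA:rest}'
    }
  }

  # The timestamp field exists whenever step 1 matched,
  # regardless of what happens in step 2.
  date {
    match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
  }

  # Step 2: parse the remainder. A failure here still tags the event
  # with _grokparsefailure, but @timestamp has already been set from the log.
  grok {
    match => {
      "rest" => '"%{NOTSPACE:method} %{NOTSPACE:uri} HTTP/%{NOTSPACE:httpversion}" %{NOTSPACE:status} %{NUMBER:bytes:int} %{QS:referrer} %{QS:agent} %{QS:format} %{NOTSPACE:somedata} %{NUMBER:responsetime:int}'
    }
  }
}
```

With this layout, a row whose request or user-agent section is malformed should still get its log time, and you can look at the "rest" field of the failed events to see which part needs an alternative pattern.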
