Date section of filter generating a _grokparsefailure

Hi,

I'm new to Logstash and I was wondering if someone could point me in the right direction. I'm using Logstash 2.4 and set up the filter below to parse IIS logs from Azure. If I comment out the date section, everything parses correctly, but the @timestamp of each record is the import date rather than the event date (I have to load a lot of old data, so it's important that records are indexed by event date). When the date section is enabled, so that the event date (logtimestamp) should be used as @timestamp, I get a _grokparsefailure tag instead. None of the fields are parsed, and there is no _dateDidNotMatch tag. I'm not seeing anything in the log files that would indicate an error, and I'm not sure what I'm doing wrong. Is there something wrong with my syntax or logic?

-thanks

filter {
  # Ignore log comments
  if [message] =~ "^#" {
    drop {}
  }

  grok {
    # Check that fields match your IIS log settings
    match => ["message", "%{YEAR:year}-%{MONTHNUM:month}-%{MONTHDAY:day} %{TIME:time} %{IPORHOST:site} %{WORD:method} %{URIPATHPARAM:page} %{NOTSPACE:querystring} %{NUMBER:port} %{NOTSPACE:username} %{IPORHOST:clientip} %{NOTSPACE:useragent} %{NOTSPACE:cookie} %{NOTSPACE:referer} %{IPORHOST:hostname} %{NUMBER:status} %{NUMBER:substatus} %{NUMBER:winstatus} %{NUMBER:bytes_response} %{NUMBER:bytes_request} %{NUMBER:time_taken}"]

    add_field => {
      "logtimestamp" => "%{year}-%{month}-%{day} %{time}"
    }

    remove_field => [ "year", "month", "day", "time" ]
  }

  # Set the event timestamp from the log
  date {
    match => ["logtimestamp", "yyyy-MM-dd HH:mm:ss"]
    target => "@timestamp"
    timezone => "UTC"
    locale => "en"
    tag_on_failure => ["_dateDidNotMatch"]
  }

  useragent {
    source => "useragent"
    prefix => "browser"
  }
}

It's very hard to believe that enabling the date filter would result in the grok filter failing.

If you show us what your log looks like maybe we can help.

Thanks for your willingness to help me. Here is an extract from one of our log files. I didn't see a way to upload a log file, so if there is a better way, I can send a copy of the whole file.

#Software: Microsoft Internet Information Services 8.0
#Fields: date time s-sitename cs-method cs-uri-stem cs-uri-query s-port cs-username c-ip cs(User-Agent) cs(Cookie) cs(Referer) cs-host sc-status sc-substatus sc-win32-status sc-bytes cs-bytes time-taken
2016-12-31 23:59:23 EPICMD-LIVE GET / X-ARR-LOG-ID=87f83ddc-f810-4e33-9114-7d0028df67c5 443 - 104.200.152.54 Mozilla/5.0+(compatible;+monitis+-+premium+monitoring+service;+http://www.monitis.com) - - corpadmin.epic.md 200 0 0 3508 857 31
2016-12-31 23:59:28 EPICMD-LIVE POST /Command/API.V1.GetCustomWebsiteInfo X-ARR-LOG-ID=b72e8976-92e5-4cb0-9a52-c57edc5e47b6 443 - 52.91.132.232 - - - epicmd.corpadmin.directscale.com 200 0 0 956 1132 138
2016-12-31 23:59:29 EPICMD-LIVE POST /Command/API.V1.GetCustomerDetail X-ARR-LOG-ID=6d224616-8281-4351-bc4c-a18fe03b5848 443 - 52.91.132.232 - - - epicmd.corpadmin.directscale.com 200 0 0 1659 1134 96
2016-12-31 23:59:29 EPICMD-LIVE POST /Command/API.V1.GetCustomWebsiteInfo X-ARR-LOG-ID=ad2b38a5-de80-4b67-a4df-63b2b08c355c 443 - 52.91.132.232 - - - epicmd.corpadmin.directscale.com 200 0 0 956 1132 127
2016-12-31 23:59:29 EPICMD-LIVE POST /Command/API.V1.GetCustomerDetail X-ARR-LOG-ID=ef89451a-d607-43a1-9bd9-06e06c17345b 443 - 52.91.132.232 - - - epicmd.corpadmin.directscale.com 200 0 0 1659 1134 93

Your grok expression works fine:

$ cat test.config 
input { stdin { } }
output { stdout { codec => rubydebug } }
filter {
  grok {
    match => ["message", "%{YEAR:year}-%{MONTHNUM:month}-%{MONTHDAY:day} %{TIME:time} %{IPORHOST:site} %{WORD:method} %{URIPATHPARAM:page} %{NOTSPACE:querystring} %{NUMBER:port} %{NOTSPACE:username} %{IPORHOST:clientip} %{NOTSPACE:useragent} %{NOTSPACE:cookie} %{NOTSPACE:referer} %{IPORHOST:hostname} %{NUMBER:status} %{NUMBER:substatus} %{NUMBER:winstatus} %{NUMBER:bytes_response} %{NUMBER:bytes_request} %{NUMBER:time_taken}"]
  }
}
$ echo '2016-12-31 23:59:23 EPICMD-LIVE GET / X-ARR-LOG-ID=87f83ddc-f810-4e33-9114-7d0028df67c5 443 - 104.200.152.54 Mozilla/5.0+(compatible;+monitis+-+premium+monitoring+service;+http://www.monitis.com) - - corpadmin.epic.md 200 0 0 3508 857 31' | /opt/logstash/bin/logstash -f test.config
Settings: Default pipeline workers: 8
Pipeline main started
{
           "message" => "2016-12-31 23:59:23 EPICMD-LIVE GET / X-ARR-LOG-ID=87f83ddc-f810-4e33-9114-7d0028df67c5 443 - 104.200.152.54 Mozilla/5.0+(compatible;+monitis+-+premium+monitoring+service;+http://www.monitis.com) - - corpadmin.epic.md 200 0 0 3508 857 31",
          "@version" => "1",
        "@timestamp" => "2017-01-31T07:10:41.322Z",
              "host" => "lnxolofon",
              "year" => "2016",
             "month" => "12",
               "day" => "31",
              "time" => "23:59:23",
              "site" => "EPICMD-LIVE",
            "method" => "GET",
              "page" => "/",
       "querystring" => "X-ARR-LOG-ID=87f83ddc-f810-4e33-9114-7d0028df67c5",
              "port" => "443",
          "username" => "-",
          "clientip" => "104.200.152.54",
         "useragent" => "Mozilla/5.0+(compatible;+monitis+-+premium+monitoring+service;+http://www.monitis.com)",
            "cookie" => "-",
           "referer" => "-",
          "hostname" => "corpadmin.epic.md",
            "status" => "200",
         "substatus" => "0",
         "winstatus" => "0",
    "bytes_response" => "3508",
     "bytes_request" => "857",
        "time_taken" => "31"
}
Pipeline main has been shutdown
stopping pipeline {:id=>"main"}

You can simplify your configuration by capturing the whole timestamp with a single grok pattern, TIMESTAMP_ISO8601. Then you can remove the add_field and remove_field options.
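To sketch what that simplification could look like (untested, based on the original pattern with field names unchanged), the date and time fields collapse into one logtimestamp capture, and the add_field/remove_field options go away:

```
filter {
  grok {
    # TIMESTAMP_ISO8601 matches "2016-12-31 23:59:23" in a single capture
    match => ["message", "%{TIMESTAMP_ISO8601:logtimestamp} %{IPORHOST:site} %{WORD:method} %{URIPATHPARAM:page} %{NOTSPACE:querystring} %{NUMBER:port} %{NOTSPACE:username} %{IPORHOST:clientip} %{NOTSPACE:useragent} %{NOTSPACE:cookie} %{NOTSPACE:referer} %{IPORHOST:hostname} %{NUMBER:status} %{NUMBER:substatus} %{NUMBER:winstatus} %{NUMBER:bytes_response} %{NUMBER:bytes_request} %{NUMBER:time_taken}"]
  }

  date {
    match => ["logtimestamp", "yyyy-MM-dd HH:mm:ss"]
    target => "@timestamp"
    timezone => "UTC"
  }
}
```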

Thanks, that makes more sense...

I also found out what the parse error was. Azure will sometimes place a '~1' in the site name for App Service Plans, and the IPORHOST pattern doesn't match that. Because all my dates were in the past, when I had the date section uncommented I would only see the parse failures (their @timestamp is set to the current date/time), so it looked like everything was failing. The rest of the events were actually parsed correctly, but their timestamps were in the past, beyond the scope of my query.
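For anyone hitting the same '~1' issue, one possible workaround (a sketch, assuming site names never contain spaces) is to capture the site field with NOTSPACE instead of IPORHOST:

```
grok {
  # NOTSPACE accepts site names like "EPICMD-LIVE~1" that IPORHOST rejects
  match => ["message", "%{YEAR:year}-%{MONTHNUM:month}-%{MONTHDAY:day} %{TIME:time} %{NOTSPACE:site} %{WORD:method} %{URIPATHPARAM:page} %{NOTSPACE:querystring} %{NUMBER:port} %{NOTSPACE:username} %{IPORHOST:clientip} %{NOTSPACE:useragent} %{NOTSPACE:cookie} %{NOTSPACE:referer} %{IPORHOST:hostname} %{NUMBER:status} %{NUMBER:substatus} %{NUMBER:winstatus} %{NUMBER:bytes_response} %{NUMBER:bytes_request} %{NUMBER:time_taken}"]
}
```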

Thanks for your feedback and help.
