Grokparse failure - custom Tomcat access.log - newbie failure

Hi,

I have the following log line from tomcat access log:

[21/Apr/2015:00:00:02 +0200] - 66.249.78.123 www.mydomain.no A7CEE20E7E0A5F1C5727E0036B868422.agap67 - GET /myuri/foobar/search-result.action?richList=false&projectRows=50&productRows=50&authorRows=50&articleRows=50&otherFacets=ft_academical_subject%3ASamfunnsvitenskap%7C%3B%3Bf_digital_type%3AApp 200 265ms

I have the following grok-filter:

if [type] == "access-log" {
    grok {
        match => [ "message", "\[%{HTTPDATE:timestamp}\] \- %{IP:client} %{HOSTNAME:hostname} %{WORD:jsessionId}.%{WORD:node} - %{WORD:method} %{URIPATHPARAM:request} %{INT:size} %{WORD:responsetime}" ]
    }
}

I've tested the filter successfully with the Grok Debugger at https://grokdebug.herokuapp.com/, but when I run it on my test server I get a "_grokparsefailure" tag.

I run Logstash v. 1.4 in a Docker container.

Need a bit of help getting past this issue. Thanks!

Works for me:

$ cat data
[21/Apr/2015:00:00:02 +0200] - 66.249.78.123 www.mydomain.no A7CEE20E7E0A5F1C5727E0036B868422.agap67 - GET /myuri/foobar/search-result.action?richList=false&projectRows=50&productRows=50&authorRows=50&articleRows=50&otherFacets=ft_academical_subject%3ASamfunnsvitenskap%7C%3B%3Bf_digital_type%3AApp 200 265ms
$ cat test.config
input { stdin { codec => plain } }
output { stdout { codec => rubydebug } }
filter {
  grok {
    match => [
      "message",
      "\[%{HTTPDATE:timestamp}\] \- %{IP:client} %{HOSTNAME:hostname} %{WORD:jsessionId}.%{WORD:node} - %{WORD:method} %{URIPATHPARAM:request} %{INT:size} %{WORD:responsetime}"
    ]
  }
}
$ /opt/logstash/bin/logstash -f test.config < data
{
         "message" => "<feff>[21/Apr/2015:00:00:02 +0200] - 66.249.78.123 www.mydomain.no A7CEE20E7E0A5F1C5727E0036B868422.agap67 - GET /myuri/foobar/search-result.action?richList=false&projectRows=50&productRows=50&authorRows=50&articleRows=50&otherFacets=ft_academical_subject%3ASamfunnsvitenskap%7C%3B%3Bf_digital_type%3AApp 200 265ms",
        "@version" => "1",
      "@timestamp" => "2015-05-08T09:49:14.347Z",
            "host" => "seldlx20533",
       "timestamp" => "21/Apr/2015:00:00:02 +0200",
          "client" => "66.249.78.123",
        "hostname" => "www.mydomain.no",
      "jsessionId" => "A7CEE20E7E0A5F1C5727E0036B868422",
            "node" => "agap67",
          "method" => "GET",
         "request" => "/myuri/foobar/search-result.action?richList=false&projectRows=50&productRows=50&authorRows=50&articleRows=50&otherFacets=ft_academical_subject%3ASamfunnsvitenskap%7C%3B%3Bf_digital_type%3AApp",
            "size" => "200",
    "responsetime" => "265ms"
}

Thanks for the feedback.

However, when I try the same thing as you, I get the following result. Could it be an issue with my installation?

$ /opt/logstash/bin/logstash -f test.config < data
{
       "message" => "[21/Apr/2015:00:00:02 +0200] - 66.249.78.123 www.mydomain.no A7CEE20E7E0A5F1C5727E0036B868422.agap67 - GET /myuri/foobar/search-result.action?richList=false&projectRows=50&productRows=50&authorRows=50&articleRows=50&otherFacets=ft_academical_subject%3ASamfunnsvitenskap%7C%3B%3Bf_digital_type%3AApp 200 265ms",
      "@version" => "1",
    "@timestamp" => "2015-05-08T09:57:07.746Z",
          "host" => "80eff25bcbe0",
          "tags" => [
        [0] "_grokparsefailure"
    ]
}

Resolved the issue:

Working message filter:

> match => [
>       "message", "\[%{HTTPDATE:timestamp}\] %{USERNAME:UserName} (?:%{NOTSPACE:client}|%{IP:client}) %{IPORHOST:hostname} %{WORD:jsessionId}.%{WORD:node} %{USERNAME:remoteUser} %{WORD:method} %{URIPATHPARAM:request} %{INT:http_status} %{WORD:responsetime}"
>     ]
  1. The access log prints '-' when a column has no value. For example, if %u (the authenticated remote user) is specified in the access log configuration, '-' is printed when no username is given. %{WORD:username} does not work in that case, because WORD only matches letters, digits and underscores. I used %{USERNAME} instead.
  2. For internal calls to Tomcat, the client IP was logged as '-'. I used "(?:%{NOTSPACE:client}|%{IP:client})" to handle that.
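
To illustrate the two points above, here is a minimal Python sketch of why '-' trips up WORD but not USERNAME or NOTSPACE. The three regular expressions are my assumption of how the stock grok patterns are defined in the Logstash grok-patterns file, so check them against your own installation. The helper name matches_whole is made up for this example. Grok itself runs on a different regex engine (Oniguruma) and does not anchor each field like this, so treat it as an approximation of what happens to a single space-delimited column.

```python
import re

# Assumed regex equivalents of three stock grok patterns
# (verify against the grok-patterns file shipped with your Logstash).
WORD = r"\b\w+\b"
USERNAME = r"[a-zA-Z0-9._-]+"
NOTSPACE = r"\S+"


def matches_whole(pattern: str, value: str) -> bool:
    """Return True if the pattern matches the entire column value."""
    return re.fullmatch(pattern, value) is not None


# A placeholder dash, a hypothetical username, and the client IP from the log line.
for value in ("-", "jdoe", "66.249.78.123"):
    print(
        value,
        "WORD:", matches_whole(WORD, value),
        "USERNAME:", matches_whole(USERNAME, value),
        "NOTSPACE:", matches_whole(NOTSPACE, value),
    )
```

Under these assumed definitions, NOTSPACE accepts both '-' and a dotted IP address, so the first alternative in "(?:%{NOTSPACE:client}|%{IP:client})" would probably be the one that matches in practice.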