Grokparsefailure with nginx logs

We have a custom nginx log format that looks like this:

172.16.23.132 - [172.16.23.132] - - [30/Apr/2020:20:49:22 +0000] \"GET /health HTTP/2.0\" 200 7 \"-\" \"curl/7.58.0\" 52 0.013 [foobar-apiservice-443] 172.24.12.90:9001 7 0.014 200 foobar

My grok filter looks like this:

%{IPORHOST:client_ip} - \[%{IPORHOST:x_forwarded_for}\] - %{DATA:remote_user} \[%{HTTPDATE:timestamp}\] \\"(?:%{WORD:verb} %{DATA:request}(?: HTTP/%{NUMBER:httpversion})?|-)\\" %{NUMBER:response} (?:%{NUMBER:bytes_sent;long}|-) \\"%{DATA:referrer}\\" \\"%{DATA:agent}\\" %{NUMBER:request_length;long} %{NUMBER:response_time} \[%{DATA:upstream_proxy}\] %{IPORHOST:upstream_addr}:%{POSINT:clientport} %{NUMBER:upstream_response_length} %{NUMBER:upstream_response_time} %{NUMBER:upstream_response} %{NOTSPACE:ingress_namespace}

The logs come from containers running in a Kubernetes cluster: Filebeat reads the Docker logs and sends them to Logstash. I thought Filebeat might be decoding the log incorrectly because the double quotes are escaped with backslashes, but the escape characters are already present in the Docker logs on each worker node.
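
If the backslashes really are part of the stored message, one option is to strip them before the grok filter runs so the pattern only has to match plain double quotes. This is a minimal sketch, assuming the raw line lands in the default message field:

```
filter {
  # Turn every \" back into a plain " before grok runs.
  # Assumes the escaped line is in the "message" field.
  mutate {
    gsub => [ "message", '\\"', '"' ]
  }
}
```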

I can run this through a couple of online grok debuggers and it passes, but when I put it in my Logstash config the logs get tagged with _grokparsefailure. Is there something I am missing here? How can I go about troubleshooting which field is causing the problem?

My answer to that is here: start with a pattern that matches just the first field on the line, then add one field at a time until the match starts failing; the last field you added is the one causing the problem. A sketch of that approach is below.
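
A minimal sketch of that incremental approach, assuming the raw line is in the default message field (the client_ip and rest capture names are only illustrative):

```
filter {
  grok {
    # Step 1: match only the first field and dump everything else into one capture.
    # If this matches, append the next piece of the full pattern and re-run,
    # repeating until the match breaks.
    match => { "message" => "%{IPORHOST:client_ip} %{GREEDYDATA:rest}" }
  }
}
```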
