We have a custom nginx log that looks like this:
172.16.23.132 - [172.16.23.132] - - [30/Apr/2020:20:49:22 +0000] \"GET /health HTTP/2.0\" 200 7 \"-\" \"curl/7.58.0\" 52 0.013 [foobar-apiservice-443] 172.24.12.90:9001 7 0.014 200 foobar
My grok filter looks like this:
%{IPORHOST:client_ip} - \[%{IPORHOST:x_forwarded_for}\] - %{DATA:remote_user} \[%{HTTPDATE:timestamp}\] \\"(?:%{WORD:verb} %{DATA:request}(?: HTTP/%{NUMBER:httpversion})?|-)\\" %{NUMBER:response} (?:%{NUMBER:bytes_sent;long}|-) \\"%{DATA:referrer}\\" \\"%{DATA:agent}\\" %{NUMBER:request_length;long} %{NUMBER:response_time} \[%{DATA:upstream_proxy}\] %{IPORHOST:upstream_addr}:%{POSINT:clientport} %{NUMBER:upstream_response_length} %{NUMBER:upstream_response_time} %{NUMBER:upstream_response} %{NOTSPACE:ingress_namespace}
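For context, this is roughly how the pattern is wired into my Logstash pipeline (shortened here; <PATTERN ABOVE> stands in for the full grok string pasted above, and everything else is placeholder boilerplate, not my exact config):

    filter {
      grok {
        # <PATTERN ABOVE> = the full pattern from above, as one double-quoted string
        match => { "message" => "<PATTERN ABOVE>" }
      }
    }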
The logs come from containers running in a Kubernetes cluster; Filebeat reads the Docker logs and ships them to Logstash. I initially suspected Filebeat was decoding the log incorrectly, because the double quotes come through with escape characters, but the escape characters are already present in the Docker logs on each worker node.
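For reference, the Filebeat side looks roughly like this (the paths and the Logstash host are generic placeholders, not our exact values):

    filebeat.inputs:
    - type: container
      paths:
        - /var/lib/docker/containers/*/*.log

    output.logstash:
      hosts: ["logstash:5044"]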
I can run this through a couple of online grok debuggers and it matches, but when I put it in my Logstash config the logs get tagged with _grokparsefailure. Is there something I'm missing here? How can I go about troubleshooting which field is causing the problem?
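The only idea I have so far is to shrink the pattern and grow it one field at a time until it stops matching, e.g. starting with just the leading fields and a GREEDYDATA catch-all like this (a sketch, not something I've fully worked through yet):

    filter {
      grok {
        match => { "message" => "%{IPORHOST:client_ip} - \[%{IPORHOST:x_forwarded_for}\] - %{DATA:remote_user} \[%{HTTPDATE:timestamp}\] %{GREEDYDATA:rest}" }
      }
    }

Is there a faster way than bisecting the pattern by hand like that?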