Nginx-ingress grok expression does not handle multiple upstreams

When nginx-ingress retries a request against multiple upstreams, the grok expression fails to parse the log line. For us this manifested as missing data at times when we knew errors were happening.

This is on Filebeat 7.8.0, and the issue does not appear to be resolved in 7.9.0.

Sanitised log output:

2 upstreams attempted

some.domain 172.0.0.0 - - [24/Aug/2020:01:30:24 +0000] "GET https://some.domain/some/path HTTP/1.1" 101 10 "-" "Mozilla/5.0 (Linux; Android 10; SM-XXXX) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.125 Mobile Safari/537.36" 1475 143.855 [upstream-name] [] 10.0.0.1:443, 10.0.0.2:443 0, 0 0.100, 143.757 502, 101 c2c58cd42cb68822aae7d640ddf6583a

3 upstreams attempted

some.domain 172.0.0.0 - - [24/Aug/2020:01:28:53 +0000] "GET https://some.domain/some/path HTTP/2.0" 200 28 "https://switter.at/" "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.138 Safari/537.36" 60 0.683 [upstream-name] [] 10.0.0.1:443, 10.0.0.2:443, 10.0.0.3:443 0, 0, 39 0.096, 0.092, 0.496 502, 502, 200 d0f69894df9d4b231eca7773a6759ba2

The response code the client sees and the last upstream should be used as http.response.status_code and nginx.ingress_controller.upstream.ip/port, but I'm not sure how the other status codes and upstreams should be emitted. I don't think they should be thrown away, however.
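To make the idea concrete, here is a rough Python sketch (not the Filebeat pipeline itself, which uses grok) of how the tail of the line could be split into one entry per upstream attempt, with the last attempt singled out. The regex, the function name `parse_upstreams`, and the output keys are all hypothetical; it also assumes the lists are separated by ", " as in the samples above and treats "-" as a missing value.

```python
import re

# Matches only the tail of the access-log line, starting at "[upstream-name] []".
# Each of the four upstream fields is a ", "-separated list, one entry per attempt.
ITEM = r"[^\s,]+"
LIST = ITEM + r"(?:, " + ITEM + r")*"
TAIL_RE = re.compile(
    r"\[(?P<name>[^\]]*)\] \[(?P<alt_name>[^\]]*)\] "
    r"(?P<addrs>" + LIST + r") "
    r"(?P<lengths>" + LIST + r") "
    r"(?P<times>" + LIST + r") "
    r"(?P<statuses>" + LIST + r") "
    r"(?P<req_id>\S+)$"
)


def _num(value, cast):
    """Convert a field to a number; assume "-" means the value is missing."""
    return None if value == "-" else cast(value)


def parse_upstreams(line):
    """Return every upstream attempt plus the last one, or None if no match."""
    match = TAIL_RE.search(line.strip())
    if match is None:
        return None
    columns = [
        match.group(key).split(", ")
        for key in ("addrs", "lengths", "times", "statuses")
    ]
    attempts = []
    for addr, length, time, status in zip(*columns):
        ip, _, port = addr.rpartition(":")
        attempts.append({
            "ip": ip or None,
            "port": _num(port, int),
            "response_length": _num(length, int),
            "response_time": _num(time, float),
            "status_code": _num(status, int),
        })
    return {
        "upstream_name": match.group("name"),
        "attempts": attempts,    # all attempts, in order
        "last": attempts[-1],    # the attempt that produced the client response
        "req_id": match.group("req_id"),
    }


# Tail of the "3 upstreams attempted" sample above.
SAMPLE_TAIL = (
    "[upstream-name] [] 10.0.0.1:443, 10.0.0.2:443, 10.0.0.3:443 "
    "0, 0, 39 0.096, 0.092, 0.496 502, 502, 200 "
    "d0f69894df9d4b231eca7773a6759ba2"
)
result = parse_upstreams(SAMPLE_TAIL)
print(result["last"])
```

With something like this, `result["last"]` would feed nginx.ingress_controller.upstream.ip/port, while the full `attempts` list keeps the earlier 502s available instead of dropping them.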

Hey @chendo, welcome to Discuss :slight_smile:

This looks like an issue with the current pipeline. Could you please create a new issue? It would be great if you could include some example logs in it.

Thanks for reporting!

Issue filed at https://github.com/elastic/beats/issues/20813
