Filebeat 7.4.0 does not recover when it fails to connect to the k8s API

I am using the filebeat elastic helm chart https://github.com/elastic/helm-charts/tree/master/filebeat under an Istio service mesh.

As of filebeat 7.4.0, with the new k8s client (Update kubernetes watcher to use official client-go libraries by vjsamuel · Pull Request #13051 · elastic/beats · GitHub), filebeat starts faster than the Istio sidecar, which blocks outbound requests to the k8s API. As a result, filebeat never recovers the k8s connection and I lose all k8s metadata on my log events.

2019-10-17T20:33:29.733Z ERROR kubernetes/util.go:85 kubernetes: Querying for pod failed with error: Get https://10.100.0.1:443/api/v1/namespaces/bootstrap/pods/bootstrap-filebeat-x-pj8n9: dial tcp 10.100.0.1:443: connect: connection refused
E1017 20:33:29.734464 1 reflector.go:125] github.com/elastic/beats/libbeat/common/kubernetes/watcher.go:235: Failed to list *v1.Pod: Get https://10.100.0.1:443/api/v1/pods?fieldSelector=spec.nodeName%3Dlocalhost&limit=500&resourceVersion=0: dial tcp 10.100.0.1:443: connect: connection refused

I found a workaround by applying a sleep before starting filebeat, but I don't want it to be permanent.
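To illustrate the kind of workaround meant here, below is a minimal sketch that waits until the API server accepts TCP connections instead of sleeping for a fixed time. It is not part of filebeat or the helm chart. The helper names `api_reachable` and `wait_until` and the timeout values are assumptions made for this example.

```python
import socket
import time


def api_reachable(host, port, connect_timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=connect_timeout):
            return True
    except OSError:
        # Covers "connection refused" as well as timeouts.
        return False


def wait_until(probe, timeout=120.0, interval=2.0,
               sleep=time.sleep, clock=time.monotonic):
    """Call probe() until it returns True or `timeout` seconds have elapsed."""
    deadline = clock() + timeout
    while True:
        if probe():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


# Hypothetical usage inside the pod, before launching filebeat. Kubernetes
# injects KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT into containers:
#
#   import os
#   host = os.environ["KUBERNETES_SERVICE_HOST"]
#   port = int(os.environ["KUBERNETES_SERVICE_PORT"])
#   ready = wait_until(lambda: api_reachable(host, port))
```

This only narrows the startup race. It does not help if the connection to the API is lost again later.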

A similar issue is described in: Kubernetes autodiscover provider fails silently if it can't connect to k8s · Issue #13081 · elastic/beats · GitHub

Will a retry or exponential backoff be added in upcoming versions?
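To make the request concrete, here is a rough sketch of what retrying with exponential backoff could look like. It is illustrative only and does not describe how beats or client-go behave. The function name `retry_with_backoff`, its parameters, and the defaults are all assumptions for this example.

```python
import random
import time


def retry_with_backoff(operation, max_attempts=6, base_delay=1.0, max_delay=30.0,
                       sleep=time.sleep, jitter=random.random):
    """Call operation() until it succeeds, doubling the wait after each failure.

    Only ConnectionError (which includes "connection refused") is retried.
    The last failure is re-raised once max_attempts is exhausted.
    """
    for attempt in range(max_attempts):
        try:
            return operation()
        except ConnectionError:
            if attempt == max_attempts - 1:
                raise
            # 1s, 2s, 4s, ... capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Jitter scales the wait to between 50% and 100% of the delay.
            sleep(delay * (0.5 + 0.5 * jitter()))
```

With these defaults, a client that starts before the sidecar is ready would keep trying for roughly half a minute before giving up, instead of failing on the first refused connection.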

Hi @GreenKnight15,

let me try to reproduce / follow the code.
I'm not sure whether beats has a policy of retrying or failing; it looks like that is mostly dictated by the library being used.

If you'd like, open an issue where that behaviour can be discussed while I figure out how it currently works.

Hi again @GreenKnight15 ,

there are 3 potential features affected by this issue:

  • autodiscover
  • add kubernetes metadata
  • a watcher that is set up for kubernetes metricsets to enrich events

I think we can do something there, but we need to get the product designers involved. Can you please create a GH issue?

thanks!

I went ahead and opened a GH ticket.
