Filebeat 6.1 in Kubernetes - unable to fetch logs from new pods


#1
  • Filebeat version: 6.1.2
  • Kubernetes nodes are running on Google's GKE
  • Kubernetes version: 1.8.5-gke.0

Context

The only difference is the output: instead of Elasticsearch, we are using Kafka, creating a new topic for each app label in Kubernetes. The diff is:

-    processors:
-      - add_cloud_metadata:
-
-    cloud.id: ${ELASTIC_CLOUD_ID}
-    cloud.auth: ${ELASTIC_CLOUD_AUTH}
-
-    output.elasticsearch:
-      hosts: ['${ELASTICSEARCH_HOST:elasticsearch}:${ELASTICSEARCH_PORT:9200}']
-      username: ${ELASTICSEARCH_USERNAME}
-      password: ${ELASTICSEARCH_PASSWORD}
+    output.kafka:
+      hosts: ["brokers.kafka.svc.cluster.local:9092"]
+      topic: '%{[kubernetes.labels.app]}'
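
For context, the resulting Kafka output section looks roughly like this (a sketch assembled from the diff above; every setting not shown here is left at its default):

    output.kafka:
      # Kafka brokers exposed inside the cluster
      hosts: ["brokers.kafka.svc.cluster.local:9092"]
      # One Kafka topic per Kubernetes app label
      topic: '%{[kubernetes.labels.app]}'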

The Problem
New pods' logs are not being picked up by Filebeat. They are only picked up if I delete the Filebeat pods; once the DaemonSet recreates them, the new logs are picked up.

As an example, if we create a sample Deployment in Kubernetes with 10 replica pods that write messages to stdout:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: log-test
  labels:
    app: my-custom-log
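
(For completeness, a fuller version of that Deployment might look like the sketch below; the busybox image and the echo loop are just assumptions to generate sample stdout logs:)

    apiVersion: extensions/v1beta1
    kind: Deployment
    metadata:
      name: log-test
      labels:
        app: my-custom-log
    spec:
      replicas: 10
      template:
        metadata:
          labels:
            app: my-custom-log
        spec:
          containers:
            - name: logger
              image: busybox
              # Write a sample log line to stdout every second
              command: ["sh", "-c", "while true; do echo sample log line; sleep 1; done"]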

We should see a new topic in Kafka named my-custom-log (the value of the app label). However, the logs for those new pods are not picked up, so no topic is created:

$ kubectl exec kclient -n kafka -- /usr/bin/kafka-topics --zookeeper zookeeper:2181 --list | grep my-custom-log
$ 

However, once I shut down one of the Filebeat pods:

$ kubectl delete pod -n kube-system filebeat-sf9kz
pod "filebeat-sf9kz" deleted

It gets recreated:

filebeat-sf9kz   1/1   Terminating         0     4m
filebeat-sf9kz   0/1   Terminating         0     4m
filebeat-twlgc   0/1   Pending             0     0s
filebeat-twlgc   0/1   ContainerCreating   0     0s
filebeat-twlgc   1/1   Running             0     1s

And now the topic is there as expected:

$ kubectl exec kclient -n kafka -- /usr/bin/kafka-topics --zookeeper zookeeper:2181 --list | grep my-custom-log
my-custom-log

Is there anything we can do to fix this situation?

Thanks!


(Carlos Pérez Aradros) #2

Hi @thesilence, could you share the log output of one of the not-working Filebeat containers? I'm also interested in the logs of the one that works after recreation.


#3

Hi @exekias thanks a lot for the blazing fast reply!

This is the log of one of the filebeat containers that is not working after creating new pods: https://pastebin.com/Dqvg08Fx

And this is the log of the new one that got created after I deleted the first one: https://pastebin.com/1uQeD6zG

Let me know if you need any further information; I'll be happy to share it.

Many thanks!!


(Carlos Pérez Aradros) #4

Hi @thesilence,

That's strange; I don't see any errors in the output. I'm wondering whether topic: '%{[kubernetes.labels.app]}' could be causing an issue. Could you do a quick test after removing it? I'm trying to rule out stalls caused by an empty value for kubernetes.labels.app.

Best regards


#5

Hello @exekias,

thanks for your reply. I have made the following change:

topic: test01
#topic: '%{[kubernetes.labels.app]}'

and the logs do get picked up, so this is most probably an issue with the Kafka output.

All the logs are now being sent to the test01 topic (including those from newly created pods that are constantly writing sample logs to stdout), see https://pastebin.com/HmsKvcQN

However, if you look at the metadata in those JSON logs, you can see that the kubernetes.labels.app label actually exists:

"labels":{"app":"deploy01","pod-template-hash":"4035586195"},

but it is not resolved by the topic: '%{[kubernetes.labels.app]}' configuration.

Is there anything you can think of that we might be missing here?

Also, remember that if I recreate the Filebeat pod, the topic with the expected label deploy01 is instantly created, so this seems to be a runtime problem with the Kafka output.

What do you think?

Many thanks again for the cooperation, I really appreciate the help!


(Carlos Pérez Aradros) #6

Hi, thank you for your feedback, we :elasticheart: detailed reports like this!

So this is what I think is happening: for some log events, the Kubernetes metadata is not in place. This can happen especially when we read logs from old containers that are no longer running; since we cannot retrieve info about them from Kubernetes, their events are sent unannotated.

I think you can detect that situation and set a default topic for those events, while using the label for the rest, by means of the topics setting, which allows you to define a list of rules:

    topic: 'default'
    topics:
      - topic: '%{[kubernetes.labels.app]}'
        when: 
          regexp:
            kubernetes.labels.app: '.*'
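
The regexp condition above matches any non-empty value of the label. A slightly more direct alternative is the has_fields condition, which simply checks that the field is present on the event (a sketch; behavior should be equivalent for these events):

    topic: 'default'
    topics:
      - topic: '%{[kubernetes.labels.app]}'
        when:
          # Route to the per-app topic only when the label was annotated
          has_fields: ['kubernetes.labels.app']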

#7

@exekias no problem, always happy to provide as much detail as possible! :joy:

:elasticheart: x100 for your help, that config change did the trick!!!

Now everything is working great, and on top of that, we will be able to identify pods without an app label in our deployments.

Many thanks again, it feels great to have this setup working like a charm now :wink:


(system) #8

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.