Registry blows up after filebeat container restarts

Hello there,

I've already read a lot of docs and issues but haven't found any solution for my problem, so hopefully someone can help...

Our setup is pretty simple: filebeat runs as a DaemonSet in the Kubernetes cluster, so each node has a filebeat pod that picks up the logs of containers carrying a specific label and forwards them to Elasticsearch.
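The DaemonSet itself is nothing unusual; it looks roughly like the sketch below. The name, namespace, memory limit, and host path for the data volume are illustrative rather than our exact manifest — the relevant parts are the mounts for the container logs and for the filebeat data directory.

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: filebeat            # illustrative name
  namespace: logging        # illustrative namespace
spec:
  selector:
    matchLabels:
      app: filebeat
  template:
    metadata:
      labels:
        app: filebeat
    spec:
      serviceAccountName: filebeat
      containers:
      - name: filebeat
        image: docker.elastic.co/beats/filebeat:7.10.1
        args: ["-c", "/etc/filebeat.yml", "-e"]
        resources:
          limits:
            memory: 200Mi   # illustrative value; this is the limit the pod occasionally hits (OOM kill)
        volumeMounts:
        - name: config
          mountPath: /etc/filebeat.yml
          subPath: filebeat.yml
          readOnly: true
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
        - name: data
          mountPath: /usr/share/filebeat/data   # registry state lives under this path
      volumes:
      - name: config
        configMap:
          name: filebeat-config
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers
      - name: data
        hostPath:
          path: /var/lib/filebeat-data          # illustrative host path so the registry survives restarts
          type: DirectoryOrCreate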

The filebeat configuration is as follows:

setup.ilm:
  enabled: false
setup.template:
  enabled: false
  name: "%{[kubernetes][labels][elastic-index]}"
  pattern: "%{[kubernetes][labels][elastic-index]}-*"
processors:
- decode_json_fields:
    fields: ["message"]
    process_array: true
    max_depth: 10
    target: ""
    overwrite_keys: false
    add_error_key: false
- add_cloud_metadata: ~

filebeat.inputs:
- type: container
  paths:
  - '/var/lib/docker/containers/*/*.log'
  processors:
  - add_kubernetes_metadata:
      in_cluster: true
  - drop_event:
      when:
        not:
          regexp:
            kubernetes.labels.elastic-index: ".*"
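For reference, a minimal output.elasticsearch section that matches the template name/pattern above would look like the sketch below. This is a generic sketch with a placeholder host and date suffix, not our literal config:

output.elasticsearch:
  hosts: ["https://elasticsearch.example.com:9200"]   # placeholder host
  # index name built from the pod label, so it matches the "<label>-*" pattern above
  index: "%{[kubernetes][labels][elastic-index]}-%{+yyyy.MM.dd}"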

The problem we have is that when the filebeat pod gets restarted (after an OOM kill, for example), it goes into a kind of "crazy mode" and rewrites its registry every second. Repeatedly listing the registry directory (data/registry/filebeat inside the container) shows a new checkpoint file almost every time:

sh-4.2# ls
4167578795.json  active.dat  log.json  meta.json
sh-4.2# ls
4167601926.json  active.dat  log.json  meta.json
sh-4.2# ls
4167625055.json  active.dat  log.json  meta.json
sh-4.2# ls
4167648187.json  active.dat  log.json  meta.json
sh-4.2# ls
4167648187.json  active.dat  log.json  meta.json

and each of these checkpoint files contains thousands of lines:

sh-4.2# cat 4170146226.json | wc
   2705    2706 1147578

but they reference only a few distinct log files:

cat 4170146226.json | jq -r '.[].source' | sort -u | wc
      60      60    9960

For comparison, here is the same check on a node in the same cluster where filebeat is behaving normally:

sh-4.2# ls
4675893.json  active.dat  log.json  meta.json
sh-4.2# ls
4675893.json  active.dat  log.json  meta.json
sh-4.2# ls
4675893.json  active.dat  log.json  meta.json
sh-4.2# ls
4675893.json  active.dat  log.json  meta.json
sh-4.2# ls
4675893.json  active.dat  log.json  meta.json
sh-4.2# cat 4675893.json | wc
     35      36   14383

And of course the logs from the node with the "crazy" filebeat pod are being lost.

We're using:

image: docker.elastic.co/beats/filebeat
  imageTag: 7.10.1

Does anyone have an idea how to debug this (besides increasing the pod's memory limit so it doesn't get killed in the first place)?
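The only extra step I can think of on our side is turning up filebeat's own log verbosity, roughly like this (the wildcard selector is just the documented catch-all):

logging.level: debug
# "*" enables debug output from all components; this could be narrowed once we know which one to watch
logging.selectors: ["*"]

but that produces a lot of noise, and I'm not sure which component's messages would explain the registry churn.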
