Filebeat used for k8s autodiscovery and file logs is sending duplicated events

Hi! I can see that "duplicated events" is a very common issue, but I cannot find a solution for the problem I have, which looks like a fairly generic one IMHO.

I am using Filebeat in a Kubernetes cluster (taking care of Kubernetes autodiscovery and file log extraction), so I have 8 instances created by the DaemonSet, that is, one per node.

It looks like the file log extraction is replicated 8 times, once per node. Is there any easy way of solving this?
Otherwise, I can foresee 3 solutions:

  • Deploy one instance that takes care of everything (if k8s autodiscovery can work from a single node across the whole cluster).
  • Keep these 8 instances + 1 specific instance for file log extraction.
  • Stop using the add_id processor and use the fingerprint processor instead (see the sketch after this list), but it would be a waste of energy to process and then drop 7 out of the 8 file reads.
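
For the record, the fingerprint approach from the last bullet would look roughly like this (a sketch; the field choice is my assumption). The hash is stored in @metadata._id, which the Elasticsearch output uses as the document _id, so the 7 duplicate copies overwrite the first one instead of piling up:

    processors:
      - fingerprint:
          fields: ["message", "log.file.path"]  # assumed identical across all 8 reads of the same line
          target_field: "@metadata._id"         # the Elasticsearch output uses this as the document _id
          method: sha256

But as said, this only deduplicates at index time: all 8 instances would still read, process, and ship every line.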

Hi @Chexpir!

What do you mean by file log extraction? Could you provide your configuration? Usually each DaemonSet pod autodiscovers pods/containers only on the node where it runs. I don't see any way to discover containers from a different node :thinking:.

Regards.

@ChrsMark sorry for not being specific enough. I mean I have 2 inputs: k8s autodiscover (which works perfectly) and file log extraction (which sends each event 8 times). Filebeat is deployed with the standard DaemonSet (so one instance per node), and the file logs are read once per node.

    filebeat.inputs:
      - type: log
        paths:
          - "/logs*/*/karaf/logs/tesb.log*"
        fields_under_root: true
        multiline.pattern: '^[0-9]{4}-[0-9]{2}-[0-9]{2}'
        multiline.negate: true
        multiline.match: after
      - type: log
        paths:
          - "/logs*/*/Interfaces/logs/*"
        fields_under_root: true
        multiline.pattern: '^[0-9]{4}-[0-9]{2}-[0-9]{2}'
        multiline.negate: true
        multiline.match: after
    filebeat.modules:
    - module: activemq
      audit:
        enabled: true
        var.paths: ["/logs*/*/activemq/logs/audit.log*"]
      log:
        enabled: true
        var.paths: ["/logs*/*/activemq/logs/activemq.log*"]
    - module: apache
      access:
        enabled: true
        var.paths: ["/logs*/*/*ui-*/logs/access.log*"]
      error:
        enabled: true
        var.paths: ["/logs*/*/*ui-*/logs/error.log*"]

    filebeat.autodiscover:
      providers:
        - type: kubernetes
          node: ${NODE_NAME}
          hints.enabled: true
          hints.default_config:
            type: container
            paths:
              - /var/log/containers/*${data.kubernetes.container.id}.log 

I was waiting for an answer to this, but I believe the best solution, if I cannot do autodiscovery from different nodes, is the following:
"keep these 8 instances for k8s autodiscover + 1 new specific Filebeat instance for file log extraction"

So the logs from which you see the duplicated events are being collected from the host? It seems that, yes, you need only one Filebeat instance to handle this cluster-wide input.
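
A minimal sketch of that single-instance setup, assuming your /logs* directories live on shared storage (which the 8-way duplication suggests) and using hypothetical names for the Deployment, ConfigMap, and PVC:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: filebeat-files              # hypothetical name
    spec:
      replicas: 1                       # exactly one reader, so each file is shipped once
      selector:
        matchLabels:
          app: filebeat-files
      template:
        metadata:
          labels:
            app: filebeat-files
        spec:
          containers:
            - name: filebeat
              image: docker.elastic.co/beats/filebeat:7.17.0  # match the version your DaemonSet runs
              args: ["-c", "/etc/filebeat.yml", "-e"]
              volumeMounts:
                - name: config
                  mountPath: /etc/filebeat.yml
                  subPath: filebeat.yml
                - name: shared-logs     # hypothetical claim holding the /logs* trees
                  mountPath: /logs      # adjust so the /logs*/... globs in your inputs resolve
                  readOnly: true
          volumes:
            - name: config
              configMap:
                name: filebeat-files-config   # holds only the log inputs and modules you posted
            - name: shared-logs
              persistentVolumeClaim:
                claimName: shared-logs

The DaemonSet would then keep only the autodiscover provider, and this Deployment only the filebeat.inputs and filebeat.modules sections, so nothing is read twice.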

Note that you can always define modules in autodiscover: https://www.elastic.co/guide/en/beats/filebeat/current/configuration-autodiscover.html#_kubernetes. For instance:

    filebeat.autodiscover:
      providers:
        - type: kubernetes
          templates:
            - condition:
                equals:
                  kubernetes.container.image: "redis"
              config:
                - module: redis
                  log:
                    input:
                      type: container
                      paths:
                        - /var/log/containers/*-${data.kubernetes.container.id}.log