Kubernetes Events MetricBeat filter duplicate entries

Having set up a default ELK stack on a Kubernetes cluster, I deployed Metricbeat with the defaults as well. Everything works, except that I get duplicate entries for the same data because of the DaemonSet (I have 3 worker nodes on this test cluster).

How can I prevent these duplicate entries from entering the index? Is there a way to do this with just the Metricbeat client, or does it require Logstash?

As a follow-up, I also see duplicates of other events: since the sampling period is 10 seconds, an event like a container CrashLoopBackOff is reported multiple times. I would like to avoid creating these documents in Elasticsearch, to minimize the amount of data in the index.

Hi!

The event metricset should be enabled in a Kubernetes Deployment of Metricbeat and not in the DaemonSet, since its scope is cluster-wide. If you are using the latest versions of the Metricbeat manifests, you can use leader election so that only one Metricbeat instance of the DaemonSet enables this metricset. See beats/metricbeat-kubernetes.yaml at master · elastic/beats · GitHub

Hello Chris, thanks for your reply. This is a new cluster on AWS EKS, using Metricbeat v7.11.2, installed with the defaults:

helm install metricbeat elastic/metricbeat

No custom settings or `--set` flags, just the above command. It installs the DaemonSet by default. However, it always sends the metrics duplicated, so maybe the leader election is not working? Any idea how to check for that?

Ok I see.

You will need to update helm-charts/values.yaml at master · elastic/helm-charts · GitHub and remove

      - module: kubernetes
        enabled: true
        metricsets:
          - event

from the DaemonSet config and move it under the Deployment config at helm-charts/values.yaml at master · elastic/helm-charts · GitHub.
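A sketch of what the resulting values.yaml change could look like (the `deployment.metricbeatConfig` structure is assumed from the elastic/helm-charts metricbeat chart; verify the exact keys against your chart version):

    # values.yaml (sketch -- verify key names against your chart version)
    deployment:
      metricbeatConfig:
        metricbeat.yml: |
          metricbeat.modules:
            - module: kubernetes
              enabled: true
              metricsets:
                - event

With the event metricset only in the Deployment (a single replica), only one Metricbeat instance reports cluster-wide events.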

I think it shouldn't be like this by default. It would be nice if you could open a GitHub issue in the repo, so the folks maintaining the charts can evaluate this better.


Thanks once more Chris.

Even with that setting I was still getting duplicate entries. In the end, the only way I was able to get each event reported only once was to add the following fingerprint processor:

      - module: kubernetes
        enabled: true
        metricsets:
          - event
        period: 5s
        hosts: ["${KUBE_STATE_METRICS_HOSTS}"]
        processors:
          - fingerprint:
              fields: ["kubernetes.event.timestamp.first_occurrence","kubernetes.event.count","kubernetes.event.involved_object.name","kubernetes.event.message","kubernetes.event.metadata.namespace"]
              target_field: "@metadata._id"

I am not sure this is the best approach, but it works and does not seem to have bad side effects. Because the fingerprint is written to `@metadata._id`, repeated reports of the same event get the same document ID, so Elasticsearch overwrites the existing document instead of creating a new one. I just want to store the Kubernetes events in a compact way for auditing.
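Conceptually, the fingerprint processor hashes the selected fields into a stable ID, so two reports of the same event produce the same document ID. A rough sketch in Python (the flat dotted keys and SHA-256 hashing are illustrative simplifications, not Metricbeat's exact algorithm):

```python
import hashlib

def fingerprint(event: dict, fields: list[str]) -> str:
    # Concatenate the selected field values and hash them, so that
    # repeated reports of the same event yield the same ID.
    values = [str(event.get(f, "")) for f in fields]
    return hashlib.sha256("|".join(values).encode()).hexdigest()

fields = [
    "kubernetes.event.timestamp.first_occurrence",
    "kubernetes.event.count",
    "kubernetes.event.involved_object.name",
    "kubernetes.event.message",
    "kubernetes.event.metadata.namespace",
]

# Hypothetical event, reported twice in consecutive sampling periods.
a = {
    "kubernetes.event.message": "Back-off restarting failed container",
    "kubernetes.event.involved_object.name": "my-pod",
    "kubernetes.event.count": 3,
}
b = dict(a)  # same event, reported again

# Same field values -> same ID -> Elasticsearch overwrites, not duplicates.
assert fingerprint(a, fields) == fingerprint(b, fields)
```

Note that `kubernetes.event.count` is among the hashed fields, so a new occurrence (which increments the count) still produces a new document; only identical reports collapse into one.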

I'll leave this here, but let me know if there is a better solution.