Some info does not get shipped with MetricsBeat from AWS EKS

Hello there,
As we would have the EKS to be monitored by Elastic, we deploy DaemonSet following this url.

Metricbeat

However some of the metrics cannot be shipped nor displayed on dashboards.

apiVersion: v1
kind: ServiceAccount
metadata:
  name: metricbeat
  namespace: elastic
  labels:
    k8s-app: metricbeat
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: metricbeat
  labels:
    k8s-app: metricbeat
rules:
- apiGroups: [""]
  resources:
  - nodes
  - namespaces
  - events
  - pods
  - services
  - persistentvolumes
  - persistentvolumeclaims
  verbs: ["get", "list", "watch"]
# Enable this rule only if planing to use Kubernetes keystore
#- apiGroups: [""]
#  resources:
#  - secrets
#  verbs: ["get"]
- apiGroups: ["extensions"]
  resources:
  - replicasets
  verbs: ["get", "list", "watch"]
- apiGroups: ["apps"]
  resources:
  - statefulsets
  - deployments
  - replicasets
  - daemonsets
  verbs: ["get", "list", "watch"]
- apiGroups: ["batch"]
  resources:
  - jobs
  - cronjobs
  verbs: ["get", "list", "watch"]
- apiGroups: ["storage.k8s.io"]
  resources:
    - storageclasses
  verbs: ["get", "list", "watch"]
- apiGroups:
  - ""
  resources:
  - nodes/stats
  verbs:
  - get
- nonResourceURLs:
  - "/metrics"
  verbs:
  - get
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: metricbeat
  # should be the namespace where metricbeat is running
  namespace: elastic
  labels:
    k8s-app: metricbeat
rules:
  - apiGroups:
      - coordination.k8s.io
    resources:
      - leases
    verbs: ["get", "create", "update"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: metricbeat-kubeadm-config
  namespace: elastic
  labels:
    k8s-app: metricbeat
rules:
  - apiGroups: [""]
    resources:
      - configmaps
    resourceNames:
      - kubeadm-config
    verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: metricbeat
subjects:
- kind: ServiceAccount
  name: metricbeat
  namespace: elastic
roleRef:
  kind: ClusterRole
  name: metricbeat
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: metricbeat
  namespace: elastic
subjects:
  - kind: ServiceAccount
    name: metricbeat
    namespace: elastic
roleRef:
  kind: Role
  name: metricbeat
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: metricbeat-kubeadm-config
  namespace: elastic
subjects:
  - kind: ServiceAccount
    name: metricbeat
    namespace: elastic
roleRef:
  kind: Role
  name: metricbeat-kubeadm-config
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: metricbeat-daemonset-config
  namespace: elastic
  labels:
    k8s-app: metricbeat
data:
  metricbeat.yml: |-
    metricbeat.config.modules:
      # Mounted `metricbeat-daemonset-modules` configmap:
      path: ${path.config}/modules.d/*.yml
      # Reload module configs as they change:
      reload.enabled: false

    metricbeat.autodiscover:
      providers:
        - type: kubernetes
          scope: cluster
          node: ${NODE_NAME}
          # In large Kubernetes clusters consider setting unique to false
          # to avoid using the leader election strategy and
          # instead run a dedicated Metricbeat instance using a Deployment in addition to the DaemonSet
          unique: true
          templates:
            - config:
                - module: kubernetes
                  #hosts: ["metrics-server:443"]
                  hosts: ["https://metrics-server.kube-system.svc.cluster.local"]
                  period: 10s
                  add_metadata: true
                  metricsets:
                    - state_node
                    - state_deployment
                    - state_daemonset
                    - state_replicaset
                    - state_pod
                    - state_container
                    - state_job
                    - state_cronjob
                    - state_resourcequota
                    - state_statefulset
                    - state_service
                    - state_persistentvolume
                    - state_persistentvolumeclaim
                    - state_storageclass
                  # If `https` is used to access `kube-state-metrics`, uncomment following settings:
                  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
                  ssl.certificate_authorities:
                    - /var/run/secrets/kubernetes.io/serviceaccount/service-ca.crt
                - module: kubernetes
                  metricsets:
                    - apiserver
                  hosts: ["https://${KUBERNETES_SERVICE_HOST}:${KUBERNETES_SERVICE_PORT}"]
                  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
                  ssl.certificate_authorities:
                    - /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
                  period: 30s
                # Uncomment this to get k8s events:
                - module: kubernetes
                  metricsets:
                    - event
        # To enable hints based autodiscover uncomment this:
        #- type: kubernetes
        #  node: ${NODE_NAME}
        #  hints.enabled: true

    processors:
      - add_cloud_metadata:

    #cloud.id: ${elastic_CLOUD_ID}
    #cloud.auth: ${elastic_CLOUD_AUTH}

    output.elasticsearch:
      hosts: ${elasticSEARCH_HOST}
      username: ${elasticSEARCH_USERNAME}
      password: ${elasticSEARCH_PASSWORD}
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: metricbeat-daemonset-modules
  namespace: elastic
  labels:
    k8s-app: metricbeat
data:
  system.yml: |-
    - module: system
      period: 10s
      metricsets:
        - cpu
        - load
        - memory
        - network
        - process
        - process_summary
        #- core
        #- diskio
        #- socket
      processes: ['.*']
      process.include_top_n:
        by_cpu: 5      # include top 5 processes by CPU
        by_memory: 5   # include top 5 processes by memory
      processors:
      - add_fields:
          target: orchestrator
          fields:
            cluster.name: dcp-eks-d2chk-nonprod
    - module: system
      period: 1m
      metricsets:
        - filesystem
        - fsstat
      processors:
      - add_fields:
          target: orchestrator
          fields:
            cluster.name: dcp-eks-d2chk-nonprod
      - drop_event.when.regexp:
          system.filesystem.mount_point: '^/(sys|cgroup|proc|dev|etc|host|lib|snap)($|/)'
  kubernetes.yml: |-
    - module: kubernetes
      metricsets:
        - node
        - system
        - pod
        - container
        - volume
      period: 10s
      host: ${NODE_NAME}
      hosts: ["https://${NODE_NAME}:10250"]
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      ssl.verification_mode: "none"
      processors:
      - add_fields:
          target: orchestrator
          fields:
            cluster.name: dcp-eks-d2chk-nonprod
      # If there is a CA bundle that contains the issuer of the certificate used in the Kubelet API,
      # remove ssl.verification_mode entry and use the CA, for instance:
      #ssl.certificate_authorities:
      #  - /var/run/secrets/kubernetes.io/serviceaccount/service-ca.crt
    - module: kubernetes
      metricsets:
        - proxy
      period: 10s
      host: ${NODE_NAME}
      hosts: ["localhost:10249"]
      processors:
      - add_fields:
          target: orchestrator
          fields:
            cluster.name: dcp-eks-d2chk-nonprod
      # If using Red Hat OpenShift should be used this `hosts` setting instead:
      # hosts: ["localhost:29101"]
    - module: kubernetes
      metricsets:
        - event
      period: 30s
    - module: kubernetes
      metricsets:
        - state_node
        - state_daemonset
        - state_deployment
        - state_replicaset
        - state_statefulset
        - state_pod
        - state_container
        - state_job
        - state_cronjob
        - state_resourcequota
        - state_service
        - state_persistentvolume
        - state_persistentvolumeclaim
        - state_storageclass
      #hosts: ["metrics-server:443"]
      hosts: ["https://metrics-server.kube-system.svc.cluster.local"]
      add_metadata: true
---
# Deploy a Metricbeat instance per node for node metrics retrieval
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: metricbeat
  namespace: elastic
  labels:
    k8s-app: metricbeat
spec:
  selector:
    matchLabels:
      k8s-app: metricbeat
  template:
    metadata:
      labels:
        k8s-app: metricbeat
    spec:
      serviceAccountName: metricbeat
      terminationGracePeriodSeconds: 30
      hostNetwork: true
      dnsPolicy: ClusterFirstWithHostNet
      containers:
      - name: metricbeat
        image: docker.elastic.co/beats/metricbeat:8.7.0
        args: [
          "-c", "/etc/metricbeat.yml",
          "-e",
          "-system.hostfs=/hostfs",
        ]
        env:
        - name: elasticSEARCH_HOST
          value: https://xxxx.es.vpce.ap-southeast-1.aws.elastic-cloud.com:443
        - name: elasticSEARCH_PORT
          value: "443"
        - name: elasticSEARCH_USERNAME
          value: elastic
        - name: elasticSEARCH_PASSWORD
          value: xxxx
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        securityContext:
          runAsUser: 0
        resources:
          limits:
            memory: 1000Mi
          requests:
            cpu: 100m
            memory: 100Mi
        volumeMounts:
        - name: config
          mountPath: /etc/metricbeat.yml
          readOnly: true
          subPath: metricbeat.yml
        - name: data
          mountPath: /usr/share/metricbeat/data
        - name: modules
          mountPath: /usr/share/metricbeat/modules.d
          readOnly: true
        - name: proc
          mountPath: /hostfs/proc
          readOnly: true
        - name: cgroup
          mountPath: /hostfs/sys/fs/cgroup
          readOnly: true
      volumes:
      - name: proc
        hostPath:
          path: /proc
      - name: cgroup
        hostPath:
          path: /sys/fs/cgroup
      - name: config
        configMap:
          defaultMode: 0640
          name: metricbeat-daemonset-config
      - name: modules
        configMap:
          defaultMode: 0640
          name: metricbeat-daemonset-modules
      - name: data
        hostPath:
          # When metricbeat runs as non-root user, this directory needs to be writable by group (g+w)
          path: /var/lib/metricbeat-data
          type: DirectoryOrCreate
---

Thank you!

Hey Jeff.

Your configuration does not look exactly like the suggested one in the URL.
First of all you have enabled state-metrics metricsets twice using autodiscover kubernetes provider and later on as a module.
Problem with enabling it as a module is that each metricbeat instance will query for state-metrics which are cluster wide and will end up to duplicates.
But the problem in your case is that you use metrics-server to fetch the state-metrics. But metrics-servers does not provide those. You need to deploy https://github.com/kubernetes/kube-state-metrics to get those metrics you are missing.

Hi @Michalis_Katsoulis ,

Thanks a lot for your advice, after deploying the kube-state-metrics (instead of the one suggested by AWS), the event are shown.

Also, the cluster name's processor needs to placed under metricbeat.autodiscover.
Screenshot 2023-05-11 at 11.59.19

Thanks again!
Jeff

Hi Jeff! Great.

About the add_fields processor it is needed for now to add the cluster_name to the events. Reason is that this information cannot be retrieved by AWS metadata. So add_cloud_metadata does not add the cluster_name.
We are working on a solution which will fix this and the add_cloud_metadata processor will add the cloud.orchestrator.cluster.name. Coming in 8.9.0 release.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.