[Metricbeat] Can't apply rollover when creating indexes

Hello,

I have Metricbeat deployed in a Kubernetes cluster, and I'm trying to create, for each Kubernetes namespace, a Metricbeat index following this pattern: metricbeat-%{[agent.version]}-%{[kubernetes.namespace]}-{now/d}-000001. This is the configuration I used:

    output.elasticsearch:
      hosts: ['elasticsearch:9200']
      username: "user"
      password: "dontchangeme"
      protocol: "https"
      ssl.enabled: true
      ssl.verification_mode: none
      index: "metricbeat-%{[agent.version]}-%{[kubernetes.namespace]}"

    setup.dashboards.enabled: true

    # Uncomment below lines for metricbeat segregation
    # set ilm template name (use the default one pushed from metricbeat)
    setup.template.name: "metricbeat-%{[agent.version]}-%{[kubernetes.namespace]}"
    setup.template.pattern: "metricbeat-%{[agent.version]}-%{[kubernetes.namespace]}*"
    #setup.template.enabled: false
    setup.ilm.enabled: false
    #setup.template.overwrite: true
    # setup ilm policy for customized metricbeat indices
    setup.ilm.policy_name: "metricbeat"
    setup.ilm.rollover_alias: 'metricbeat-%{[agent.version]}-%{[kubernetes.namespace]}'
    setup.ilm.pattern: "{now/d}-000001"

What I expect:

I am expecting, for each namespace, an index created with ILM and rollover applied.

What I am getting:

The above configuration creates an index for each namespace, but without applying the rollover, and the template is not created for each index.
For every index created I get the same error message when running GET indexname/_ilm/explain:
index.lifecycle.rollover_alias [metricbeat-7.17.1] does not point to index [metricbeat-7.17.1-my-namespace]

My questions are:

  1. First, is what I am trying to do feasible?
  2. Why do the above parameters exist if they don't give us what we want?
  3. How can rollover be configured automatically from metricbeat?
  4. I tried to create an index template and delete the existing index so the new template with the proper rollover alias value would apply, but metricbeat doesn't recreate the index (see the sketch just below this list for the kind of manual bootstrap I mean).
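
For reference, manually bootstrapping a write index behind a rollover alias looks roughly like this. This is only a sketch: the names are examples taken from the error message above, and the index would also need a matching index.lifecycle.rollover_alias setting.

    PUT metricbeat-7.17.1-my-namespace-2022.06.10-000001
    {
      "aliases": {
        "metricbeat-7.17.1-my-namespace": {
          "is_write_index": true
        }
      }
    }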

hi @marone

Short answer: no, AFAICT you cannot do what you want with 7.17.x the way you want to, but I think you may be able to with 8.x using data streams.

Let me show you what I can do / what I understand.

Question for later: are you really trying to create a completely separate metricbeat template (fields, datatypes, settings) for each namespace? If so, let's get back to that later. I am going to assume you want to use the standard metricbeat template, so let's first solve the rollover issue.

In 7.17.x (note I am showing only the necessary settings):

First and foremost, you are not enabling the ILM settings you have configured, so they will not be used.

setup.ilm.enabled: false
needs to be
setup.ilm.enabled: true

OK, so let's look at that.

So here are my full settings

setup.template.enabled : false
setup.ilm.enabled: true
setup.ilm.policy_name: "metricbeat"
setup.ilm.rollover_alias: "metricbeat-%{[agent.version]}-%{[kubernetes.namespace]}"
setup.ilm.pattern: "{now/d}-000001"

output.elasticsearch:
  hosts: ["localhost:9200"]
  # Note I did not set the index because the rollover write alias will be used automatically 

if you run

metricbeat setup -e or metricbeat -e

you will get

2022-06-10T07:19:47.030-0700    ERROR   instance/beat.go:1014   Exiting: failed to read the ilm rollover alias: key not found
Exiting: failed to read the ilm rollover alias: key not found

That is because it appears metricbeat does not know %{[kubernetes.namespace]} at setup time / initial config, before the harvesting starts. I tried other fields like host.name or agent.hostname etc.; none of those work either, as they do not appear to be available at initialization.
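
If all you need is a per-instance value (not the per-event kubernetes.namespace), one workaround sketch, which I have not verified here, is to inject it through an environment variable, since ${VAR} expansion does happen at startup:

setup.ilm.rollover_alias: "metricbeat-%{[agent.version]}-${MY_IDENTIFIER}"
# MY_IDENTIFIER is a hypothetical variable you would set on the metricbeat pod/host;
# this still does not give you a value per monitored namespace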

if you put a static value in there

setup.template.enabled : false
setup.ilm.enabled: true
setup.ilm.policy_name: "metricbeat"
setup.ilm.rollover_alias: "metricbeat-%{[agent.version]}-my-static-identifier"
setup.ilm.pattern: "{now/d}-000001"

output.elasticsearch:
  hosts: ["localhost:9200"]
  # Note I did not set the index because the rollover write alias will be used automatically 

Everything works as desired (well, without the Kubernetes namespace).

GET _cat/indices/me*?v
health status index                                                    uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   metricbeat-7.17.3-my-static-identifier-2022.06.10-000001 -QdUswKcSXuY7GkWTy_Uig   1   1        444            0    525.8kb        525.8kb
GET metricbeat-7.17.3-my-static-identifier-2022.06.10-000001/_ilm/explain
#! Elasticsearch built-in security features are not enabled. Without authentication, your cluster could be accessible to anyone. See https://www.elastic.co/guide/en/elasticsearch/reference/7.17/security-minimal-setup.html to enable security.
{
  "indices" : {
    "metricbeat-7.17.3-my-static-identifier-2022.06.10-000001" : {
      "index" : "metricbeat-7.17.3-my-static-identifier-2022.06.10-000001",
      "managed" : true,
      "policy" : "metricbeat",
      "lifecycle_date_millis" : 1654872214297,
      "age" : "2.71m",
....

So it does not look like you can do what you are trying to do in 7.17.x

I think you can do it with data streams in 8.x... let me know if you want me to show you how I think it will work in 8.x, but if you are not going to upgrade then there is no need.

Hope this helps just a bit...


Hello @stephenb,

Thanks for your answer; I was suspecting the same and you confirmed it. As for the likely solution in v8: yes, that would be nice, as we will migrate to version 8.x in a few months.

@marone

In 8.2.2, this is what I got to work.

I do not have a Kubernetes cluster, so I used host.name, which is hyperion.

setup.template.enabled : false
setup.ilm.enabled: true
setup.ilm.policy_name: "metricbeat"
setup.ilm.rollover_alias: "metricbeat-%{[agent.version]}-%{[host.name]}"
setup.ilm.pattern: "{now/d}-000001"

output.elasticsearch:
  # Array of hosts to connect to.
  hosts: ["localhost:9200"]
  index: "metricbeat-%{[agent.version]}-%{[host.name]}" <!-- Data Stream Name

This worked... It seems that perhaps the metadata is more readily available in 8.x.

Created Data Stream

metricbeat-8.2.2-hyperion

Created underlying index
.ds-metricbeat-8.2.2-hyperion-2022.06.10-000001
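
If you want to double check, the data stream and its backing indices can be inspected with something like this (the name matches my test; yours will differ):

GET _data_stream/metricbeat-8.2.2-hyperion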

Rollover worked when I did it manually.

POST metricbeat-8.2.2-hyperion/_rollover

{
  "acknowledged" : true,
  "shards_acknowledged" : true,
  "old_index" : ".ds-metricbeat-8.2.2-hyperion-2022.06.10-000001",
  "new_index" : ".ds-metricbeat-8.2.2-hyperion-2022.06.10-000002",
  "rolled_over" : true,
  "dry_run" : false,
  "conditions" : { }
}

So that all appears to work.

One word of caution: if you have many namespaces you may suffer index and shard explosion (meaning many, many shards, which can hurt cluster performance). Handling a large number of shards is getting better... but it still may not be super efficient.
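
One way to keep that under control is to make sure the ILM policy those data streams use rolls over and eventually deletes old data. A rough sketch (the thresholds are placeholders, not a recommendation):

PUT _ilm/policy/metricbeat
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_primary_shard_size": "50gb",
            "max_age": "30d"
          }
        }
      },
      "delete": {
        "min_age": "90d",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}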


Thanks a lot @stephenb for your answer. I'll try to apply it once we migrate to v8, or test it locally.

@stephenb

Well, I spent the whole day today trying to make it work, but I couldn't.

I set up Elasticsearch, Kibana and Metricbeat v8.2.2 as you did above, on Docker Desktop.

  1. I couldn't collect metrics in the first place. I downloaded metricbeat using curl -L -O https://raw.githubusercontent.com/elastic/beats/8.2/deploy/kubernetes/metricbeat-kubernetes.yaml and only updated the namespace, and deployed kube-stats-metrics as well, but I am getting the following error:
    {"log.level":"error","@timestamp":"2022-06-14T15:09:12.461Z","log.origin":{"file.name":"module/wrapper.go","file.line":254},"message":"Error fetching data for metricset kubernetes.system: error doing HTTP request to fetch 'system' Metricset data: error making http request: Get \"https://docker-desktop:10250/stats/summary\": net/http: request canceled (Client.Timeout exceeded while awaiting headers)","service.name":"metricbeat","ecs.version":"1.6.0"}

kube-stats-metrics is working fine, here are the logs:

I0614 14:37:35.500515       1 server.go:93] Using default resources
I0614 14:37:35.500650       1 types.go:136] Using all namespace
I0614 14:37:35.500681       1 server.go:122] Metric allow-denylisting: Excluding the following lists that were on denylist:
W0614 14:37:35.500704       1 client_config.go:617] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
I0614 14:37:35.571949       1 server.go:250] Testing communication with server
I0614 14:37:35.754692       1 server.go:255] Running with Kubernetes cluster version: v1.22. git version: v1.22.5. git tree state: clean. commit: 5c99e2ac2ff9a3c549d9ca665e7bc05a3e18f07e. platform: linux/amd64
I0614 14:37:35.754761       1 server.go:257] Communication with server successful
I0614 14:37:35.755702       1 server.go:202] Starting metrics server: [::]:8080
I0614 14:37:35.755986       1 metrics_handler.go:96] Autosharding disabled
I0614 14:37:35.757064       1 builder.go:232] Active resources: certificatesigningrequests,configmaps,cronjobs,daemonsets,deployments,endpoints,horizontalpodautoscalers,ingresses,jobs,leases,limitranges,mutatingwebhookconfigurations,namespaces,networkpolicies,nodes,persistentvolumeclaims,persistentvolumes,poddisruptionbudgets,pods,replicasets,replicationcontrollers,resourcequotas,secrets,services,statefulsets,storageclasses,validatingwebhookconfigurations,volumeattachments
I0614 14:37:35.801428       1 server.go:191] Starting kube-state-metrics self metrics server: [::]:8081
I0614 14:37:35.834605       1 server.go:66] levelinfomsgTLS is disabled.http2false
I0614 14:37:35.834680       1 server.go:66] levelinfomsgTLS is disabled.http2false
  2. Even ignoring the previous error (1.), I did the same as you to get an index per namespace; this is the configuration I used for output.elasticsearch:
    setup.template.enabled : false
    setup.ilm.enabled: true
    setup.ilm.policy_name: "metricbeat"
    setup.ilm.rollover_alias: "metricbeat-%{[agent.version]}-%{[kubernetes.namespace]}"
    setup.ilm.pattern: "{now/d}-000001"

    output.elasticsearch:
      hosts: ['${ELASTICSEARCH_HOST:elasticsearch}:${ELASTICSEARCH_PORT:9200}']
      username: ${ELASTICSEARCH_USERNAME}
      password: ${ELASTICSEARCH_PASSWORD}
      index: "metricbeat-%{[agent.version]}-%{[kubernetes.namespace]}"

but I was getting the following errors:

{"log.level":"error","@timestamp":"2022-06-14T15:22:14.788Z","log.logger":"publisher_pipeline_output","log.origin":{"file.name":"pipeline/client_worker.go","file.line":176},"message":"failed to publish events: temporary bulk send failure","service.name":"metricbeat","ecs.version":"1.6.0"}
{"log.level":"error","@timestamp":"2022-06-14T15:22:16.053Z","log.origin":{"file.name":"module/wrapper.go","file.line":254},"message":"Error fetching data for metricset kubernetes.volume: error doing HTTP request to fetch 'volume' Metricset data: error making http request: Get \"https://docker-desktop:10250/stats/summary\": net/http: request canceled (Client.Timeout exceeded while awaiting headers)","service.name":"metricbeat","ecs.version":"1.6.0"}
{"log.level":"error","@timestamp":"2022-06-14T15:22:16.631Z","log.logger":"publisher_pipeline_output","log.origin":{"file.name":"pipeline/client_worker.go","file.line":176},"message":"failed to publish events: temporary bulk send failure","service.name":"metricbeat","ecs.version":"1.6.0"}
{"log.level":"error","@timestamp":"2022-06-14T15:22:18.070Z","log.logger":"publisher_pipeline_output","log.origin":{"file.name":"pipeline/client_worker.go","file.line":176},"message":"failed to publish events: temporary bulk send failure","service.name":"metricbeat","ecs.version":"1.6.0"}
{"log.level":"error","@timestamp":"2022-06-14T15:22:20.050Z","log.logger":"publisher_pipeline_output","log.origin":{"file.name":"pipeline/client_worker.go","file.line":176},"message":"failed to publish events: temporary bulk send failure","service.name":"metricbeat","ecs.version":"1.6.0"}
{"log.level":"error","@timestamp":"2022-06-14T15:22:21.690Z","log.logger":"publisher_pipeline_output","log.origin":{"file.name":"pipeline/client_worker.go","file.line":176},"message":"failed to publish events: temporary bulk send failure","service.name":"metricbeat","ecs.version":"1.6.0"}
{"log.level":"error","@timestamp":"2022-06-14T15:22:23.112Z","log.logger":"publisher_pipeline_output","log.origin":{"file.name":"pipeline/client_worker.go","file.line":176},"message":"failed to publish events: temporary bulk send failure","service.name":"metricbeat","ecs.version":"1.6.0"}

I tested that locally on Docker v20.10.14 and Windows [version 10.0.19042.1706].

That looks like metricbeat cannot connect to kube-state-metrics; nothing to do with the data stream setup.

Although I'm not sure that using the kubernetes namespace in the index name is always going to work, perhaps it will.

For your current issue, perhaps open a separate thread.

A pretty easy way to check is to just exec into the metricbeat container and see if you can access the kube-state-metrics endpoint; I suspect you won't be able to reach it. Perhaps your docker containers are not on the same network... Not really my area of expertise.
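
Something along these lines, with placeholder pod name and namespace (and note that if kube-state-metrics runs in a different namespace you would need the full service name, e.g. kube-state-metrics.kube-system:8080):

kubectl -n elk exec -it <metricbeat-pod-name> -- curl -s http://kube-state-metrics:8080/metrics | head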

BTW, it is
kube-state-metrics
not
kube-stats-metrics

I would try this with a normal metricset like system first... kube-state-metrics always seems a bit tricky to me.

Me neither. I tried several things but am still getting the same error :confused: I don't think it's coming from kube-state-metrics, as the error is not pointing to the kube-state-metrics endpoint or port; maybe it's something to do with docker-desktop or metricbeat itself (a misconfiguration, maybe). I'll open another discussion for this metrics problem and then continue with getting one index per namespace.

From inside the metricbeat pod, I could get results from the docker-desktop endpoint:

token=$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)
curl -H "Authorization: Bearer $token" https://${HOSTNAME}:10250/stats/summary --insecure
# and with NODE_NAME worked too
curl -H "Authorization: Bearer $token" https://${NODE_NAME}:10250/stats/summary --insecure

and yet I am still getting these errors:

{"log.level":"error","@timestamp":"2022-06-16T13:25:35.006Z","log.origin":{"file.name":"module/wrapper.go","file.line":254},"message":"Error fetching data for metricset kubernetes.volume: error doing HTTP request to fetch 'volume' Metricset data: HTTP error 500 in : 500 Internal Server Error","service.name":"metricbeat","ecs.version":"1.6.0"}
{"log.level":"error","@timestamp":"2022-06-16T13:25:35.009Z","log.logger":"kubernetes.node","log.origin":{"file.name":"node/node.go","file.line":95},"message":"HTTP error 500 in : 500 Internal Server Error","service.name":"metricbeat","ecs.version":"1.6.0"}
{"log.level":"error","@timestamp":"2022-06-16T13:25:35.009Z","log.logger":"kubernetes.pod","log.origin":{"file.name":"pod/pod.go","file.line":92},"message":"HTTP error 500 in : 500 Internal Server Error","service.name":"metricbeat","ecs.version":"1.6.0"}
{"log.level":"error","@timestamp":"2022-06-16T13:25:35.010Z","log.logger":"kubernetes.container","log.origin":{"file.name":"container/container.go","file.line":91},"message":"HTTP error 500 in : 500 Internal Server Error","service.name":"metricbeat","ecs.version":"1.6.0"}
{"log.level":"error","@timestamp":"2022-06-16T13:25:55.010Z","log.origin":{"file.name":"module/wrapper.go","file.line":254},"message":"Error fetching data for metricset kubernetes.system: error doing HTTP request to fetch 'system' Metricset data: error making http request: Get \"https://docker-desktop:10250/stats/summary\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)","service.name":"metricbeat","ecs.version":"1.6.0"}

I notice you're passing the --insecure flag in your curl, but I suspect you do not have the equivalent in the metricbeat config, so perhaps it is failing on the self-signed cert.
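
The rough metricbeat-side equivalent of curl --insecure in the kubernetes module would be something like this (a sketch; the rest of the module config stays as in your manifest):

- module: kubernetes
  hosts: ["https://${NODE_NAME}:10250"]
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  ssl.verification_mode: "none"   # skip certificate verification, like curl --insecure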

Can you share your actual metricbeat config?

Hmm, good catch. I picked the code from this discussion.

Here is the configuration of metricbeat:

---
apiVersion: v1
kind: ConfigMap
metadata:
  name: metricbeat-daemonset-config
  namespace: elk
  labels:
    k8s-app: metricbeat
data:
  metricbeat.yml: |-
    metricbeat.config.modules:
      # Mounted `metricbeat-daemonset-modules` configmap:
      path: ${path.config}/modules.d/*.yml
      # Reload module configs as they change:
      reload.enabled: false

    metricbeat.autodiscover:
      providers:
        - type: kubernetes
          scope: cluster
          node: ${NODE_NAME}
          # In large Kubernetes clusters consider setting unique to false
          # to avoid using the leader election strategy and
          # instead run a dedicated Metricbeat instance using a Deployment in addition to the DaemonSet
          unique: true
          templates:
            - config:
                - module: kubernetes
                  hosts: ["kube-state-metrics:8080"]
                  period: 10s
                  add_metadata: true
                  metricsets:
                    - state_node
                    - state_deployment
                    - state_daemonset
                    - state_replicaset
                    - state_pod
                    - state_container
                    - state_job
                    - state_cronjob
                    - state_resourcequota
                    - state_statefulset
                    - state_service
                  # If `https` is used to access `kube-state-metrics`, uncomment following settings:
                  # bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
                  # ssl.certificate_authorities:
                  #   - /var/run/secrets/kubernetes.io/serviceaccount/service-ca.crt
                - module: kubernetes
                  metricsets:
                    - apiserver
                  hosts: ["https://${KUBERNETES_SERVICE_HOST}:${KUBERNETES_SERVICE_PORT}"]
                  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
                  ssl.certificate_authorities:
                    - /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
                  period: 30s
                # Uncomment this to get k8s events:
                #- module: kubernetes
                #  metricsets:
                #    - event
        # To enable hints based autodiscover uncomment this:
        # - type: kubernetes
        #  node: ${NODE_NAME}
        #  hints.enabled: true

    processors:
      - add_cloud_metadata:

    cloud.id: ${ELASTIC_CLOUD_ID}
    cloud.auth: ${ELASTIC_CLOUD_AUTH}

    # setup.template.enabled : false
    # setup.ilm.enabled: true
    # setup.ilm.policy_name: "metricbeat"
    # setup.ilm.rollover_alias: "metricbeat-%{[agent.version]}-%{[kubernetes.namespace]}"
    # setup.ilm.pattern: "{now/d}-000001"

    output.elasticsearch:
      hosts: ['${ELASTICSEARCH_HOST:elasticsearch}:${ELASTICSEARCH_PORT:9200}']
      username: ${ELASTICSEARCH_USERNAME}
      password: ${ELASTICSEARCH_PASSWORD}
      # index: "metricbeat-%{[agent.version]}-%{[kubernetes.namespace]}"
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: metricbeat-daemonset-modules
  namespace: elk
  labels:
    k8s-app: metricbeat
data:
  system.yml: |-
    - module: system
      period: 10s
      metricsets:
        - cpu
        - load
        - memory
        - network
        - process
        - process_summary
        #- core
        #- diskio
        #- socket
      processes: ['.*']
      process.include_top_n:
        by_cpu: 5      # include top 5 processes by CPU
        by_memory: 5   # include top 5 processes by memory

    - module: system
      period: 1m
      metricsets:
        - filesystem
        - fsstat
      processors:
      - drop_event.when.regexp:
          system.filesystem.mount_point: '^/(sys|cgroup|proc|dev|etc|host|lib|snap)($|/)'
  kubernetes.yml: |-
    - module: kubernetes
      metricsets:
        - node
        - system
        - pod
        - container
        - volume
      period: 10s
      host: ${NODE_NAME}
      hosts: ["https://${NODE_NAME}:10250"]
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      ssl.verification_mode: "none"
      # If there is a CA bundle that contains the issuer of the certificate used in the Kubelet API,
      # remove ssl.verification_mode entry and use the CA, for instance:
      #ssl.certificate_authorities:
        #- /var/run/secrets/kubernetes.io/serviceaccount/service-ca.crt
    - module: kubernetes
      metricsets:
        - proxy
      period: 10s
      host: ${NODE_NAME}
      hosts: ["localhost:10249"]
      # If using Red Hat OpenShift should be used this `hosts` setting instead:
      # hosts: ["localhost:29101"]
---
# Deploy a Metricbeat instance per node for node metrics retrieval
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: metricbeat
  namespace: elk
  labels:
    k8s-app: metricbeat
spec:
  selector:
    matchLabels:
      k8s-app: metricbeat
  template:
    metadata:
      labels:
        k8s-app: metricbeat
    spec:
      serviceAccountName: metricbeat
      terminationGracePeriodSeconds: 30
      hostNetwork: true
      dnsPolicy: ClusterFirstWithHostNet
      containers:
      - name: metricbeat
        image: docker.elastic.co/beats/metricbeat:8.2.2
        args: [
          "-c", "/etc/metricbeat.yml",
          "-e",
          "-system.hostfs=/hostfs",
        ]
        env:
        - name: ELASTICSEARCH_HOST
          value: elasticsearch
        - name: ELASTICSEARCH_PORT
          value: "9200"
        - name: ELASTICSEARCH_USERNAME
          value: elastic
        - name: ELASTICSEARCH_PASSWORD
          value: changeme
        - name: ELASTIC_CLOUD_ID
          value:
        - name: ELASTIC_CLOUD_AUTH
          value:
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        securityContext:
          runAsUser: 0
        resources:
          limits:
            memory: 200Mi
          requests:
            cpu: 100m
            memory: 100Mi
        volumeMounts:
        - name: config
          mountPath: /etc/metricbeat.yml
          readOnly: true
          subPath: metricbeat.yml
        - name: data
          mountPath: /usr/share/metricbeat/data
        - name: modules
          mountPath: /usr/share/metricbeat/modules.d
          readOnly: true
        - name: proc
          mountPath: /hostfs/proc
          readOnly: true
        - name: cgroup
          mountPath: /hostfs/sys/fs/cgroup
          readOnly: true
      volumes:
      - name: proc
        hostPath:
          path: /proc
      - name: cgroup
        hostPath:
          path: /sys/fs/cgroup
      - name: config
        configMap:
          defaultMode: 0640
          name: metricbeat-daemonset-config
      - name: modules
        configMap:
          defaultMode: 0640
          name: metricbeat-daemonset-modules
      - name: data
        hostPath:
          # When metricbeat runs as non-root user, this directory needs to be writable by group (g+w)
          path: /var/lib/metricbeat-data
          type: DirectoryOrCreate
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: metricbeat
subjects:
- kind: ServiceAccount
  name: metricbeat
  namespace: elk
roleRef:
  kind: ClusterRole
  name: metricbeat
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: metricbeat
  namespace: elk
subjects:
  - kind: ServiceAccount
    name: metricbeat
    namespace: elk
roleRef:
  kind: Role
  name: metricbeat
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: metricbeat-kubeadm-config
  namespace: elk
subjects:
  - kind: ServiceAccount
    name: metricbeat
    namespace: elk
roleRef:
  kind: Role
  name: metricbeat-kubeadm-config
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: metricbeat
  labels:
    k8s-app: metricbeat
rules:
- apiGroups: [""]
  resources:
  - nodes
  - namespaces
  - events
  - pods
  - services
  verbs: ["get", "list", "watch"]
# Enable this rule only if planing to use Kubernetes keystore
#- apiGroups: [""]
#  resources:
#  - secrets
#  verbs: ["get"]
- apiGroups: ["extensions"]
  resources:
  - replicasets
  verbs: ["get", "list", "watch"]
- apiGroups: ["apps"]
  resources:
  - statefulsets
  - deployments
  - replicasets
  verbs: ["get", "list", "watch"]
- apiGroups: ["batch"]
  resources:
  - jobs
  - cronjobs
  verbs: ["get", "list", "watch"]
- apiGroups:
  - ""
  resources:
  - nodes/stats
  verbs:
  - get
- nonResourceURLs:
  - "/metrics"
  verbs:
  - get
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: metricbeat
  # should be the namespace where metricbeat is running
  namespace: elk
  labels:
    k8s-app: metricbeat
rules:
  - apiGroups:
      - coordination.k8s.io
    resources:
      - leases
    verbs: ["get", "create", "update"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: metricbeat-kubeadm-config
  namespace: elk
  labels:
    k8s-app: metricbeat
rules:
  - apiGroups: [""]
    resources:
      - configmaps
    resourceNames:
      - kubeadm-config
    verbs: ["get"]
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: metricbeat
  namespace: elk
  labels:
    k8s-app: metricbeat
---

Note the
ssl.verification_mode: "none"

When I commented out ssl.verification_mode: "none" and specified the CA as follows:

# ssl.verification_mode: "none"
      # If there is a CA bundle that contains the issuer of the certificate used in the Kubelet API,
      # remove ssl.verification_mode entry and use the CA, for instance:
      ssl.certificate_authorities:
        # - /var/run/secrets/kubernetes.io/serviceaccount/service-ca.crt
        - /var/run/secrets/kubernetes.io/serviceaccount/ca.crt

I have the following error:

{"log.level":"error","@timestamp":"2022-06-16T15:02:44.787Z","log.logger":"kubernetes.pod","log.origin":{"file.name":"pod/pod.go","file.line":92},"message":"error making http request: Get \"https://docker-desktop:10250/stats/summary\": x509: certificate signed by unknown authority","service.name":"metricbeat","ecs.version":"1.6.0"}

I checked the path where the CA was and changed it to /var/run/secrets/kubernetes.io/serviceaccount/ca.crt. Even when adding ssl.verification_mode: "none" together with ssl.certificate_authorities to ignore the check, I get the same error: error making http request: Get \"https://docker-desktop:10250/stats/summary\": net/http: request canceled (Client.Timeout exceeded while awaiting headers). I feel like I'm looping :confused: between errors.

Sorry, I can't read your code. It's not formatted correctly.

If you put in

ssl.verification_mode: "none"

you don't put in the certificate. It's one or the other.
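
In other words, for the kubelet hosts pick exactly one of these (a sketch):

# Option A: skip TLS verification entirely
ssl.verification_mode: "none"

# Option B: verify against a CA bundle that actually contains the issuer of the kubelet certificate
# ssl.certificate_authorities:
#   - /var/run/secrets/kubernetes.io/serviceaccount/ca.crt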

Also, it needs to be indented properly or it will be ignored. I can't tell from your code because it's just a snippet and I can't see where it falls within the rest of the configuration.

You're close. You're having an SSL issue at this point, I would think.

Sorry, I didn't pay attention to the copy and paste; the code is indeed well formatted. Here is the full code:

---
apiVersion: v1
kind: ConfigMap
metadata:
  name: metricbeat-daemonset-config
  namespace: elk
  labels:
    k8s-app: metricbeat
data:
  metricbeat.yml: |-
    metricbeat.config.modules:
      # Mounted `metricbeat-daemonset-modules` configmap:
      path: ${path.config}/modules.d/*.yml
      # Reload module configs as they change:
      reload.enabled: false

    metricbeat.autodiscover:
      providers:
        - type: kubernetes
          scope: cluster
          node: ${NODE_NAME}
          # In large Kubernetes clusters consider setting unique to false
          # to avoid using the leader election strategy and
          # instead run a dedicated Metricbeat instance using a Deployment in addition to the DaemonSet
          unique: true
          templates:
            - config:
                - module: kubernetes
                  hosts: ["kube-state-metrics:8080"]
                  period: 10s
                  add_metadata: true
                  metricsets:
                    - state_node
                    - state_deployment
                    - state_daemonset
                    - state_replicaset
                    - state_pod
                    - state_container
                    - state_job
                    - state_cronjob
                    - state_resourcequota
                    - state_statefulset
                    - state_service
                  # If `https` is used to access `kube-state-metrics`, uncomment following settings:
                  # bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
                  # ssl.certificate_authorities:
                  #   - /var/run/secrets/kubernetes.io/serviceaccount/service-ca.crt
                - module: kubernetes
                  metricsets:
                    - apiserver
                  hosts: ["https://${KUBERNETES_SERVICE_HOST}:${KUBERNETES_SERVICE_PORT}"]
                  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
                  ssl.certificate_authorities:
                    - /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
                  period: 30s
                # Uncomment this to get k8s events:
                #- module: kubernetes
                #  metricsets:
                #    - event
        # To enable hints based autodiscover uncomment this:
        # - type: kubernetes
        #  node: ${NODE_NAME}
        #  hints.enabled: true

    processors:
      - add_cloud_metadata:

    cloud.id: ${ELASTIC_CLOUD_ID}
    cloud.auth: ${ELASTIC_CLOUD_AUTH}

    # setup.template.enabled : false
    # setup.ilm.enabled: true
    # setup.ilm.policy_name: "metricbeat"
    # setup.ilm.rollover_alias: "metricbeat-%{[agent.version]}-%{[kubernetes.namespace]}"
    # setup.ilm.pattern: "{now/d}-000001"

    output.elasticsearch:
      hosts: ['${ELASTICSEARCH_HOST:elasticsearch}:${ELASTICSEARCH_PORT:9200}']
      username: ${ELASTICSEARCH_USERNAME}
      password: ${ELASTICSEARCH_PASSWORD}
      # index: "metricbeat-%{[agent.version]}-%{[kubernetes.namespace]}"
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: metricbeat-daemonset-modules
  namespace: elk
  labels:
    k8s-app: metricbeat
data:
  system.yml: |-
    - module: system
      period: 10s
      metricsets:
        - cpu
        - load
        - memory
        - network
        - process
        - process_summary
        #- core
        #- diskio
        #- socket
      processes: ['.*']
      process.include_top_n:
        by_cpu: 5      # include top 5 processes by CPU
        by_memory: 5   # include top 5 processes by memory

    - module: system
      period: 1m
      metricsets:
        - filesystem
        - fsstat
      processors:
      - drop_event.when.regexp:
          system.filesystem.mount_point: '^/(sys|cgroup|proc|dev|etc|host|lib|snap)($|/)'
  kubernetes.yml: |-
    - module: kubernetes
      metricsets:
        - node
        - system
        - pod
        - container
        - volume
      period: 10s
      host: ${NODE_NAME}
      hosts: ["https://${NODE_NAME}:10250"]
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      # ssl.verification_mode: "none"
      # If there is a CA bundle that contains the issuer of the certificate used in the Kubelet API,
      # remove ssl.verification_mode entry and use the CA, for instance:
      ssl.certificate_authorities:
        # - /var/run/secrets/kubernetes.io/serviceaccount/service-ca.crt
        - /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    - module: kubernetes
      metricsets:
        - proxy
      period: 10s
      host: ${NODE_NAME}
      hosts: ["localhost:10249"]
      # If using Red Hat OpenShift should be used this `hosts` setting instead:
      # hosts: ["localhost:29101"]
---
# Deploy a Metricbeat instance per node for node metrics retrieval
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: metricbeat
  namespace: elk
  labels:
    k8s-app: metricbeat
spec:
  selector:
    matchLabels:
      k8s-app: metricbeat
  template:
    metadata:
      labels:
        k8s-app: metricbeat
    spec:
      serviceAccountName: metricbeat
      terminationGracePeriodSeconds: 30
      hostNetwork: true
      dnsPolicy: ClusterFirstWithHostNet
      containers:
      - name: metricbeat
        image: docker.elastic.co/beats/metricbeat:8.2.2
        args: [
          "-c", "/etc/metricbeat.yml",
          "-e",
          "-system.hostfs=/hostfs",
        ]
        env:
        - name: ELASTICSEARCH_HOST
          value: elasticsearch
        - name: ELASTICSEARCH_PORT
          value: "9200"
        - name: ELASTICSEARCH_USERNAME
          value: elastic
        - name: ELASTICSEARCH_PASSWORD
          value: changeme
        - name: ELASTIC_CLOUD_ID
          value:
        - name: ELASTIC_CLOUD_AUTH
          value:
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        securityContext:
          runAsUser: 0
        resources:
          limits:
            memory: 200Mi
          requests:
            cpu: 100m
            memory: 100Mi
        volumeMounts:
        - name: config
          mountPath: /etc/metricbeat.yml
          readOnly: true
          subPath: metricbeat.yml
        - name: data
          mountPath: /usr/share/metricbeat/data
        - name: modules
          mountPath: /usr/share/metricbeat/modules.d
          readOnly: true
        - name: proc
          mountPath: /hostfs/proc
          readOnly: true
        - name: cgroup
          mountPath: /hostfs/sys/fs/cgroup
          readOnly: true
      volumes:
      - name: proc
        hostPath:
          path: /proc
      - name: cgroup
        hostPath:
          path: /sys/fs/cgroup
      - name: config
        configMap:
          defaultMode: 0640
          name: metricbeat-daemonset-config
      - name: modules
        configMap:
          defaultMode: 0640
          name: metricbeat-daemonset-modules
      - name: data
        hostPath:
          # When metricbeat runs as non-root user, this directory needs to be writable by group (g+w)
          path: /var/lib/metricbeat-data
          type: DirectoryOrCreate
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: metricbeat
subjects:
- kind: ServiceAccount
  name: metricbeat
  namespace: elk
roleRef:
  kind: ClusterRole
  name: metricbeat
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: metricbeat
  namespace: elk
subjects:
  - kind: ServiceAccount
    name: metricbeat
    namespace: elk
roleRef:
  kind: Role
  name: metricbeat
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: metricbeat-kubeadm-config
  namespace: elk
subjects:
  - kind: ServiceAccount
    name: metricbeat
    namespace: elk
roleRef:
  kind: Role
  name: metricbeat-kubeadm-config
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: metricbeat
  labels:
    k8s-app: metricbeat
rules:
- apiGroups: [""]
  resources:
  - nodes
  - namespaces
  - events
  - pods
  - services
  verbs: ["get", "list", "watch"]
# Enable this rule only if planing to use Kubernetes keystore
#- apiGroups: [""]
#  resources:
#  - secrets
#  verbs: ["get"]
- apiGroups: ["extensions"]
  resources:
  - replicasets
  verbs: ["get", "list", "watch"]
- apiGroups: ["apps"]
  resources:
  - statefulsets
  - deployments
  - replicasets
  verbs: ["get", "list", "watch"]
- apiGroups: ["batch"]
  resources:
  - jobs
  - cronjobs
  verbs: ["get", "list", "watch"]
- apiGroups:
  - ""
  resources:
  - nodes/stats
  verbs:
  - get
- nonResourceURLs:
  - "/metrics"
  verbs:
  - get
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: metricbeat
  # should be the namespace where metricbeat is running
  namespace: elk
  labels:
    k8s-app: metricbeat
rules:
  - apiGroups:
      - coordination.k8s.io
    resources:
      - leases
    verbs: ["get", "create", "update"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: metricbeat-kubeadm-config
  namespace: elk
  labels:
    k8s-app: metricbeat
rules:
  - apiGroups: [""]
    resources:
      - configmaps
    resourceNames:
      - kubeadm-config
    verbs: ["get"]
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: metricbeat
  namespace: elk
  labels:
    k8s-app: metricbeat
---

This error

{"log.level":"error","@timestamp":"2022-06-16T15:02:44.787Z","log.logger":"kubernetes.pod","log.origin":{"file.name":"pod/pod.go","file.line":92},"message":"error making http request: Get \"https://docker-desktop:10250/stats/summary\": x509: certificate signed by unknown authority","service.name":"metricbeat","ecs.version":"1.6.0"}

Says it's still trying to check the CA. Are you getting that exact same error when you put in verification mode "none" only, i.e. without also adding the certificate path?

Also, I'm a bit confused: are you trying to do the API server, kube-state-metrics, or both?
Which one are you focusing on now? You could be failing on one, the other, or both.

Perhaps limit the variables and start with just one.

What happens if you just try this..

               - module: kubernetes
                  metricsets:
                    - apiserver
                  hosts: ["https://${KUBERNETES_SERVICE_HOST}:${KUBERNETES_SERVICE_PORT}"]
                  ssl.verification_mode: "none"
                  period: 30s

Yes, I get the same thing when I put only verification mode "none".

I'm just focusing on getting the default configuration to work ^^. I only just learned from you that there are two ways to get metrics: the API server and kube-state-metrics.

When applying your suggestion (I commented out the kube-state-metrics part) with:

---
apiVersion: v1
kind: ConfigMap
metadata:
  name: metricbeat-daemonset-config
  namespace: elk
  labels:
    k8s-app: metricbeat
data:
  metricbeat.yml: |-
    metricbeat.config.modules:
      # Mounted `metricbeat-daemonset-modules` configmap:
      path: ${path.config}/modules.d/*.yml
      # Reload module configs as they change:
      reload.enabled: false

    metricbeat.autodiscover:
      providers:
        - type: kubernetes
          scope: cluster
          node: ${NODE_NAME}
          # In large Kubernetes clusters consider setting unique to false
          # to avoid using the leader election strategy and
          # instead run a dedicated Metricbeat instance using a Deployment in addition to the DaemonSet
          unique: true
          templates:
            - config:
                # - module: kubernetes
                #   hosts: ["kube-state-metrics:8080"]
                #   period: 10s
                #   add_metadata: true
                #   metricsets:
                #     - state_node
                #     - state_deployment
                #     - state_daemonset
                #     - state_replicaset
                #     - state_pod
                #     - state_container
                #     - state_job
                #     - state_cronjob
                #     - state_resourcequota
                #     - state_statefulset
                #     - state_service
                #   # If `https` is used to access `kube-state-metrics`, uncomment following settings:
                #   # bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
                #   # ssl.certificate_authorities:
                #   #   - /var/run/secrets/kubernetes.io/serviceaccount/service-ca.crt
                - module: kubernetes
                  metricsets:
                    - apiserver
                  hosts: ["https://${KUBERNETES_SERVICE_HOST}:${KUBERNETES_SERVICE_PORT}"]
                  ssl.verification_mode: "none"
                  # bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
                  # ssl.certificate_authorities:
                  #   - /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
                  period: 30s
                # Uncomment this to get k8s events:
                #- module: kubernetes
                #  metricsets:
                #    - event
        # To enable hints based autodiscover uncomment this:
        # - type: kubernetes
        #  node: ${NODE_NAME}
        #  hints.enabled: true

    processors:
      - add_cloud_metadata:

    cloud.id: ${ELASTIC_CLOUD_ID}
    cloud.auth: ${ELASTIC_CLOUD_AUTH}

    # setup.template.enabled : false
    # setup.ilm.enabled: true
    # setup.ilm.policy_name: "metricbeat"
    # setup.ilm.rollover_alias: "metricbeat-%{[agent.version]}-%{[kubernetes.namespace]}"
    # setup.ilm.pattern: "{now/d}-000001"

    output.elasticsearch:
      hosts: ['${ELASTICSEARCH_HOST:elasticsearch}:${ELASTICSEARCH_PORT:9200}']
      username: ${ELASTICSEARCH_USERNAME}
      password: ${ELASTICSEARCH_PASSWORD}
      # index: "metricbeat-%{[agent.version]}-%{[kubernetes.namespace]}"
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: metricbeat-daemonset-modules
  namespace: elk
  labels:
    k8s-app: metricbeat
data:
  system.yml: |-
    - module: system
      period: 10s
      metricsets:
        - cpu
        - load
        - memory
        - network
        - process
        - process_summary
        #- core
        #- diskio
        #- socket
      processes: ['.*']
      process.include_top_n:
        by_cpu: 5      # include top 5 processes by CPU
        by_memory: 5   # include top 5 processes by memory

    - module: system
      period: 1m
      metricsets:
        - filesystem
        - fsstat
      processors:
      - drop_event.when.regexp:
          system.filesystem.mount_point: '^/(sys|cgroup|proc|dev|etc|host|lib|snap)($|/)'
  kubernetes.yml: |-
    - module: kubernetes
      metricsets:
        - node
        - system
        - pod
        - container
        - volume
      period: 10s
      host: ${NODE_NAME}
      hosts: ["https://${NODE_NAME}:10250"]
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      # ssl.verification_mode: "none"
      # If there is a CA bundle that contains the issuer of the certificate used in the Kubelet API,
      # remove ssl.verification_mode entry and use the CA, for instance:
      ssl.certificate_authorities:
        # - /var/run/secrets/kubernetes.io/serviceaccount/service-ca.crt
        - /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    - module: kubernetes
      metricsets:
        - proxy
      period: 10s
      host: ${NODE_NAME}
      hosts: ["localhost:10249"]
      # If using Red Hat OpenShift should be used this `hosts` setting instead:
      # hosts: ["localhost:29101"]
---
# Deploy a Metricbeat instance per node for node metrics retrieval
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: metricbeat
  namespace: elk
  labels:
    k8s-app: metricbeat
spec:
  selector:
    matchLabels:
      k8s-app: metricbeat
  template:
    metadata:
      labels:
        k8s-app: metricbeat
    spec:
      serviceAccountName: metricbeat
      terminationGracePeriodSeconds: 30
      hostNetwork: true
      dnsPolicy: ClusterFirstWithHostNet
      containers:
      - name: metricbeat
        image: docker.elastic.co/beats/metricbeat:8.2.2
        args: [
          "-c", "/etc/metricbeat.yml",
          "-e",
          "-system.hostfs=/hostfs",
        ]
        env:
        - name: ELASTICSEARCH_HOST
          value: elasticsearch
        - name: ELASTICSEARCH_PORT
          value: "9200"
        - name: ELASTICSEARCH_USERNAME
          value: elastic
        - name: ELASTICSEARCH_PASSWORD
          value: changeme
        - name: ELASTIC_CLOUD_ID
          value:
        - name: ELASTIC_CLOUD_AUTH
          value:
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        securityContext:
          runAsUser: 0
        resources:
          limits:
            memory: 200Mi
          requests:
            cpu: 100m
            memory: 100Mi
        volumeMounts:
        - name: config
          mountPath: /etc/metricbeat.yml
          readOnly: true
          subPath: metricbeat.yml
        - name: data
          mountPath: /usr/share/metricbeat/data
        - name: modules
          mountPath: /usr/share/metricbeat/modules.d
          readOnly: true
        - name: proc
          mountPath: /hostfs/proc
          readOnly: true
        - name: cgroup
          mountPath: /hostfs/sys/fs/cgroup
          readOnly: true
      volumes:
      - name: proc
        hostPath:
          path: /proc
      - name: cgroup
        hostPath:
          path: /sys/fs/cgroup
      - name: config
        configMap:
          defaultMode: 0640
          name: metricbeat-daemonset-config
      - name: modules
        configMap:
          defaultMode: 0640
          name: metricbeat-daemonset-modules
      - name: data
        hostPath:
          # When metricbeat runs as non-root user, this directory needs to be writable by group (g+w)
          path: /var/lib/metricbeat-data
          type: DirectoryOrCreate
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: metricbeat
subjects:
- kind: ServiceAccount
  name: metricbeat
  namespace: elk
roleRef:
  kind: ClusterRole
  name: metricbeat
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: metricbeat
  namespace: elk
subjects:
  - kind: ServiceAccount
    name: metricbeat
    namespace: elk
roleRef:
  kind: Role
  name: metricbeat
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: metricbeat-kubeadm-config
  namespace: elk
subjects:
  - kind: ServiceAccount
    name: metricbeat
    namespace: elk
roleRef:
  kind: Role
  name: metricbeat-kubeadm-config
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: metricbeat
  labels:
    k8s-app: metricbeat
rules:
- apiGroups: [""]
  resources:
  - nodes
  - namespaces
  - events
  - pods
  - services
  verbs: ["get", "list", "watch"]
# Enable this rule only if planing to use Kubernetes keystore
#- apiGroups: [""]
#  resources:
#  - secrets
#  verbs: ["get"]
- apiGroups: ["extensions"]
  resources:
  - replicasets
  verbs: ["get", "list", "watch"]
- apiGroups: ["apps"]
  resources:
  - statefulsets
  - deployments
  - replicasets
  verbs: ["get", "list", "watch"]
- apiGroups: ["batch"]
  resources:
  - jobs
  - cronjobs
  verbs: ["get", "list", "watch"]
- apiGroups:
  - ""
  resources:
  - nodes/stats
  verbs:
  - get
- nonResourceURLs:
  - "/metrics"
  verbs:
  - get
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: metricbeat
  # should be the namespace where metricbeat is running
  namespace: elk
  labels:
    k8s-app: metricbeat
rules:
  - apiGroups:
      - coordination.k8s.io
    resources:
      - leases
    verbs: ["get", "create", "update"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: metricbeat-kubeadm-config
  namespace: elk
  labels:
    k8s-app: metricbeat
rules:
  - apiGroups: [""]
    resources:
      - configmaps
    resourceNames:
      - kubeadm-config
    verbs: ["get"]
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: metricbeat
  namespace: elk
  labels:
    k8s-app: metricbeat
---

I still have the same errors:

{"log.level":"error","@timestamp":"2022-06-16T16:12:45.528Z","log.logger":"kubernetes.node","log.origin":{"file.name":"node/node.go","file.line":95},"message":"error making http request: Get \"https://docker-desktop:10250/stats/summary\": x509: certificate signed by unknown authority","service.name":"metricbeat","ecs.version":"1.6.0"}
{"log.level":"error","@timestamp":"2022-06-16T16:12:45.606Z","log.logger":"kubernetes.container","log.origin":{"file.name":"container/container.go","file.line":91},"message":"error making http request: Get \"https://docker-desktop:10250/stats/summary\": x509: certificate signed by unknown authority","service.name":"metricbeat","ecs.version":"1.6.0"}
{"log.level":"error","@timestamp":"2022-06-16T16:12:51.549Z","log.origin":{"file.name":"module/wrapper.go","file.line":254},"message":"Error fetching data for metricset kubernetes.apiserver: error getting metrics: unexpected status code 403 from server","service.name":"metricbeat","ecs.version":"1.6.0"}
