Kubernetes Pod and container memory usage always at 0

Hi,
I have an issue with Metricbeat v8.4.2.
I use it to collect Kubernetes (v1.22.2) metrics,
but kubernetes.container.memory.usage.bytes and kubernetes.pod.memory.usage.bytes are always 0.

Here is my ClusterRole:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: metricbeat
  labels:
    k8s-app: metricbeat
rules:
  - apiGroups: [""]
    resources:
      - nodes
      - namespaces
      - events
      - pods
      - services
    verbs: ["get", "list", "watch"]
  # Enable this rule only if planning to use the Kubernetes keystore
  #- apiGroups: [""]
  #  resources:
  #  - secrets
  #  verbs: ["get"]
  - apiGroups: ["extensions"]
    resources:
      - replicasets
    verbs: ["get", "list", "watch"]
  - apiGroups: ["apps"]
    resources:
      - statefulsets
      - deployments
      - replicasets
    verbs: ["get", "list", "watch"]
  - apiGroups: ["batch"]
    resources:
      - jobs
      - cronjobs
    verbs: ["get", "list", "watch"]
  - apiGroups:
      - ""
    resources:
      - nodes/stats
    verbs:
      - get
  - nonResourceURLs:
      - "/metrics"
    verbs:
      - get

If I run this command from inside the Metricbeat DaemonSet pod:
curl -k https://${NODE_NAME}:10250/stats/summary --header "Authorization: Bearer $TOKEN"
I can see my pod's memory usage:

 {
   "podRef": {
    "name": "my-pod",
    "namespace": "default",
    "uid": "7be45b82-0395-4ae9-a0ab-7c275c121565"
   },
   "startTime": "2022-10-27T15:42:12Z",
   "containers": [
    {
     "name": "postgres",
     "startTime": "2022-10-27T15:42:15Z",
     "cpu": {
      "time": "2022-10-27T16:02:59Z",
      "usageNanoCores": 3272512,
      "usageCoreNanoSeconds": 19228071000
     },
     "memory": {
      "time": "2022-10-27T16:02:59Z",
      "workingSetBytes": 148652032
     },
     "rootfs": {
      "time": "2022-10-27T16:02:52Z",
      "availableBytes": 110666403840,
      "capacityBytes": 135102586880,
      "usedBytes": 196608,
      "inodesFree": 8147061,
      "inodes": 8380416,
      "inodesUsed": 60
     },
     "logs": {
      "time": "2022-10-27T16:02:59Z",
      "availableBytes": 110666403840,
      "capacityBytes": 135102586880,
      "usedBytes": 45056,
      "inodesFree": 8147061,
      "inodes": 8380416,
      "inodesUsed": 1
     }
    }
   ],
   "cpu": {
    "time": "2022-10-27T16:02:58Z",
    "usageNanoCores": 3221979,
    "usageCoreNanoSeconds": 19417139000
   },
   "memory": {
    "time": "2022-10-27T16:02:58Z",
    "availableBytes": 7367151616,
    "usageBytes": 190943232,
    "workingSetBytes": 149041152,
    "rssBytes": 41623552,
    "pageFaults": 803187,
    "majorPageFaults": 0
   },
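The same check can be scripted. Below is a minimal Python sketch that queries the kubelet and lists each pod's pod-level usageBytes; the node name variable, port 10250, and the service-account token path are the usual in-cluster values and are assumptions here, not taken from my manifest:

```python
import json
import os
import ssl
import urllib.request


def pod_memory_usage(summary):
    """Map pod name -> pod-level memory usageBytes from a /stats/summary dict."""
    return {
        p["podRef"]["name"]: p.get("memory", {}).get("usageBytes")
        for p in summary.get("pods", [])
    }


def fetch_summary(node, token):
    """Fetch /stats/summary from the kubelet, skipping cert checks like curl -k."""
    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    req = urllib.request.Request(
        f"https://{node}:10250/stats/summary",
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(req, context=ctx) as resp:
        return json.load(resp)


# Offline example with a trimmed summary document:
sample = {"pods": [{"podRef": {"name": "my-pod"}, "memory": {"usageBytes": 190943232}}]}
print(pod_memory_usage(sample))  # {'my-pod': 190943232}
```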

After some investigation, container-level memory and CPU usage were removed from the kubelet here.

But in Metricbeat, the pod memory usage is calculated from the container memory usage here.
Computing pod.memory.usage.bytes from a metric that no longer exists is the issue for me; we should reuse the pod-level values that the kubelet already provides.
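The problem can be sketched in Python with hypothetical data (this is an illustration of the calculation, not Metricbeat's actual code): when the kubelet omits the container-level usageBytes field, summing containers yields 0, while the pod-level block still carries the value.

```python
# Hypothetical excerpt of a kubelet /stats/summary pod entry: the
# container-level memory block only has workingSetBytes, but the
# pod-level memory block still carries usageBytes.
pod_stats = {
    "memory": {"usageBytes": 190943232, "workingSetBytes": 149041152},
    "containers": [
        {"name": "postgres", "memory": {"workingSetBytes": 148652032}},
    ],
}


def pod_memory_usage_from_containers(pod):
    # The problematic approach: summing a field the kubelet may no
    # longer report treats every container as 0.
    return sum(c["memory"].get("usageBytes", 0) for c in pod["containers"])


def pod_memory_usage_from_pod(pod):
    # Preferred: reuse the pod-level value the kubelet already provides.
    return pod["memory"].get("usageBytes", 0)


print(pod_memory_usage_from_containers(pod_stats))  # 0
print(pod_memory_usage_from_pod(pod_stats))         # 190943232
```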

Related issue: We are not seeing metrics for cpu, memory and network · Issue #31124 · elastic/beats · GitHub

Hello @sebglon,
I'm here to help with your issue.

Do you mind providing some more info?

  • Are you running on a cloud provider?
  • Can you provide the entire manifest that you are using to run Metricbeat?

I just tried to replicate the issue locally with a kind cluster running v1.22.2, and I can correctly see all the CPU and memory metrics from the kubelet.

(I have redacted some parts of the stats/summary response to keep this post small.)

{
    "node": {
        "nodeName": "multi-v1.22.2-worker2",
        "systemContainers": [
            {
                "name": "kubelet",
                "startTime": "2022-11-04T11:35:05Z",
                "cpu": {
                    "time": "2022-11-04T14:46:19Z",
                    "usageNanoCores": 10418606,
                    "usageCoreNanoSeconds": 99418972000
                },
                "memory": {
                    "time": "2022-11-04T14:46:19Z",
                    "usageBytes": 48238592,
                    "workingSetBytes": 47833088,
                    "rssBytes": 44470272,
                    "pageFaults": 95931,
                    "majorPageFaults": 0
                }
            }
        ],
        "startTime": "2022-11-03T13:41:28Z",
        "cpu": {
            "time": "2022-11-04T14:46:16Z",
            "usageNanoCores": 32453576,
            "usageCoreNanoSeconds": 287037524000
        },
        "memory": {
            "time": "2022-11-04T14:46:16Z",
            "availableBytes": 13316526080,
            "usageBytes": 424001536,
            "workingSetBytes": 266895360,
            "rssBytes": 148414464,
            "pageFaults": 3386811,
            "majorPageFaults": 37
        }
    },
    "pods": [
        {
            "podRef": {
                "name": "metricbeat-runner-j452h",
                "namespace": "kube-system",
                "uid": "719592ed-ff36-4bda-86f7-ebfb8f9104a7"
            },
            "startTime": "2022-11-04T11:39:55Z",
            "containers": [
                {
                    "name": "metricbeat",
                    "startTime": "2022-11-04T11:39:56Z",
                    "cpu": {
                        "time": "2022-11-04T14:46:18Z",
                        "usageNanoCores": 1603302,
                        "usageCoreNanoSeconds": 17067928000
                    },
                    "memory": {
                        "time": "2022-11-04T14:46:18Z",
                        "availableBytes": 165335040,
                        "usageBytes": 48029696,
                        "workingSetBytes": 44380160,
                        "rssBytes": 32710656,
                        "pageFaults": 70554,
                        "majorPageFaults": 0
                    }
                }
            ],
            "cpu": {
                "time": "2022-11-04T14:46:04Z",
                "usageNanoCores": 2155980,
                "usageCoreNanoSeconds": 17062689000
            },
            "memory": {
                "time": "2022-11-04T14:46:04Z",
                "availableBytes": 165560320,
                "usageBytes": 47804416,
                "workingSetBytes": 44154880,
                "rssBytes": 32034816,
                "pageFaults": 71478,
                "majorPageFaults": 0
            }
        }
    ]
}

Metricbeat (v8.4.2) with the Kubernetes module is correctly reporting those metrics.

So please provide more info so that I can replicate the issue. Thanks.

Also, do you mind providing the output of the following command?

kubectl get nodes -o wide

NAME                          STATUS   ROLES                  AGE    VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE       KERNEL-VERSION      CONTAINER-RUNTIME
multi-v1.22.2-control-plane   Ready    control-plane,master   4h7m   v1.22.2   172.19.0.4    <none>        Ubuntu 21.10   5.10.124-linuxkit   containerd://1.5.7-13-g9d0acfe46
multi-v1.22.2-worker          Ready    <none>                 4h6m   v1.22.2   172.19.0.3    <none>        Ubuntu 21.10   5.10.124-linuxkit   containerd://1.5.7-13-g9d0acfe46
multi-v1.22.2-worker2         Ready    <none>                 4h6m   v1.22.2   172.19.0.5    <none>        Ubuntu 21.10   5.10.124-linuxkit   containerd://1.5.7-13-g9d0acfe46
multi-v1.22.2-worker3         Ready    <none>                 4h6m   v1.22.2   172.19.0.2    <none>        Ubuntu 21.10   5.10.124-linuxkit   containerd://1.5.7-13-g9d0acfe46

What I am mostly interested in is the container runtime.

I have also found a similar issue (missing memory metrics): No memory metrics from kubelet stats endpoint · Issue #103366 · kubernetes/kubernetes · GitHub.

Reading that issue, we should be able to narrow this down by looking at the output of the following command. Can you please run it in your cluster and provide the output here? This is what I get in my cluster; more info in the issue linked above.

root@multi-v1:/# cat /var/lib/kubelet/config.yaml  | grep cgroup

cgroupDriver: cgroupfs
cgroupRoot: /kubelet

If the output of that command shows cgroupDriver: systemd, that might be the reason for the missing metrics.
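As a quick offline variant of that grep, the relevant line can be pulled out of the kubelet config in a few lines of Python (a sketch; the file contents below are just an example, and the real file lives at /var/lib/kubelet/config.yaml on the node):

```python
# Example contents of a kubelet config.yaml (illustrative only).
config_text = """\
cgroupDriver: systemd
cgroupRoot: /kubelet
"""


def cgroup_driver(text):
    """Return the value of the cgroupDriver key, or None if absent."""
    for line in text.splitlines():
        if line.startswith("cgroupDriver:"):
            return line.split(":", 1)[1].strip()
    return None


print(cgroup_driver(config_text))  # systemd
```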

We do not use a cloud provider for Kubernetes.
Here is the output of the command:
Debian GNU/Linux 11 (bullseye) 5.10.0-19-cloud-amd64 containerd://1.4.9
or
Debian GNU/Linux 11 (bullseye) 5.10.0-19-cloud-amd64 containerd://1.5.8

And we use systemd:

cat /var/lib/kubelet/config.yaml  | grep cgroup
cgroupDriver: systemd

We have to upgrade our Kubernetes from 1.22.2 to at least 1.22.9.
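A quick way to verify which nodes still need the upgrade is to compare the version strings from kubectl get nodes against that minimum (a minimal sketch; the version values are examples):

```python
def at_least(version, minimum):
    """Compare two Kubernetes version strings like 'v1.22.2' numerically."""
    parse = lambda v: tuple(int(x) for x in v.lstrip("v").split("."))
    return parse(version) >= parse(minimum)


# Example node versions as reported by kubectl get nodes:
for node_version in ["v1.22.2", "v1.22.9", "v1.23.0"]:
    print(node_version, at_least(node_version, "v1.22.9"))
```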