Memory metrics not correct

Hi,

I am using the Metrics app with Docker, but the memory numbers seem to be incorrect.

Docker stats:

Memory in Metrics:
[Screenshot: Screen Shot 2021-01-20 at 09.28.43]

My docker-compose:

  app:
    ...
  monitor:
    container_name: monitor
    build:
      context: ../metricbeat
    environment:
      - appName=$appName
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - /proc:/hostfs/proc:ro
      - /sys/fs/cgroup:/hostfs/sys/fs/cgroup:ro
      - /:/hostfs:ro

metricbeat Dockerfile:

FROM docker.elastic.co/beats/metricbeat:7.10.2

COPY ./metricbeat.yml /usr/share/metricbeat/metricbeat.yml

metricbeat.yml

name: "${appName}"
tags: ["dart-app"]

metricbeat.config.modules:
  path: /usr/share/metricbeat/modules.d/*.yml
  reload.enabled: true

processors:
  - add_cloud_metadata:

output.elasticsearch:
  hosts: [ "xxx" ]

setup.kibana:
  host: "xxx"

Hi @jodinathan,

It's possible that the Docker CLI and the Metrics UI look at different time intervals and calculate the averages differently. Can you compare how the values change over time in both tools to see if they correlate? Does the node details page show the same deviating values?

Hi @weltenwort,

I am not using node, it is a Dart application. Judging by the Observatory page, the memory usage is close to the Docker stats.
I also looked at the metric over a time range, and it shows the same behavior.

Digging into the Kibana dashboard, it seems that the percentage is computed from used + cached memory, and that may be the reason for the discrepancy.

What do you think?

"I am not using node, it is a Dart application."

Sorry for using ambiguous terms here. By "node" I meant the box on the inventory screen. It might be a host, a container, or a pod, depending on the selected view.

"it seems that the percentage is used + cached memory"

Your hypothesis sounds plausible. The metricbeat module collects the stats via the Docker API, for which I found the following note in the docker stats docs:

On Linux, the Docker CLI reports memory usage by subtracting cache usage from the total memory usage. The API does not perform such a calculation but rather provides the total memory usage and the amount from the cache so that clients can use the data as needed. The cache usage is defined as the value of total_inactive_file field in the memory.stat file on cgroup v1 hosts.

On Docker 19.03 and older, the cache usage was defined as the value of cache field. On cgroup v2 hosts, the cache usage is defined as the value of inactive_file field.

(source)

So docker stats performs some custom post-processing of the data to arrive at these values, which likely don't match those of many monitoring tools.
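
To make the difference concrete, here is a minimal Python sketch of the subtraction described in the quoted note. It assumes the input has the shape of the `memory_stats` object returned by the Docker stats API (`usage`, `limit`, and a nested `stats` map). The function names and the sample numbers are hypothetical, and this is not the actual code of Docker or Metricbeat.

```python
from typing import Any, Mapping


def memory_usage_no_cache(memory_stats: Mapping[str, Any]) -> int:
    """Approximate the usage value shown by `docker stats` on Linux.

    Subtracts the cache from the total usage, as the quoted note describes:
    `total_inactive_file` on cgroup v1 hosts, `inactive_file` on cgroup v2.
    Falls back to the raw usage if neither field is present.
    """
    usage = int(memory_stats.get("usage", 0))
    stats = memory_stats.get("stats") or {}
    for key in ("total_inactive_file", "inactive_file"):
        cache = stats.get(key)
        if cache is not None and cache < usage:
            return usage - int(cache)
    return usage


def memory_percent(used: int, limit: int) -> float:
    """Return `used` as a percentage of `limit` (0.0 if the limit is unknown)."""
    return 100.0 * used / limit if limit else 0.0


if __name__ == "__main__":
    mib = 2**20
    # Hypothetical sample values, not taken from the setup in this thread.
    sample = {
        "usage": 500 * mib,
        "limit": 2000 * mib,
        "stats": {"total_inactive_file": 300 * mib},
    }
    raw = memory_percent(sample["usage"], sample["limit"])
    no_cache = memory_percent(memory_usage_no_cache(sample), sample["limit"])
    print(f"raw API usage: {raw:.1f}%  cache subtracted: {no_cache:.1f}%")
```

A tool that reports the raw `usage` field shows a higher percentage than one that subtracts the cache first, which would fit the gap you are seeing.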
