Docker memory metrics seem to be incorrect

I installed metricbeat on my Ubuntu 20.04 VM, which has Docker 19.03.9 running.
Looking at the [Metricbeat Docker] Overview ECS dashboard, it looks like one container's memory usage is growing / leaking:

But checking that container on the host (docker stats) reports that its memory usage is actually much lower, only ~24MB:

    CONTAINER ID        NAME                               CPU %               MEM USAGE / LIMIT     MEM %               NET I/O             BLOCK I/O           PIDS
    2d4b96b675e2        vhost-noty.propovednik.com         0.03%               46.06MiB / 15.64GiB   0.29%               108MB / 10.2GB      5.57GB / 0B         5
    c3cad6f15083        vhost-slavikf.com                  0.04%               27.04MiB / 15.64GiB   0.17%               609MB / 185MB       25.7MB / 0B         5
    ...

So, where does metricbeat get such a high memory usage number for that container?

Here is an example event:

"docker": {
      "memory": {
        "limit": 16792281088,
        "rss": {
          "total": 24829952,
          "pct": 0.0014786527137009297
        },
        "usage": {
          "max": 3837181952,
          "total": 3819274240,
          "pct": 0.22744225278180383
        },
        "stats": {
          "active_file": 465588224,
          "cache": 3770892288,
          "rss": 24829952,
          "total_mapped_file": 3108864,
          "total_pgpgin": 6793083,
          "dirty": 0,
          "inactive_anon": 135168,
          "total_pgfault": 7708998,
          "writeback": 0,
          "hierarchical_memory_limit": 9223372036854772000,
          "total_dirty": 0,
          "total_inactive_anon": 135168,
          "hierarchical_memsw_limit": 0,
          "pgfault": 7708998,
          "pgpgin": 6793083,
          "rss_huge": 6291456,
          "pgmajfault": 0,
          "total_pgpgout": 5867892,
          "total_writeback": 0,
          "total_rss": 24829952,
          "inactive_file": 3302236160,
          "total_active_anon": 27488256,
          "total_inactive_file": 3302236160,
          "total_pgmajfault": 0,
          "total_cache": 3770892288,
          "total_rss_huge": 6291456,
          "total_unevictable": 0,
          "unevictable": 0,
          "active_anon": 27488256,
          "mapped_file": 3108864,
          "pgpgout": 5867892,
          "total_active_file": 465588224
        },
        "fail": {
          "count": 0
        }
      },

That's rather unusual. Can you give me the raw stats object from docker?
You'll need access to the docker management node:

Get the ID of the container:

    docker inspect vhost-noty.propovednik.com | grep Id

Then use that ID to get the raw stats object:

    curl -H "Content-Type: application/json" -XGET --unix-socket /var/run/docker.sock 'http://v1.40/containers/CONTAINER_ID/stats?stream=false'

I'm assuming this is on Linux. You'll need root for that command. Depending on which docker version you have, you may need to replace the 1.40 with the API version listed by docker version.
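
If it's easier to script that call, here is a minimal sketch of the same request made from Go through the unix socket. This is just an illustrative example of mine, not part of Metricbeat; the API version and container ID are placeholders to replace:

    package main

    import (
        "context"
        "fmt"
        "io"
        "net"
        "net/http"
        "os"
    )

    func main() {
        // Send all HTTP traffic through the Docker unix socket; the host part
        // of the URL is ignored once the dialer is overridden.
        client := &http.Client{
            Transport: &http.Transport{
                DialContext: func(ctx context.Context, _, _ string) (net.Conn, error) {
                    return (&net.Dialer{}).DialContext(ctx, "unix", "/var/run/docker.sock")
                },
            },
        }

        containerID := "CONTAINER_ID" // placeholder: use the ID from docker inspect
        url := fmt.Sprintf("http://v1.40/containers/%s/stats?stream=false", containerID)

        resp, err := client.Get(url)
        if err != nil {
            fmt.Fprintln(os.Stderr, err)
            os.Exit(1)
        }
        defer resp.Body.Close()

        // Dump the raw JSON stats object, same as the curl command above.
        io.Copy(os.Stdout, resp.Body)
    }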

    slavik@ub20azure:~$ sudo docker inspect vhost-noty.propovednik.com | grep Id
            "Id": "2d4b96b675e2d9c0eaba9334f98d8a2ec2e51de8d17cd7ccfd73c8a2eaa3ebc2",

    slavik@ub20azure:~$ sudo curl -H "Content-Type: application/json" -XGET --unix-socket /var/run/docker.sock 'http://v1.40/containers/2d4b96b675e2d9c0eaba9334f98d8a2ec2e51de8d17cd7ccfd73c8a2eaa3ebc2/stats?stream=false'
    {"read":"2020-06-01T18:21:52.107656102Z","preread":"2020-06-01T18:21:51.105629564Z","pids_stats":{"current":5},"blkio_stats":{"io_service_bytes_recursive":[{"major":8,"minor":32,"op":"Read","value":675840},{"major":8,"minor":32,"op":"Write","value":0},{"major":8,"minor":32,"op":"Sync","value":675840},{"major":8,"minor":32,"op":"Async","value":0},{"major":8,"minor":32,"op":"Discard","value":0},{"major":8,"minor":32,"op":"Total","value":675840},{"major":8,"minor":0,"op":"Read","value":5803532288},{"major":8,"minor":0,"op":"Write","value":0},{"major":8,"minor":0,"op":"Sync","value":5803532288},{"major":8,"minor":0,"op":"Async","value":0},{"major":8,"minor":0,"op":"Discard","value":0},{"major":8,"minor":0,"op":"Total","value":5803532288}],"io_serviced_recursive":[{"major":8,"minor":32,"op":"Read","value":27},{"major":8,"minor":32,"op":"Write","value":0},{"major":8,"minor":32,"op":"Sync","value":27},{"major":8,"minor":32,"op":"Async","value":0},{"major":8,"minor":32,"op":"Discard","value":0},{"major":8,"minor":32,"op":"Total","value":27},{"major":8,"minor":0,"op":"Read","value":55198},{"major":8,"minor":0,"op":"Write","value":0},{"major":8,"minor":0,"op":"Sync","value":55198},{"major":8,"minor":0,"op":"Async","value":0},{"major":8,"minor":0,"op":"Discard","value":0},{"major":8,"minor":0,"op":"Total","value":55198}],"io_queue_recursive":[],"io_service_time_recursive":[],"io_wait_time_recursive":[],"io_merged_recursive":[],"io_time_recursive":[],"sectors_recursive":[]},"num_procs":0,"storage_stats":{},"cpu_stats":{"cpu_usage":{"total_usage":2122436017057,"percpu_usage":[1063502832973,1058933184084],"usage_in_kernelmode":160230000000,"usage_in_usermode":1916320000000},"system_cpu_usage":1137519360000000,"online_cpus":2,"throttling_data":{"periods":0,"throttled_periods":0,"throttled_time":0}},"precpu_stats":{"cpu_usage":{"total_usage":2122435718856,"percpu_usage":[1063502588972,1058933129884],"usage_in_kernelmode":160230000000,"usage_in_usermode":1916320000000},"system_cpu_usage":1137517380000000,"online_cpus":2,"throttling_data":{"periods":0,"throttled_periods":0,"throttled_time":0}},"memory_stats":{"usage":3898925056,"max_usage":3906424832,"stats":{"active_anon":27553792,"active_file":527630336,"cache":3847147520,"dirty":0,"hierarchical_memory_limit":9223372036854771712,"hierarchical_memsw_limit":0,"inactive_anon":135168,"inactive_file":3316396032,"mapped_file":3244032,"pgfault":9168522,"pgmajfault":0,"pgpgin":7858719,"pgpgout":6914968,"rss":24752128,"rss_huge":6291456,"total_active_anon":27553792,"total_active_file":527630336,"total_cache":3847147520,"total_dirty":0,"total_inactive_anon":135168,"total_inactive_file":3316396032,"total_mapped_file":3244032,"total_pgfault":9168522,"total_pgmajfault":0,"total_pgpgin":7858719,"total_pgpgout":6914968,"total_rss":24752128,"total_rss_huge":6291456,"total_unevictable":0,"total_writeback":0,"unevictable":0,"writeback":0},"limit":16792281088},"name":"/vhost-noty.propovednik.com","id":"2d4b96b675e2d9c0eaba9334f98d8a2ec2e51de8d17cd7ccfd73c8a2eaa3ebc2","networks":{"eth0":{"rx_bytes":120168812,"rx_packets":1052541,"rx_errors":0,"rx_dropped":0,"tx_bytes":11026650593,"tx_packets":1349365,"tx_errors":0,"tx_dropped":0}}}


    slavik@ub20azure:~$ sudo docker stats
    CONTAINER ID        NAME                               CPU %               MEM USAGE / LIMIT     MEM %               NET I/O             BLOCK I/O           PIDS
    2d4b96b675e2        vhost-noty.propovednik.com         0.03%               49.43MiB / 15.64GiB   0.31%               120MB / 11GB        5.8GB / 0B          5
    ...

From the object you gave me:

   "memory_stats":{     
   "usage":3898925056,
      "max_usage":3906424832,
      "stats":{
         "active_anon":27553792,
         "active_file":527630336,
         "cache":3847147520,
         "dirty":0,
         "hierarchical_memory_limit":9223372036854771712,
         "hierarchical_memsw_limit":0,
         "inactive_anon":135168,
         "inactive_file":3316396032,
         "mapped_file":3244032,
         "pgfault":9168522,
         "pgmajfault":0,
         "pgpgin":7858719,
         "pgpgout":6914968,
         "rss":24752128,
         "rss_huge":6291456,
         "total_active_anon":27553792,
         "total_active_file":527630336,
         "total_cache":3847147520,
         "total_dirty":0,
         "total_inactive_anon":135168,
         "total_inactive_file":3316396032,
         "total_mapped_file":3244032,
         "total_pgfault":9168522,
         "total_pgmajfault":0,
         "total_pgpgin":7858719,
         "total_pgpgout":6914968,
         "total_rss":24752128,
         "total_rss_huge":6291456,
         "total_unevictable":0,
         "total_writeback":0,
         "unevictable":0,
         "writeback":0
      },
   },

It appears that docker itself is reporting nearly 4GB of memory usage: "usage":3898925056

This is the raw value that Elasticsearch is reporting. However, after doing some digging, it looks like docker stats is reporting a different value:

float64(mem.Usage - mem.Stats["cache"])

Using the stats that you gave me, I get:

3898925056 - 3847147520 = 51777536
51777536 bytes ≈ 49.38MiB

Which is about what you got from docker stats.
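
To make that concrete, here is a small standalone sketch of mine (not code from Metricbeat or docker) that applies the same subtraction to the numbers above:

    package main

    import "fmt"

    func main() {
        // Values copied from the memory_stats object above.
        usage := 3898925056.0 // raw cgroup usage, the value Metricbeat reports
        cache := 3847147520.0 // page cache counted against the cgroup

        const mib = 1024 * 1024

        fmt.Printf("raw usage:     %.2f MiB\n", usage/mib)         // ~3718 MiB
        fmt.Printf("usage - cache: %.2f MiB\n", (usage-cache)/mib) // ~49.4 MiB, close to docker stats
    }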

I'm going to look into this more and see if we can or should report memory usage the same way docker stats does.
