Metricbeat v7.3.1 memory leak

Hi,
We are using ELK 7.3.1 installed on a Kubernetes cluster. For Metricbeat we use the official Docker image, in this case docker.elastic.co/beats/metricbeat-oss:7.3.1.

Ever since migrating to this new version, Metricbeat shows a constant increase in memory usage without ever releasing memory. Once it reaches its limit, it is restarted by Kubernetes. Does anyone else have this problem, and how could it be resolved?
Thanks


Hi @Milan_Todorovic :slightly_smiling_face:

Can you share the configuration you were using and the modules that were activated? There are no code differences in the OSS version; it just leaves out some modules, which should not impact performance.

For processors, we are using add_docker_metadata.
We also set cleanup_timeout: 1s.
For modules, we are using the logstash, elasticsearch, jolokia, docker, and kubernetes modules, all sending data at an interval of 30s.
The jolokia module is configured under autodiscover, like this:
metricbeat.autodiscover:
  providers:
    - type: kubernetes
      ...
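
For completeness, here is a rough sketch of what this setup looks like. Only the module names, the 30s period, and cleanup_timeout: 1s come from the description above; the hosts, metricsets, labels, and ports below are placeholders, not our real values.

metricbeat.modules:
  # Placeholders throughout; only the module list and the 30s period match our setup.
  - module: elasticsearch
    metricsets: ["node", "node_stats"]
    period: 30s
    hosts: ["http://elasticsearch:9200"]
  - module: logstash
    metricsets: ["node", "node_stats"]
    period: 30s
    hosts: ["logstash:9600"]
  - module: docker
    metricsets: ["container", "cpu", "memory", "network"]
    period: 30s
    hosts: ["unix:///var/run/docker.sock"]
  - module: kubernetes
    metricsets: ["node", "pod", "container"]
    period: 30s
    hosts: ["localhost:10255"]

metricbeat.autodiscover:
  providers:
    - type: kubernetes
      templates:
        - condition:
            contains:
              kubernetes.labels.app: "my-java-app"        # placeholder label
          config:
            - module: jolokia
              metricsets: ["jmx"]
              period: 30s
              hosts: ["${data.host}:8778"]                # placeholder Jolokia port
              path: "/jolokia/?ignoreErrors=true&canonicalNaming=false"
              namespace: "jvm"
              jmx.mappings:
                - mbean: "java.lang:type=Memory"
                  attributes:
                    - attr: HeapMemoryUsage
                      field: memory.heap_usage

processors:
  - add_docker_metadata:
      cleanup_timeout: 1s
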
Thanks for any ideas.

Some logs would be useful, along with more information about how high this memory limit is. Modules like Jolokia can take a huge amount of memory depending on how many metrics you have configured. The same goes for the Docker module (it depends on the number of containers).

Can you provide some metrics about this? I'm not sure if the cleanup_timeout parameter is too aggressive. Have you tried raising it a bit, to 10 seconds for example, just to check whether the memory still behaves like this?
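
Something like this, purely as an illustration (the same processor you already have, only the timeout changed):

processors:
  - add_docker_metadata:
      # less aggressive cleanup, just to see whether the memory growth slows down
      cleanup_timeout: 10s
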

By the way, have you tried, just as an experiment, using the Metricbeat image shipped with the Basic license? Just to check whether that solves the problem (it shouldn't, but just in case).

The logs do not contain anything out of the ordinary. Also, this "memory leak" is not a sudden jump; it takes days to reach our defined memory limit. The upper limit in this case is 400MB. On our previous version, 6.8, we did not experience this problem and Metricbeat used 200MB constantly. The problem only emerged after migrating to version 7.3.1.
Also, we have a small number of containers that really can't overrun Metricbeat.
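
For reference, the limit is applied on the Metricbeat DaemonSet roughly like this (a sketch only; the container name and the request value are illustrative, the ~400MB ceiling is the limit described above):

containers:
  - name: metricbeat                                     # illustrative name
    image: docker.elastic.co/beats/metricbeat-oss:7.3.1
    resources:
      requests:
        memory: "200Mi"   # roughly what Metricbeat consumed on 6.8
      limits:
        memory: "400Mi"   # the ceiling we hit on 7.3.1, after which Kubernetes restarts the pod
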

2019-10-30T01:41:01.195-0700 INFO instance/beat.go:606 Home path: [/usr/share/metricbeat] Config path: [/usr/share/metricbeat] Data path: [/usr/share/metricbeat/data] Logs path: [/usr/share/metricbeat/logs]
2019-10-30T01:41:01.197-0700 INFO instance/beat.go:614 Beat ID: b041c16f-b921-4375-b7ac-ee950472db40
2019-10-30T01:41:01.494-0700 INFO [seccomp] seccomp/seccomp.go:124 Syscall filter successfully installed
2019-10-30T01:41:01.495-0700 INFO [beat] instance/beat.go:902 Beat info {"system_info": {"beat": {"path": {"config": "/usr/share/metricbeat", "data": "/usr/share/metricbeat/data", "home": "/usr/share/metricbeat", "logs": "/usr/share/metricbeat/logs"}, "type": "metricbeat", "uuid": "b041c16f-b921-4375-b7ac-ee950472db40"}}}
2019-10-30T01:41:01.495-0700 INFO [beat] instance/beat.go:911 Build info {"system_info": {"build": {"commit": "a4be71b90ce3e3b8213b616adfcd9e455513da45", "libbeat": "7.3.1", "time": "2019-08-19T19:20:02.000Z", "version": "7.3.1"}}}
2019-10-30T01:41:01.495-0700 INFO [beat] instance/beat.go:914 Go runtime info {"system_info": {"go": {"os":"linux","arch":"amd64","max_procs":88,"version":"go1.12.4"}}}
2019-10-30T01:41:01.503-0700 INFO [beat] instance/beat.go:918 Host info {"system_info": {"host": {"architecture":"x86_64","boot_time":"2019-05-20T02:24:20-07:00","containerized":false,"name":"metricbeat-mbpm4","ip":["127.0.0.1/8","10.244.5.192/24"],"kernel_version":"4.19.15-1.1.el7.x86_64","mac":["0a:58:0a:f4:05:c0"],"os":{"family":"redhat","platform":"centos","name":"CentOS Linux","version":"7 (Core)","major":7,"minor":6,"patch":1810,"codename":"Core"},"timezone":"PDT","timezone_offset_sec":-25200}}}
2019-10-30T01:41:01.503-0700 INFO [beat] instance/beat.go:947 Process info {"system_info": {"process": {"capabilities": {"inheritable":["chown","dac_override","fowner","fsetid","kill","setgid","setuid","setpcap","net_bind_service","net_raw","sys_chroot","mknod","audit_write","setfcap"],"permitted":["chown","dac_override","fowner","fsetid","kill","setgid","setuid","setpcap","net_bind_service","net_raw","sys_chroot","mknod","audit_write","setfcap"],"effective":["chown","dac_override","fowner","fsetid","kill","setgid","setuid","setpcap","net_bind_service","net_raw","sys_chroot","mknod","audit_write","setfcap"],"bounding":["chown","dac_override","fowner","fsetid","kill","setgid","setuid","setpcap","net_bind_service","net_raw","sys_chroot","mknod","audit_write","setfcap"],"ambient":null}, "cwd": "/usr/share/metricbeat", "exe": "/usr/share/metricbeat/metricbeat", "name": "metricbeat", "pid": 1, "ppid": 0, "seccomp": {"mode":"filter","no_new_privs":true}, "start_time": "2019-10-30T01:41:00.140-0700"}}}
2019-10-30T01:41:01.503-0700 INFO instance/beat.go:292 Setup Beat: metricbeat; Version: 7.3.1
2019-10-30T01:41:01.503-0700 INFO [index-management] idxmgmt/std.go:178 Set output.elasticsearch.index to 'metricbeat-7.3.1' as ILM is enabled.
2019-10-30T01:41:01.504-0700 INFO elasticsearch/client.go:170 Elasticsearch url: http://elasticsearch:9200
2019-10-30T01:41:01.504-0700 INFO [publisher] pipeline/module.go:97 Beat name: metricbeat-mbpm4
2019-10-30T01:41:01.505-0700 INFO kubernetes/util.go:86 kubernetes: Using pod name metricbeat-mbpm4 and namespace default to discover kubernetes node
2019-10-30T01:41:01.610-0700 INFO kubernetes/util.go:93 kubernetes: Using node <our_node> discovered by in cluster pod node query
2019-10-30T01:41:01.611-0700 INFO kubernetes/util.go:86 kubernetes: Using pod name metricbeat-mbpm4 and namespace default to discover kubernetes node
2019-10-30T01:41:01.626-0700 INFO kubernetes/util.go:93 kubernetes: Using node <our_node> discovered by in cluster pod node query
2019-10-30T01:41:01.628-0700 WARN [cfgwarn] kubernetes/kubernetes.go:55 BETA: The kubernetes autodiscover is beta
2019-10-30T01:41:01.629-0700 INFO kubernetes/util.go:86 kubernetes: Using pod name metricbeat-mbpm4 and namespace default to discover kubernetes node
2019-10-30T01:41:01.699-0700 INFO kubernetes/util.go:93 kubernetes: Using node <our_node> discovered by in cluster pod node query
2019-10-30T01:41:01.699-0700 INFO [monitoring] log/log.go:118 Starting metrics logging every 30s
2019-10-30T01:41:01.699-0700 INFO instance/beat.go:421 metricbeat start running.
2019-10-30T01:41:01.699-0700 INFO [autodiscover] autodiscover/autodiscover.go:105 Starting autodiscover manager
2019-10-30T01:41:01.700-0700 INFO kubernetes/watcher.go:182 kubernetes: Performing a resource sync for *v1.PodList
2019-10-30T01:41:02.199-0700 INFO kubernetes/watcher.go:198 kubernetes: Resource sync done
2019-10-30T01:41:02.199-0700 INFO kubernetes/watcher.go:242 kubernetes: Watching API for resource events
2019-10-30T01:41:02.796-0700 INFO pipeline/output.go:95 Connecting to backoff(elasticsearch(http://elasticsearch:9200))
2019-10-30T01:41:02.802-0700 INFO elasticsearch/client.go:743 Attempting to connect to Elasticsearch version 7.3.1
2019-10-30T01:41:03.002-0700 INFO template/load.go:169 Existing template will be overwritten, as overwrite is enabled.
2019-10-30T01:41:03.919-0700 INFO template/load.go:108 Try loading template metricbeat-7.3.1 to Elasticsearch
2019-10-30T01:41:04.264-0700 INFO template/load.go:100 template with name 'metricbeat-7.3.1' loaded.
2019-10-30T01:41:04.264-0700 INFO [index-management] idxmgmt/std.go:289 Loaded index template.
2019-10-30T01:41:04.265-0700 INFO pipeline/output.go:105 Connection to backoff(elasticsearch(http://elasticsearch:9200)) established
2019-10-30T01:41:05.687-0700 INFO kubernetes/watcher.go:182 kubernetes: Performing a resource sync for *v1.PodList
2019-10-30T01:41:05.693-0700 INFO kubernetes/watcher.go:198 kubernetes: Resource sync done
2019-10-30T01:41:05.694-0700 INFO kubernetes/watcher.go:242 kubernetes: Watching API for resource events
2019-10-30T01:41:06.739-0700 INFO kubernetes/watcher.go:182 kubernetes: Performing a resource sync for *v1.PodList
2019-10-30T01:41:06.745-0700 INFO kubernetes/watcher.go:198 kubernetes: Resource sync done
2019-10-30T01:41:06.746-0700 INFO kubernetes/watcher.go:242 kubernetes: Watching API for resource events
2019-10-30T01:41:10.563-0700 INFO kubernetes/watcher.go:182 kubernetes: Performing a resource sync for *v1.NodeList
2019-10-30T01:41:10.578-0700 INFO kubernetes/watcher.go:198 kubernetes: Resource sync done
2019-10-30T01:41:10.578-0700 INFO kubernetes/watcher.go:242 kubernetes: Watching API for resource events
2019-10-30T01:41:31.705-0700 INFO [monitoring] log/log.go:145 Non-zero metrics in the last 30s

Can you try upgrading to 7.4? It seems this was a known issue and it has since been fixed:

From the link you provided, it seems this fix is planned for Metricbeat 7.5, not 7.4. Am I correct about this? Thanks

7.4.2, 7.3.3 and 7.5.0 yes :slightly_smiling_face:
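
So the upgrade amounts to bumping the image tag on the DaemonSet to one of those releases, for example (a sketch; adjust the tag and edition to your own setup):

containers:
  - name: metricbeat
    image: docker.elastic.co/beats/metricbeat-oss:7.3.3   # or 7.4.2 / 7.5.0; OSS image shown to match the original setup
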


This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.