Metricbeat 7.9.1 stopping or getting killed

Hi

Metricbeat stops sending metrics to Elastic. It looks like it is getting killed internally.

Logs are below:
2020-10-06T08:02:36.529+0100 INFO [monitoring] log/log.go:145 Non-zero metrics in the last 30s {"monitoring": {"metrics": {"beat":{"cpu":{"system":{"ticks":11530,"time":{"ms":490}},"total":{"ticks":19150,"time":{"ms":784},"value":19150},"user":{"ticks":7620,"time":{"ms":294}}},"handles":{"limit":{"hard":16000,"soft":16000},"open":11},"info":{"ephemeral_id":"5d72ce05-f6ff-4075-9514-8c84b838ebbc","uptime":{"ms":781712}},"memstats":{"gc_next":21988048,"memory_alloc":11042136,"memory_total":2006061880,"rss":1052672},"runtime":{"goroutines":71}},"libbeat":{"config":{"module":{"running":3},"scans":3},"output":{"events":{"acked":65,"batches":3,"total":65}},"outputs":{"kafka":{"bytes_read":392,"bytes_write":27806}},"pipeline":{"clients":10,"events":{"active":0,"published":65,"total":65},"queue":{"acked":65}}},"metricbeat":{"system":{"cpu":{"events":3,"success":3},"filesystem":{"events":16,"success":16},"fsstat":{"events":1,"success":1},"load":{"events":3,"success":3},"memory":{"events":3,"success":3},"network":{"events":6,"success":6},"process":{"events":27,"success":27},"process_summary":{"events":3,"success":3},"socket_summary":{"events":3,"success":3}}},"system":{"load":{"1":0.02,"15":0.13,"5":0.09,"norm":{"1":0.0013,"15":0.0081,"5":0.0056}}}}}}
2020-10-06T08:03:06.529+0100 INFO [monitoring] log/log.go:145 Non-zero metrics in the last 30s {"monitoring": {"metrics": {"beat":{"cpu":{"system":{"ticks":12020,"time":{"ms":494}},"total":{"ticks":19960,"time":{"ms":812},"value":19960},"user":{"ticks":7940,"time":{"ms":318}}},"handles":{"limit":{"hard":16000,"soft":16000},"open":11},"info":{"ephemeral_id":"5d72ce05-f6ff-4075-9514-8c84b838ebbc","uptime":{"ms":811711}},"memstats":{"gc_next":17943760,"memory_alloc":16464696,"memory_total":2088625008,"rss":-1150976},"runtime":{"goroutines":71}},"libbeat":{"config":{"module":{"running":3},"scans":3},"output":{"events":{"acked":49,"batches":3,"total":49}},"outputs":{"kafka":{"bytes_read":336,"bytes_write":25192}},"pipeline":{"clients":10,"events":{"active":0,"published":49,"total":49},"queue":{"acked":49}}},"metricbeat":{"system":{"cpu":{"events":3,"success":3},"load":{"events":3,"success":3},"memory":{"events":3,"success":3},"network":{"events":6,"success":6},"process":{"events":28,"success":28},"process_summary":{"events":3,"success":3},"socket_summary":{"events":3,"success":3}}},"system":{"load":{"1":0.16,"15":0.14,"5":0.11,"norm":{"1":0.01,"15":0.0088,"5":0.0069}}}}}}
2020-10-06T08:03:06.792+0100 INFO cfgfile/reload.go:190 Dynamic config reloader stopped
2020-10-06T08:03:06.792+0100 INFO [reload] cfgfile/list.go:124 Stopping 3 runners ...
2020-10-06T08:03:06.862+0100 INFO [monitoring] log/log.go:153 Total non-zero metrics {"monitoring": {"metrics": {"beat":{"cpu":{"system":{"ticks":12020,"time":{"ms":12029}},"total":{"ticks":19960,"time":{"ms":19975},"value":19960},"user":{"ticks":7940,"time":{"ms":7946}}},"handles":{"limit":{"hard":16000,"soft":16000},"open":11},"info":{"ephemeral_id":"5d72ce05-f6ff-4075-9514-8c84b838ebbc","uptime":{"ms":812044}},"memstats":{"gc_next":17943760,"memory_alloc":16786608,"memory_total":2088946920,"rss":88408064},"runtime":{"goroutines":27}},"libbeat":{"config":{"module":{"running":3,"starts":3},"reloads":1,"scans":70},"output":{"events":{"acked":1346,"batches":70,"total":1346},"type":"kafka"},"outputs":{"kafka":{"bytes_read":8412,"bytes_write":593765}},"pipeline":{"clients":0,"events":{"active":0,"published":1346,"retry":35,"total":1346},"queue":{"acked":1346}}},"system":{"cpu":{"cores":16},"load":{"1":0.16,"15":0.14,"5":0.11,"norm":{"1":0.01,"15":0.0088,"5":0.0069}}}}}}
2020-10-06T08:03:06.862+0100 INFO [monitoring] log/log.go:154 Uptime: 13m32.046111342s
2020-10-06T08:03:06.862+0100 INFO [monitoring] log/log.go:131 Stopping metrics logging.
2020-10-06T08:03:06.862+0100 INFO instance/beat.go:456 metricbeat stopped.

@Prashant_Achari, if you enable debug logging, do you see any information there on why it could be stopping? Also, have you noticed whether Metricbeat stops at a specific interval, e.g. every hour or right after you start it up?

The time before it goes down is not the same each time, but it does not crash during startup.
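
To check whether the stops follow any pattern, the `Uptime:` lines from each run can be compared directly. Below is a minimal sketch, assuming the duration is printed in the Go-style format seen in the logs in this thread (for example `13m32.046111342s`). The helper name `uptime_seconds` is made up for illustration and is not part of Metricbeat.

```python
import re

# One component of a Go-style duration, e.g. "1h", "13m" or "32.046111342s".
# "ms" is listed before "m" so that milliseconds are not misread as minutes.
_PART = re.compile(r"(\d+(?:\.\d+)?)(h|ms|m|s)")
_SECONDS = {"h": 3600.0, "m": 60.0, "s": 1.0, "ms": 0.001}


def uptime_seconds(log_line):
    """Return the uptime in seconds from an 'Uptime: ...' log line, or None."""
    marker = "Uptime: "
    idx = log_line.find(marker)
    if idx == -1:
        return None  # not an uptime line
    duration = log_line[idx + len(marker):].strip()
    # Sum every component, converted to seconds.
    return sum(float(value) * _SECONDS[unit]
               for value, unit in _PART.findall(duration))


if __name__ == "__main__":
    line = ("2020-10-06T08:03:06.862+0100 INFO [monitoring] "
            "log/log.go:154 Uptime: 13m32.046111342s")
    print(uptime_seconds(line))
```

Running this over the final `Uptime:` line of each run gives one number per run, which makes it easier to see whether the stops cluster around a fixed interval.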

Logs after enabling debug logging:

2020-10-07T07:33:22.209+0100 DEBUG [service] service/service.go:56 Received sighup, stopping
2020-10-07T07:33:22.210+0100 INFO cfgfile/reload.go:227 Dynamic config reloader stopped
2020-10-07T07:33:22.210+0100 INFO [reload] cfgfile/list.go:124 Stopping 3 runners ...
2020-10-07T07:33:22.210+0100 DEBUG [reload] cfgfile/list.go:135 Stopping runner: RunnerGroup{system [metricsets=1], system [metricsets=1], system [metricsets=1], system [metricsets=1], system [metricsets=1], system [metricsets=1], system [metricsets=1]}
2020-10-07T07:33:22.210+0100 DEBUG [publisher] pipeline/client.go:157 client: closing acker
2020-10-07T07:33:22.210+0100 DEBUG [reload] cfgfile/list.go:135 Stopping runner: RunnerGroup{system [metricsets=1], system [metricsets=1]}
2020-10-07T07:33:22.210+0100 DEBUG [publisher] pipeline/client.go:162 client: done closing acker
2020-10-07T07:33:22.210+0100 DEBUG [reload] cfgfile/list.go:135 Stopping runner: RunnerGroup{system [metricsets=1]}
2020-10-07T07:33:22.210+0100 DEBUG [module] module/wrapper.go:214 Stopped metricSetWrapper[module=system, name=filesystem, host=]
2020-10-07T07:33:22.210+0100 DEBUG [publisher] pipeline/client.go:164 client: unlink from queue
2020-10-07T07:33:22.210+0100 DEBUG [publisher] pipeline/client.go:157 client: closing acker
2020-10-07T07:33:22.210+0100 DEBUG [publisher] pipeline/client.go:177 client: cancelled 0 events
2020-10-07T07:33:22.210+0100 DEBUG [module] module/wrapper.go:214 Stopped metricSetWrapper[module=system, name=uptime, host=]
2020-10-07T07:33:22.210+0100 DEBUG [publisher] pipeline/client.go:157 client: closing acker
2020-10-07T07:33:22.210+0100 DEBUG [publisher] pipeline/client.go:162 client: done closing acker
2020-10-07T07:33:22.210+0100 DEBUG [publisher] pipeline/client.go:164 client: unlink from queue
2020-10-07T07:33:22.210+0100 DEBUG [module] module/wrapper.go:155 Stopped Wrapper[name=system, len(metricSetWrappers)=1]
2020-10-07T07:33:22.210+0100 DEBUG [publisher] pipeline/client.go:162 client: done closing acker
2020-10-07T07:33:22.210+0100 DEBUG [publisher] pipeline/client.go:164 client: unlink from queue
2020-10-07T07:33:22.210+0100 DEBUG [publisher] pipeline/client.go:177 client: cancelled 0 events
2020-10-07T07:33:22.210+0100 DEBUG [publisher] pipeline/client.go:166 client: done unlink
2020-10-07T07:33:22.210+0100 DEBUG [publisher] pipeline/client.go:177 client: cancelled 0 events
2020-10-07T07:33:22.210+0100 DEBUG [publisher] pipeline/client.go:166 client: done unlink
2020-10-07T07:33:22.210+0100 DEBUG [publisher] pipeline/client.go:166 client: done unlink
2020-10-07T07:33:22.210+0100 DEBUG [module] module/wrapper.go:214 Stopped metricSetWrapper[module=system, name=cpu, host=]
2020-10-07T07:33:22.210+0100 DEBUG [publisher] pipeline/client.go:157 client: closing acker
2020-10-07T07:33:22.210+0100 DEBUG [publisher] pipeline/client.go:162 client: done closing acker
2020-10-07T07:33:22.210+0100 DEBUG [publisher] pipeline/client.go:164 client: unlink from queue
2020-10-07T07:33:22.210+0100 DEBUG [reload] cfgfile/list.go:137 Stopped runner: RunnerGroup{system [metricsets=1]}
2020-10-07T07:33:22.210+0100 DEBUG [module] module/wrapper.go:155 Stopped Wrapper[name=system, len(metricSetWrappers)=1]
2020-10-07T07:33:22.210+0100 DEBUG [publisher] pipeline/client.go:177 client: cancelled 0 events
2020-10-07T07:33:22.210+0100 DEBUG [module] module/wrapper.go:155 Stopped Wrapper[name=system, len(metricSetWrappers)=1]
2020-10-07T07:33:22.210+0100 DEBUG [publisher] pipeline/client.go:166 client: done unlink
2020-10-07T07:33:22.210+0100 DEBUG [module] module/wrapper.go:214 Stopped metricSetWrapper[module=system, name=fsstat, host=]
2020-10-07T07:33:22.210+0100 DEBUG [publisher] pipeline/client.go:157 client: closing acker
2020-10-07T07:33:22.210+0100 DEBUG [publisher] pipeline/client.go:162 client: done closing acker
2020-10-07T07:33:22.210+0100 DEBUG [publisher] pipeline/client.go:164 client: unlink from queue
2020-10-07T07:33:22.210+0100 DEBUG [reload] cfgfile/list.go:137 Stopped runner: RunnerGroup{system [metricsets=1], system [metricsets=1]}
2020-10-07T07:33:22.210+0100 DEBUG [publisher] pipeline/client.go:177 client: cancelled 0 events
2020-10-07T07:33:22.210+0100 DEBUG [publisher] pipeline/client.go:166 client: done unlink
2020-10-07T07:33:22.210+0100 DEBUG [module] module/wrapper.go:155 Stopped Wrapper[name=system, len(metricSetWrappers)=1]
2020-10-07T07:33:22.210+0100 DEBUG [module] module/wrapper.go:214 Stopped metricSetWrapper[module=system, name=load, host=]
2020-10-07T07:33:22.210+0100 DEBUG [module] module/wrapper.go:155 Stopped Wrapper[name=system, len(metricSetWrappers)=1]
2020-10-07T07:33:22.210+0100 DEBUG [publisher] pipeline/client.go:157 client: closing acker
2020-10-07T07:33:22.210+0100 DEBUG [publisher] pipeline/client.go:162 client: done closing acker
2020-10-07T07:33:22.210+0100 DEBUG [publisher] pipeline/client.go:164 client: unlink from queue
2020-10-07T07:33:22.210+0100 DEBUG [publisher] pipeline/client.go:177 client: cancelled 0 events
2020-10-07T07:33:22.210+0100 DEBUG [publisher] pipeline/client.go:166 client: done unlink
2020-10-07T07:33:22.210+0100 DEBUG [module] module/wrapper.go:214 Stopped metricSetWrapper[module=system, name=memory, host=]
2020-10-07T07:33:22.210+0100 DEBUG [module] module/wrapper.go:155 Stopped Wrapper[name=system, len(metricSetWrappers)=1]
2020-10-07T07:33:22.210+0100 DEBUG [publisher] pipeline/client.go:157 client: closing acker
2020-10-07T07:33:22.210+0100 DEBUG [publisher] pipeline/client.go:162 client: done closing acker
2020-10-07T07:33:22.210+0100 DEBUG [publisher] pipeline/client.go:164 client: unlink from queue
2020-10-07T07:33:22.210+0100 DEBUG [module] module/wrapper.go:214 Stopped metricSetWrapper[module=system, name=network, host=]
2020-10-07T07:33:22.210+0100 DEBUG [publisher] pipeline/client.go:177 client: cancelled 0 events
2020-10-07T07:33:22.210+0100 DEBUG [module] module/wrapper.go:155 Stopped Wrapper[name=system, len(metricSetWrappers)=1]
2020-10-07T07:33:22.210+0100 DEBUG [publisher] pipeline/client.go:166 client: done unlink
2020-10-07T07:33:22.210+0100 DEBUG [publisher] pipeline/client.go:157 client: closing acker
2020-10-07T07:33:22.211+0100 DEBUG [publisher] pipeline/client.go:162 client: done closing acker
2020-10-07T07:33:22.211+0100 DEBUG [publisher] pipeline/client.go:164 client: unlink from queue
2020-10-07T07:33:22.211+0100 DEBUG [publisher] pipeline/client.go:177 client: cancelled 0 events
2020-10-07T07:33:22.211+0100 DEBUG [publisher] pipeline/client.go:166 client: done unlink
2020-10-07T07:33:22.211+0100 DEBUG [module] module/wrapper.go:214 Stopped metricSetWrapper[module=system, name=process, host=]
2020-10-07T07:33:22.211+0100 DEBUG [module] module/wrapper.go:155 Stopped Wrapper[name=system, len(metricSetWrappers)=1]
2020-10-07T07:33:22.211+0100 DEBUG [publisher] pipeline/client.go:157 client: closing acker
2020-10-07T07:33:22.211+0100 DEBUG [publisher] pipeline/client.go:162 client: done closing acker
2020-10-07T07:33:22.211+0100 DEBUG [publisher] pipeline/client.go:164 client: unlink from queue
2020-10-07T07:33:22.211+0100 DEBUG [publisher] pipeline/client.go:177 client: cancelled 0 events
2020-10-07T07:33:22.211+0100 DEBUG [publisher] pipeline/client.go:166 client: done unlink
2020-10-07T07:33:22.211+0100 DEBUG [module] module/wrapper.go:214 Stopped metricSetWrapper[module=system, name=process_summary, host=]
2020-10-07T07:33:22.211+0100 DEBUG [module] module/wrapper.go:155 Stopped Wrapper[name=system, len(metricSetWrappers)=1]
2020-10-07T07:33:22.211+0100 DEBUG [publisher] pipeline/client.go:157 client: closing acker
2020-10-07T07:33:22.211+0100 DEBUG [publisher] pipeline/client.go:162 client: done closing acker
2020-10-07T07:33:22.211+0100 DEBUG [publisher] pipeline/client.go:164 client: unlink from queue
2020-10-07T07:33:22.211+0100 DEBUG [publisher] pipeline/client.go:177 client: cancelled 0 events
2020-10-07T07:33:22.211+0100 DEBUG [publisher] pipeline/client.go:166 client: done unlink
2020-10-07T07:33:22.211+0100 DEBUG [module] module/wrapper.go:214 Stopped metricSetWrapper[module=system, name=socket_summary, host=]
2020-10-07T07:33:22.211+0100 DEBUG [module] module/wrapper.go:155 Stopped Wrapper[name=system, len(metricSetWrappers)=1]
2020-10-07T07:33:22.211+0100 DEBUG [reload] cfgfile/list.go:137 Stopped runner: RunnerGroup{system [metricsets=1], system [metricsets=1], system [metricsets=1], system [metricsets=1], system [metricsets=1], system [metricsets=1], system [metricsets=1]}
2020-10-07T07:33:22.213+0100 INFO [monitoring] log/log.go:153 Total non-zero metrics {"monitoring": {"metrics": {"beat":{"cpu":{"system":{"ticks":47190,"time":{"ms":47195}},"total":{"ticks":88170,"time":{"ms":88177},"value":88170},"user":{"ticks":40980,"time":{"ms":40982}}},"handles":{"limit":{"hard":65535,"soft":65535},"open":13},"info":{"ephemeral_id":"e7df2ece-af5d-47a4-b47b-87ef8d87d392","uptime":{"ms":3714887}},"memstats":{"gc_next":24599792,"memory_alloc":18565976,"memory_total":8491573464,"rss":65003520},"runtime":{"goroutines":27}},"libbeat":{"config":{"module":{"running":3,"starts":3},"reloads":1,"scans":1},"output":{"events":{"acked":6271,"batches":360,"total":6271},"type":"kafka"},"outputs":{"kafka":{"bytes_read":41700,"bytes_write":2286996}},"pipeline":{"clients":0,"events":{"active":25,"published":6296,"retry":35,"total":6296},"queue":{"acked":6271}}},"metricbeat":{"system":{"cpu":{"events":361,"success":361},"filesystem":{"events":976,"success":976},"fsstat":{"events":61,"success":61},"load":{"events":361,"success":361},"memory":{"events":361,"success":361},"network":{"events":722,"success":722},"process":{"events":2727,"success":2727},"process_summary":{"events":361,"success":361},"socket_summary":{"events":361,"success":361},"uptime":{"events":5,"success":5}}},"system":{"cpu":{"cores":16},"load":{"1":0.36,"15":0.24,"5":0.27,"norm":{"1":0.0225,"15":0.015,"5":0.0169}}}}}}
2020-10-07T07:33:22.213+0100 INFO [monitoring] log/log.go:154 Uptime: 1h1m54.88841668s
2020-10-07T07:33:22.213+0100 INFO [monitoring] log/log.go:131 Stopping metrics logging.
2020-10-07T07:33:22.213+0100 INFO instance/beat.go:456 metricbeat stopped.
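
The first debug line above, `Received sighup, stopping`, suggests the shutdown was triggered by a signal sent to the process rather than by an internal crash. SIGHUP is commonly delivered when the terminal session that started a process closes, so how Metricbeat is launched may be worth checking. That is an assumption about this environment, not something the logs confirm. The sketch below pulls such lines out of a debug log so the stop times can be lined up against other events on the host. It assumes the line format shown in this thread, and `find_stop_signals` is a hypothetical helper name.

```python
import re

# Matches the debug line printed in the logs above, e.g.
# "<timestamp> DEBUG [service] service/service.go:56 Received sighup, stopping"
_SIGNAL = re.compile(
    r"^(\S+)\s+DEBUG\s+\[service\]\s+\S+\s+Received (\w+), stopping"
)


def find_stop_signals(lines):
    """Return (timestamp, signal_name) pairs for each matching log line."""
    hits = []
    for line in lines:
        match = _SIGNAL.match(line)
        if match:
            hits.append((match.group(1), match.group(2)))
    return hits


if __name__ == "__main__":
    sample = [
        "2020-10-07T07:33:22.209+0100 DEBUG [service] service/service.go:56 Received sighup, stopping",
        "2020-10-07T07:33:22.210+0100 INFO cfgfile/reload.go:227 Dynamic config reloader stopped",
    ]
    for timestamp, signal_name in find_stop_signals(sample):
        print(timestamp, signal_name)
```

To use it on a real file, pass the open file object (an iterable of lines) to `find_stop_signals`. The log path depends on the installation, so it is left out here.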