Agent OS: Amazon Linux 2 AMI 2.0.20210326.0 x86_64 HVM gp2
MongoDB Version: 4.0.19
Metricbeat Version: 6.6.0
I have the following configuration for the module:
- hosts: ['<internal hostname>:27017']
  metricsets: [dbstats, status, metrics]
  module: mongodb
  period: 60s
- hosts: ['<internal hostname>:27017']
  metricsets: [replstatus]
  module: mongodb
  period: 10s
This node is a secondary in a 3-node replica set and holds about 2TB of data.
I'm running into an issue where Metricbeat is taking nodes down by causing them to run out of memory.
What I can see is that its listDatabases calls take about 25 minutes to complete, but the Metricbeat agent keeps opening new connections and issuing new queries at every period tick without waiting for the earlier ones to finish. The node ends up with a large number of open connections from Metricbeat until it eventually runs out of memory and falls over.
I couldn't see any related fixes in the release notes. Is there any way to set a query timeout within Metricbeat to protect against this?
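As a stopgap, I'm considering something like the following, which moves dbstats into its own entry with a period longer than the roughly 25 minutes a call currently takes. This is an untested sketch: it assumes dbstats is the metricset issuing the listDatabases calls, and the 30m value is a guess rather than a recommendation. It also adds no timeout, so a call that runs longer than the period would still overlap the next tick.

```yaml
# Assumption: dbstats is the metricset behind the slow listDatabases calls.
- hosts: ['<internal hostname>:27017']
  metricsets: [dbstats]
  module: mongodb
  period: 30m   # guess: longer than the ~25 minute call duration observed
- hosts: ['<internal hostname>:27017']
  metricsets: [status, metrics]
  module: mongodb
  period: 60s
- hosts: ['<internal hostname>:27017']
  metricsets: [replstatus]
  module: mongodb
  period: 10s
```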
A big bottleneck in this case is the limited amount of memory on the instance, but protection against long-running queries would be a great thing to have!