I have a production cluster running a dockerized ELK stack. One VM hosts the entire stack, except for the Filebeat instances, which run one per VM alongside the applications whose logs we collect.
Sometimes the containers hosting Filebeat get killed by the coordinator, and after investigating, it turns out they are running out of memory. All hosts run Linux.
Checking all the Filebeats, I noticed that out of the 8 instances, 5 have between 100 and 300 open file handles, while the other 3 have over 1,500.
Looking at monitoring data from before the instances were killed, I can see they keep accumulating handles and memory until they exhaust the container's memory limit (512 MB) and get killed.
The applications we harvest logs from use regular log rotation at about 2 MB per file and have different amounts of traffic. But when I check the server's filesystem while Filebeat has around 1,500 handles open, I only see about 250 log files.
The servers with high handle counts are particularly high on traffic, so I suspect that although the logs are being rotated, Filebeat is keeping handles to the rotated-away files open to finish reading them. I need a way to confirm what is actually going on, so I know whether to increase memory or whether I have a misconfiguration.
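One way I thought of to confirm the rotated-files theory (a sketch, assuming a Linux host where `/proc` is readable; the PID lookup via `pgrep` is an assumption, substitute however you find the Filebeat PID in your container):

```shell
# Count open file descriptors for a process, and how many of them point
# at files that have already been deleted (i.e. rotated away but still held open).
count_fds() {
  ls -1 "/proc/$1/fd" | wc -l
}

count_deleted() {
  # Each entry in /proc/PID/fd is a symlink; rotated-but-still-open
  # log files show a target ending in "(deleted)".
  find "/proc/$1/fd" -type l -exec readlink {} + 2>/dev/null | grep -c '(deleted)'
}

# Hypothetical usage against a running Filebeat (requires pgrep to find it):
# PID=$(pgrep -o filebeat)
# count_fds "$PID"
# count_deleted "$PID"
```

If `count_deleted` is close to the gap between open handles and files visible on disk, that would confirm Filebeat is holding rotated files open. `lsof -p "$PID" | grep deleted` should show the same thing with file names included.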
For example, in the Monitoring tab in Kibana, both Logstash and Elasticsearch seem to be ingesting logs just fine, at under 5 ms per event and around 180 events/s, but I don't know how to tell where the bottleneck is, if there is one.
Thanks in advance!
Here is my Filebeat config:

```yaml
filebeat.autodiscover:
  providers:
    - type: docker
      templates:
        - condition:
            contains:
              docker.container.labels.elkEnabled: "true"
          config:
            - type: docker
              containers.ids:
                - "${data.docker.container.id}"
              multiline.pattern: '^[0-9]{4}-[0-9]{2}-[0-9]{2}'
              multiline.negate: true
              multiline.match: after

processors:
  - add_docker_metadata: ~

queue:
  mem:
    events: 12800
    flush.min_events: 3200

output.logstash:
  enabled: true
  hosts: '${LOGSTASH_HOSTS}'
  bulk_max_size: 3200
  worker: 2

xpack.monitoring.enabled: true
xpack.monitoring.elasticsearch:
  hosts: '${ELASTICSEARCH_HOSTS}'

logging.to_stderr: true
```