Filebeat Not able to catch up with rotating container logs

elk_follower · February 11, 2022, 5:41am

We have a situation where container logs are rotated ~5 times within a minute, and only the last 5 files are kept before older ones are deleted.

We have Filebeat running as a DaemonSet in the Kubernetes cluster. It is able to send logs for the first few minutes, but then it starts to lag behind in reading these files.
We have observed a delay of about 12 hours, and Filebeat's memory usage grows rapidly.

We also observed that, due to the frequent log rotations, the registry file keeps growing and the number of open files keeps increasing as well.

Below is a snippet of the Filebeat logs:

There are also errors in the Filebeat logs, which I believe are due to the frequent log rotation.

I'm looking for a Filebeat configuration that I can use to read all of these log files before they are deleted.

Thanks in advance.

kvch · February 11, 2022, 11:04am

Indeed, the errors are due to the quick log rotation. There are a few small things we can do about it in Filebeat, but the root cause of the problem is usually that Elasticsearch cannot keep up with the constant flow of events from Filebeat. Please also look at your Elasticsearch instance; if there are any issues, fix them or give the instance more resources.

Also, could you please share your Filebeat configuration, so we can fine-tune it for your use case?

elk_follower · February 11, 2022, 6:17pm

Hi,

I appreciate your fast response.

I don't see any contention in Logstash or in Elasticsearch. I tried increasing the number of Logstash pods and also raised their CPU limits. I also tried increasing the refresh_interval for the index and setting the number of replicas to zero so that ingestion is faster. No luck so far.

Filebeat is able to send logs at a decent rate for 5 to 10 minutes after a restart, but it eventually lags as the number of open files keeps increasing. If Elasticsearch were causing the contention, I don't think Filebeat would be able to send logs for 15 minutes after each restart. On some nodes where log rotation is not that fast, it works fine.

Below is the filebeat configuration:

- type: container
  containers.ids:
  - "*"
  paths:
    - "/var/lib/docker/containers/*/*.log"
  multiline.pattern: '^\[|^{|^\(|^[t]=|^text|^ERROR|^INFO|^DEBUG|^level=|^[0-9]{1,3}.[0-9]{1,3}.[0-9]{1,3}.[0-9]{1,3}|^[0-9]{4}-[0-9]{2}-[0-9]{2}|^\[[0-9]{4}-[0-9]{2}-[0-9]{2}|^[0-9]{2}:[0-9]{2}:[0-9]{2}|^[0-9]{2,3}.[0-9]{2,3}.[0-9]{2,3}|^[a-zA-Z]{1}[0-9]{4}|^[a-zA-Z]{3,4}\s[a-zA-Z]{3}|^[a-zA-Z]{2,3}-[a-zA-Z]{2,5}-[0-9a-z]{2,7}'
  multiline.negate: true
  multiline.match: after
  clean_inactive: 61m          #Tried multiple values from hours to minutes 
  ignore_older: 60m            #Tried multiple values from hours to minutes 
  close_inactive: 1m            #Tried multiple values from few minutes to 10 seconds
  clean_removed: true       
  close_removed: true
  processors:
    - add_kubernetes_metadata:
        in_cluster: true
- type: log
  clean_removed: true
  paths:
    - /var/logs/*.log

output.logstash:
  hosts: ["${LOGSTASH_HOST:mon-logstash}:${LOGSTASH_PORT:5044}"] 
  pipelining: 4   #Tried default to 6
  worker: 6       #Again tried from default to 10 
  loadbalance: true

In addition to the above, I also tried tuning the following Filebeat settings:

bulk_max_size
scan_frequency
filebeat.registry.flush
ttl in output.logstash

Also, there are a couple of thousand entries in the Filebeat registry file for the rotated log files, since rotated log files are renamed to the same names.
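For readers hitting the same symptoms, here is a minimal sketch of input settings that are sometimes used to bound open file handles and registry growth under fast rotation. It is not a confirmed fix for this thread. The `close_timeout` and `harvester_limit` values are assumptions chosen for illustration, and `close_timeout` can cut a multiline event short if a harvester is closed in the middle of one.

```yaml
# Sketch only: values marked "assumed" are illustrative, not tested against this cluster.
filebeat.inputs:
  - type: container
    paths:
      - "/var/lib/docker/containers/*/*.log"
    scan_frequency: 10s      # default; how often new files are discovered
    close_inactive: 1m       # from the original config
    close_removed: true      # release the handle once the file is deleted
    clean_removed: true      # drop registry state for deleted files
    close_timeout: 5m        # assumed: give every harvester a fixed maximum lifetime
    harvester_limit: 200     # assumed: cap on concurrently open harvesters for this input
    ignore_older: 60m        # from the original config
    clean_inactive: 61m      # must stay greater than ignore_older + scan_frequency
```

Capping harvesters trades memory and file handles for latency: with a limit in place, some rotated files may wait for a free harvester and could be deleted before they are read, so the limit would need to be sized against the rotation rate.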

Logstash: 3 pods, CPU limit 4

Elasticsearch: 3 data pods, CPU limit 4

I have a few workarounds, but I want to know whether you are aware of this kind of race condition in Filebeat.

Is there a way I can see whether Logstash/Elasticsearch is putting back pressure on Filebeat? Maybe through Filebeat metrics?
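One way to look for back pressure, sketched here under the assumption that the Beats HTTP monitoring endpoint is acceptable in the pod, is to expose Filebeat's internal stats and watch the output and pipeline counters over time. The port and logging period below are illustrative choices.

```yaml
# Sketch only: enables Filebeat's local stats endpoint and periodic metrics logging.
http.enabled: true
http.host: localhost
http.port: 5066              # illustrative port

logging.metrics.enabled: true
logging.metrics.period: 30s  # illustrative period

# Query from inside the pod, for example:
#   curl -s http://localhost:5066/stats
# Counters worth comparing between two samples:
#   libbeat.pipeline.events.active   events queued but not yet acknowledged
#   libbeat.output.events.acked      events acknowledged by Logstash
#   libbeat.output.events.failed     events the output had to retry
#   filebeat.harvester.open_files    files currently held open
```

If the active pipeline count stays pinned near the queue size while the acked rate is flat, that would suggest the output side is the limit. If the queue is mostly empty while open files keep climbing, the bottleneck is more likely on the reading side.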

Also, it seems Logstash is not fully load-balanced. I tried setting the TTL, but it didn't work as expected. If you could shed some light on that as well, it would be helpful.

Once again, thank you. I'm looking forward to your response.
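A possible explanation for the TTL behavior: the Filebeat documentation for the Logstash output notes that `ttl` is not supported on an async client, meaning one with `pipelining` set. The sketch below, which assumes giving up pipelining is acceptable for throughput, disables it so that connections are periodically re-established. The `ttl` value is an illustrative assumption.

```yaml
# Sketch only: ttl applies when the client is synchronous (pipelining disabled).
output.logstash:
  hosts: ["${LOGSTASH_HOST:mon-logstash}:${LOGSTASH_PORT:5044}"]
  loadbalance: true
  worker: 6
  pipelining: 0   # disable async mode so that ttl can take effect
  ttl: 60s        # assumed value: reconnect every minute
```

With a single hostname in `hosts`, Filebeat itself sees only one endpoint, so the spread across Logstash pods depends on where each new connection lands when it is opened. Periodic reconnects give the connections a chance to be redistributed.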

system · March 11, 2022, 8:18pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
