Filebeat keeps files open and uses up disk space

Hello,

I am trying to figure out why Filebeat keeps incoming log files open. When Logstash gets backed up, Filebeat keeps the files open, which ties up disk space and eventually fills the disk. I would like a way to make Filebeat give up on delivering to Logstash and close the files. Partial data loss may happen, which is a non-issue for me.

Thank you for reading!

Filebeat keeps input files open because it is waiting for an ACK from Logstash confirming that the events were received. An input file stays open until EOF is reached, the events are acknowledged, and the state in the registry file is updated. To avoid keeping files open, use close_timeout in your input configuration.

See more: https://www.elastic.co/guide/en/beats/filebeat/current/filebeat-input-log.html#filebeat-input-log-close-timeout
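For illustration, a minimal sketch of what that could look like, using the filebeat.prospectors syntax that appears later in this thread; the path and the 5m value are placeholders, not settings taken from this thread:

filebeat.prospectors:
- type: log
  paths:
    - /var/log/example/*.log   # placeholder path
  # Close the harvester after 5 minutes even if the output has not
  # acknowledged all events yet; data not yet shipped may be lost if
  # the file is deleted in the meantime.
  close_timeout: 5m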

How can I check whether close_timeout is working? The debug log has too many messages for me to spot it, so I set close_timeout to 1m to make it easier to catch, but I still can't find the event. Logging at the "info" level does not show it either.

How do I find the close_timeout event in the log files?

For example, "Closing harvester because close_timeout was reached" or something like that? I need to confirm that it works.

Exactly. "Closing harvester because close_timeout was reached" is the line you are looking for.
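If the message is hard to spot in a busy log, you can also narrow the logging output in filebeat.yml. This is only a sketch: logging.level and logging.selectors are standard Filebeat logging options, and "harvester" is used here as an example selector.

# Show INFO and above; the close_timeout message is logged at the INFO level.
logging.level: info

# Or, while debugging, limit debug output to harvester-related messages only
# (selectors apply to debug-level logging):
#logging.level: debug
#logging.selectors: ["harvester"]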

The strange thing is that even after adding close_timeout: 5m as the last line of the config file (/etc/filebeat/filebeat.yml), the timeout never triggers and the same issue keeps happening :confused:

Do I have to add the line to filebeat.full.yml and/or filebeat.reference.yml?

My Filebeat config (personal domains etc. removed):

#=========================== Filebeat prospectors =============================
filebeat.prospectors:

- type: log
  enabled: true
  paths:
    - /xxx/xxx/xxx
  document_type: syslog

#============================= Filebeat modules ===============================

filebeat.config.modules:
  path: ${path.config}/modules.d/*.yml
  reload.enabled: true

#----------------------------- Logstash output --------------------------------
output.logstash:
  hosts: ["host.com:5043"]

#=========================== Harvester closing options ========================
close_renamed: true
close_timeout: 5m

close_timeout is a prospector option, so you need to add it under filebeat.prospectors; otherwise it has no effect. The structure of your config is half correct: close_timeout is indeed a harvester setting, but harvesters are the "worker threads" of a prospector, so you configure them in the prospectors section. Every harvester of a given prospector therefore behaves the same, while different prospectors can have different harvester settings. A corrected version of your config follows:

#=========================== Filebeat prospectors =============================
filebeat.prospectors:

- type: log
  enabled: true
  paths:
    - /xxx/xxx/xxx
  document_type: syslog
  close_renamed: true
  close_timeout: 5m

#============================= Filebeat modules ===============================

filebeat.config.modules:
  path: ${path.config}/modules.d/*.yml
  reload.enabled: true

#================================ Outputs ======================================

#----------------------------- Logstash output --------------------------------
output.logstash:
  hosts: ["host.com:5043"]

#=========================== Harvester closing options ========================

Thanks! That is probably the correct solution! I will update the thread if it works!

P.S.: Does "Closing harvester because close_timeout was reached" show up in the logs if I set the log level to "info"?

Yes, it shows up at INFO level.
