I am trying to figure out why Filebeat keeps its input log files open. When Logstash backs up and stops accepting events, Filebeat holds the files open, so rotated or deleted logs keep consuming disk space until the disk eventually fills up. I would like a way to have Filebeat give up on delivering to Logstash and ensure that the files are closed. Partial data loss may happen, which is a non-issue.
Filebeat keeps input files open because it is waiting for an ACK from Logstash confirming that the events were delivered. An input file stays open until EOF is reached, the events are acknowledged, and its state in the registry file is updated. To avoid keeping files open, use close_timeout in your input configuration.
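For illustration, a minimal sketch of that kind of configuration (Filebeat 5.x prospector style; the path and host are placeholders, and 5m is an arbitrary value):

    filebeat.prospectors:
      - input_type: log
        paths:
          - /var/log/myapp/*.log     # placeholder path
        # Close each harvester 5m after it opens a file, even if
        # Logstash has not acknowledged all of its events yet.
        close_timeout: 5m

    output.logstash:
      hosts: ["localhost:5044"]      # placeholder host

The trade-off is the one you already accepted: once the harvester is closed, the file handle is released, and events that were not yet acknowledged can be lost if the file is deleted before a new harvester picks it up.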
How can I check that close_timeout is working? The debug log has too many messages for me to spot it happening. I set close_timeout to 1m so I could see it trigger, but I still can't find the event, and the "info" log level does not show it.
How do I find the close_timeout event in the log files? For example, a message like "Closing harvester because close_timeout was reached:" or something similar? I need to confirm that it works.
The strange thing is that even after adding close_timeout: 5m as the last line of the config file (/etc/filebeat/filebeat.yml), the timeout never fires and the same issues continue.
Do I have to add the line to filebeat.full.yml and/or filebeat.reference.yml?
My Filebeat config (personal domains etc. removed):
close_timeout is an option of prospectors, so you need to add it under filebeat.prospectors; otherwise it will not work. Your config structure is only half correct. It is true that close_timeout is a harvester setting, but harvesters are the "worker threads" of prospectors, so you configure them in the prospectors section. Every harvester of a given prospector then behaves the same, and different prospectors can have different harvester settings.
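As a sketch of the placement being described (path and value are placeholders), close_timeout belongs inside a prospector entry, not as a free-standing line at the end of filebeat.yml:

    filebeat.prospectors:
      - input_type: log
        paths:
          - /var/log/app/*.log   # placeholder
        close_timeout: 5m        # per-prospector harvester setting

    # A top-level close_timeout appended to the end of the file is not
    # attached to any prospector, so it does not configure the harvesters.

And no: filebeat.full.yml and filebeat.reference.yml are generated reference files that document every available option; the setting has to go into the prospectors section of filebeat.yml itself.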
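To confirm it fires, one option might be libbeat's logging selectors, which restrict debug-level output to named components; a sketch, assuming the harvester close message is emitted at debug level under the "harvester" selector:

    logging.level: debug
    # Selectors only filter debug-level messages; this keeps the debug
    # log limited to harvester activity, e.g. open/close events.
    logging.selectors: ["harvester"]

With that in place, searching the Filebeat log for "close_timeout" should show whether a harvester was closed by the timeout.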