We are using Filebeat to filter postfix logs and push them to Logstash. We have installed a customized postfix module and enabled it in Filebeat:
# Module: postfix
- module: postfix
  mail:
    enabled: true
    # Set custom paths for the log files. If left empty,
    # Filebeat will choose the paths depending on your OS.
    var.paths: ${PLT_EMAIL_LOG_PATHS:[/var/log/mail-*.log]}
    input:
      ignore_older: 72h
      clean_inactive: 74h
Splunk monitors the Filebeat logs, which should contain a metrics entry every 30 seconds. We were notified that the metrics entries stopped about a day ago, and in fact nothing at all is being written to the log files anymore. The Filebeat process appears to be hung, even though 'service status' reports it as running (green).
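For anyone who wants to reproduce the "no metrics for longer than expected" check outside Splunk, here is a minimal sketch. It only looks at the modification time of a log file. The 90-second threshold (three missed 30-second intervals) and the function names are my own assumptions, not anything Filebeat provides.

```python
import os
import time


def seconds_since_last_write(path, now=None):
    """Return how many seconds ago `path` was last modified."""
    now = time.time() if now is None else now
    return now - os.stat(path).st_mtime


def is_stale(path, max_age_seconds=90, now=None):
    """True if `path` has not been written to within `max_age_seconds`.

    The default of 90 is an assumed threshold: three missed 30-second
    metrics intervals. Adjust it to your own alerting policy.
    """
    return seconds_since_last_write(path, now) > max_age_seconds
```

It would be called with the path of the Filebeat log file on the host, for example is_stale("/path/to/filebeat.log"), where the path is a placeholder for wherever your installation writes its own log.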
We also cannot run diagnostic commands against it. For example:
- lsof -Pan -p <pid>: no response after a long time
- gcore <pid>: no response after a long time either
We cannot use dlv because the filebeat binary contains no debug info.
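Since the usual tools hang, one thing that can still be read directly is the per-thread state under /proc. The sketch below collects the state letter of every thread of a process. My assumption, which is not confirmed for this case, is that threads stuck in uninterruptible sleep (state D, typically waiting on I/O) could explain why gcore cannot attach and why lsof blocks. The function names are hypothetical.

```python
import os


def parse_state(status_text):
    """Extract the state letter (R, S, D, Z, T, ...) from /proc/<pid>/status text."""
    for line in status_text.splitlines():
        if line.startswith("State:"):
            fields = line.split()
            return fields[1] if len(fields) > 1 else None
    return None


def thread_states(pid, proc_root="/proc"):
    """Map each thread ID of `pid` to its state letter.

    `proc_root` is a parameter only so the function can be pointed at a
    fake directory tree for testing; on a real host leave the default.
    """
    task_dir = os.path.join(proc_root, str(pid), "task")
    states = {}
    for tid in os.listdir(task_dir):
        try:
            with open(os.path.join(task_dir, tid, "status")) as f:
                states[tid] = parse_state(f.read())
        except OSError:
            continue  # the thread exited while we were reading
    return states
```

If any thread reports D, reading /proc/<pid>/task/<tid>/stack for that thread as root may show which kernel call it is blocked in, provided the kernel exposes that file.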
We examined the file descriptors held by filebeat and found some links to deleted files (shown in red).
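For reference, the file descriptor check can be sketched as follows. On Linux, a descriptor whose file has been unlinked shows up in /proc/<pid>/fd as a symlink whose target ends in " (deleted)". The function name is hypothetical, and the directory is passed in explicitly so it can be tried against any process.

```python
import os


def deleted_fds(fd_dir):
    """Return {fd_number: link_target} for descriptors pointing at deleted files.

    `fd_dir` is expected to be a directory such as /proc/<pid>/fd.
    """
    result = {}
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            continue  # descriptor was closed in the meantime, or is not a link
        if target.endswith(" (deleted)"):
            result[name] = target
    return result
```

Unlike lsof, this only reads the symlinks and never stats the target files, so it may still respond on a process where lsof blocks.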
Does anyone know why it might hang like this?