Hello,
I'm running Filebeat & ELK 5.4.1 on CentOS 7. The server where Filebeat runs hosts 2 applications whose logs get rotated weekly. Filebeat is set up to close an inactive file after 7 minutes and to remove a file from its registry if the file is deleted. Neither the close nor the removal happens, based on Filebeat's debug log & the contents of the registry. This happens on about 50% of application servers, whose Filebeat and server setups are just about identical.
As far as I can deduce, Filebeat does not recognize that a file has become inactive: the debug log never shows a file being closed after its inactivity timeout has been reached and, when that happens, none of the subsequent actions take place (removal from the registry etc.).
o- Is there any way to 'force' Filebeat to update its registry?
o- How does Filebeat determine that a file is inactive? (I am 100% sure that a file is not being updated, yet Filebeat, on SOME servers, never does a 'close inactive'.)
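To back up the claim that a file is no longer being written, its modification time can be compared with the 7-minute close_inactive window. The following is a minimal Python sketch under stated assumptions: the helper name idle_files and its parameters are hypothetical, and it only reports the filesystem's view of each file, not how Filebeat itself tracks inactivity.

```python
import glob
import os
import time


def idle_files(pattern, max_idle_seconds=7 * 60, now=None):
    """Return (path, idle_seconds) for files matching `pattern` whose
    modification time is older than `max_idle_seconds`.

    Hypothetical helper: 7 * 60 mirrors the close_inactive: 7m setting.
    """
    now = time.time() if now is None else now
    result = []
    for path in sorted(glob.glob(pattern)):
        # Seconds since the file was last modified on disk.
        idle = now - os.path.getmtime(path)
        if idle > max_idle_seconds:
            result.append((path, idle))
    return result


if __name__ == "__main__":
    # Path pattern taken from the first prospector in the config.
    for path, idle in idle_files("/ApplicationOnePath/production.log*"):
        print(f"{path}: not modified for {idle:.0f}s")
```

Any file listed here has been idle on disk for longer than the configured window, so a matching 'close inactive' message would be expected in the debug log.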
I have seen mention in the Elastic manuals that Filebeat sometimes 'waits' for a filesystem that might have become unavailable to become available again, and that this could be why it does not update its registry. This is not the case for me, as the file(s) are on disks that are permanently attached to the server. Also - I think catering for filesystems that are available intermittently should be a feature that can be disabled via a config option, as surely it is the exception rather than the rule (apologies if such an option is available).
The logrotate occurred at around 11:06 in the Filebeat log (below). Please note that the log covers 2 application log files. I do see the message 'File rename detected but harvester not finished yet' AFTER the rotate/rename has happened.
Below are Filebeat's debug log, config file & logrotate config file.
Filebeat.yml
filebeat.prospectors:
- input_type: log
  paths:
    - /ApplicationOnePath/production.log*
  document_type: ElasticIndex01
  exclude_files: ['\.gz$']
  close_inactive: 7m
  clean_removed: true
  close_removed: true
  scan_frequency: 11s
- input_type: log
  paths:
    - /ApplicationTwoPath/production.log*
  document_type: ElasticIndex02
  exclude_files: ['\.gz$']
  close_inactive: 7m
  clean_removed: true
  close_removed: true
  scan_frequency: 11s
output.logstash:
  hosts: ['10.0.202.32:5044']
  loadbalance: false
  worker: 2
logging.level: info
logging.to_files: true
logging.to_syslog: false
logging.files:
  path: /var/log/filebeat/
  name: filebeat.log
  keepfiles: 4
Logrotate config
create
dateformat -%Y%m%d
dateext
rotate 1
/ApplicationOnePath/production.log {
    create 664 AppUser AppUser
    ifempty
    missingok
    nocompress
    nocopy
    nocopytruncate
    nomail
    su AppUser AppUser
}
/ApplicationTwoPath/production.log {
    create 664 AppUser AppUser
    ifempty
    missingok
    nocompress
    nocopy
    nocopytruncate
    nomail
    su AppUser AppUser
}
I have a Filebeat debug log available but see no facility for attaching a compressed version to this ticket. I have compared this log to that of servers where Filebeat functions as documented and can confirm that, for example, messages stating that a file is being closed due to inactivity do not occur in the non-working log. Please advise if you require further log info.
We can 'force' Filebeat's registry to be 'corrected' by deleting the registry while Filebeat is running but, as this is an undocumented 'hack', I do not see that as an acceptable workaround on production servers. I have also tried restarting Filebeat without deleting the registry, without success.
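As a less drastic way of seeing what the registry still holds, its entries can be compared with the files on disk. The following is a minimal Python sketch, assuming the registry is a JSON array of objects with "source" and "FileStateOS.inode" fields (the layout I believe Filebeat 5.x writes) and that it lives at /var/lib/filebeat/registry as on a default package install; both are assumptions to verify on the server. The function name stale_registry_entries is hypothetical.

```python
import json
import os


def stale_registry_entries(registry_path):
    """Return (source, reason) for registry entries that look stale.

    Assumes a JSON array of entries with "source" and
    "FileStateOS": {"inode": ...}; this layout is an assumption.
    """
    with open(registry_path) as fh:
        entries = json.load(fh)
    stale = []
    for entry in entries:
        source = entry.get("source")
        if not source:
            continue  # nothing to compare against
        inode = entry.get("FileStateOS", {}).get("inode")
        try:
            current_inode = os.stat(source).st_ino
        except OSError:
            # The file named in the registry is no longer on disk.
            stale.append((source, "missing"))
            continue
        if inode is not None and inode != current_inode:
            # A different file now lives at this path (e.g. after rotation).
            stale.append((source, "inode changed"))
    return stale


if __name__ == "__main__":
    # Assumed default registry location; adjust to the local install.
    for source, reason in stale_registry_entries("/var/lib/filebeat/registry"):
        print(f"{reason}: {source}")
```

Entries reported as "missing" are the ones clean_removed would be expected to drop. "inode changed" entries are only candidates for inspection, since a rename during rotation can legitimately leave them in place for a while.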