I tried to reproduce on dev VM with following steps:
- configured filebeat:
- to sent do unavailable redis output
- to read some gigabytes of local logfiles
- start filebeat
- check free disk space
- check lsof of filebeat
- delete logfiles
- check lsof of filebeat
- check free disk space
Reproducing is easy:
lsof shows stuff like this (after I deleted the files):
filebeat 1795 filebeat 6r REG 253,0 2382651371 655368 /plx/jboss/test7/instance-poliks/log/monitoring.json.log.2019-11-16 (deleted)
filebeat 1795 filebeat 7r REG 253,0 2521340487 655369 /plx/jboss/test7/instance-poliks/log/monitoring.json.log.2019-11-17 (deleted)
filebeat 1795 filebeat 8r REG 253,0 1364593607 655367 /plx/jboss/test7/instance-poliks/log/monitoring.json.log (deleted)
filebeat 1795 1796 filebeat 6r REG 253,0 2382651371 655368 /plx/jboss/test7/instance-poliks/log/monitoring.json.log.2019-11-16 (deleted)
filebeat 1795 1796 filebeat 7r REG 253,0 2521340487 655369 /plx/jboss/test7/instance-poliks/log/monitoring.json.log.2019-11-17 (deleted)
filebeat 1795 1796 filebeat 8r REG 253,0 1364593607 655367 /plx/jboss/test7/instance-poliks/log/monitoring.json.log (deleted)
filebeat 1795 1797 filebeat 6r REG 253,0 2382651371 655368 /plx/jboss/test7/instance-poliks/log/monitoring.json.log.2019-11-16 (deleted)
filebeat 1795 1797 filebeat 7r REG 253,0 2521340487 655369 /plx/jboss/test7/instance-poliks/log/monitoring.json.log.2019-11-17 (deleted)
df -h .
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/centos-root 15G 7.4G 6.6G 53% /
I changed my input config to this (added close_removed: true at the end) and retried :
[filebeat@filebeat ~]$ cat /etc/filebeat/inputs.d/generic_json.yml
# json non root logs
- type: log
enabled: true
paths:
- /plx/jboss/*/*/log/monitoring.json.log*
encoding: windows-1252
fields:
logType: generic-json
log.format: json
fields_under_root: true
max_bytes: 90000000
#ignore_older: 0
#scan_frequency: 1s
close_removed: true
But the issue stays. Now for 15 minutes lsof shows that the deleted files are still opened by filebeat.
What is the issue? What can I do?
filebeat log shows:
2020-03-24T11:00:23.657+0100 INFO [monitoring] log/log.go:145 Non-zero metrics in the last 30s {"monitoring": {"metrics": {"beat":{"cpu":{"system":{"ticks":1820,"time":{"ms":13}},"total":{"ticks":6660,"time":{"ms":28},"value":6660},"user":{"ticks":4840,"time":{"ms":15}}},"handles":{"limit":{"hard":4096,"soft":1024},"open":9},"info":{"ephemeral_id":"b94ae394-89ad-46c7-9737-4b9ce9052668","uptime":{"ms":1650170}},"memstats":{"gc_next":84354352,"memory_alloc":43153184,"memory_total":303585624},"runtime":{"goroutines":64}},"filebeat":{"harvester":{"open_files":3,"running":3}},"libbeat":{"config":{"module":{"running":0},"reloads":1},"pipeline":{"clients":6,"events":{"active":4117}}},"registrar":{"states":{"current":6}},"system":{"load":{"1":0.57,"15":0.42,"5":0.58,"norm":{"1":0.19,"15":0.14,"5":0.1933}}}}}}
2020-03-24T11:00:26.663+0100 ERROR pipeline/output.go:100 Failed to connect to redis(tcp://elastic-redis.internal.de:16380): dial tcp 172.22.19.164:16380: i/o timeout
2020-03-24T11:00:26.663+0100 INFO pipeline/output.go:93 Attempting to reconnect to redis(tcp://elastic-redis.internal.de:16380) with 36 reconnect attempt(s)
2020-03-24T11:00:53.674+0100 INFO [monitoring] log/log.go:145 Non-zero metrics in the last 30s {"monitoring": {"metrics": {"beat":{"cpu":{"system":{"ticks":1850,"time":{"ms":33}},"total":{"ticks":6710,"time":{"ms":52},"value":6710},"user":{"ticks":4860,"time":{"ms":19}}},"handles":{"limit":{"hard":4096,"soft":1024},"open":9},"info":{"ephemeral_id":"b94ae394-89ad-46c7-9737-4b9ce9052668","uptime":{"ms":1680171}},"memstats":{"gc_next":84354352,"memory_alloc":44123328,"memory_total":304555768,"rss":-24576},"runtime":{"goroutines":64}},"filebeat":{"harvester":{"open_files":3,"running":3}},"libbeat":{"config":{"module":{"running":0},"reloads":1},"pipeline":{"clients":6,"events":{"active":4117,"retry":2048}}},"registrar":{"states":{"current":6}},"system":{"load":{"1":0.6,"15":0.43,"5":0.59,"norm":{"1":0.2,"15":0.1433,"5":0.1967}}}}}}
So what is the issue? What can I do that filebeat closes the file handle if file is removed?
PS: filebeat is running as unprivileged user, but when I login to filebeat and run lsof, it can see it's own opened files with the deleted marking.