Hi there,
I've got a problem with a filebeat instance that eventually stops sending logs after some time.
Here is my setup:
- type: log
  enabled: true
  paths:
    - /opt/TalendRemoteEngine/TalendJobServersFiles/jobexecutions/logs/*/stdOutErr_*.log
  fields:
    log_type: talend
    talend_version: 7
  include_lines: ['^\[TALEND\]']
  exclude_lines: ['\][ ]*-[ ]*t','TalendJob','LoggingOutInterceptor','LoggingInInterceptor']
  harvester_limit: 2000
  close_inactive: 30m
  ignore_older: 12h
  clean_removed: true

output.logstash:
  # The Logstash hosts
  hosts: ["ls1.vrack:5047","ls2.vrack:5047"]
I parse Talend log files. A new file is generated for each job run, and jobs are scheduled every 15 minutes, so we generate a lot of files every day. Once a job has ended, its file is never written to again, which is why I added the close_inactive option.
This setup works fine for a while (10 to 15 days, for example), but after that it simply stops sending logs to Logstash.
For now, I have ~32000 files matching the pattern.
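(For reference, that count comes from something like the command below; it is only an illustration of counting files that match the configured glob, not part of the setup itself.)

find /opt/TalendRemoteEngine/TalendJobServersFiles/jobexecutions/logs -name 'stdOutErr_*.log' | wc -l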
Below are some logs from when I restart filebeat with filebeat -e -d "*":
2020-12-01T17:34:26.481+0100 INFO instance/beat.go:645 Home path: [/usr/share/filebeat] Config path: [/etc/filebeat] Data path: [/var/lib/filebeat] Logs path: [/var/log/filebeat]
2020-12-01T17:34:26.481+0100 DEBUG [beat] instance/beat.go:697 Beat metadata path: /var/lib/filebeat/meta.json
2020-12-01T17:34:26.482+0100 INFO instance/beat.go:653 Beat ID: 0aaa1256-9d80-49a6-883d-581e6774bbf2
2020-12-01T17:34:29.483+0100 DEBUG [processors] processors/processor.go:120 Generated new processors: add_host_metadata=[netinfo.enabled=[true], cache.ttl=[5m0s]], add_cloud_metadata={}
2020-12-01T17:34:29.533+0100 INFO [seccomp] seccomp/seccomp.go:124 Syscall filter successfully installed
2020-12-01T17:34:29.533+0100 INFO [beat] instance/beat.go:981 Beat info {"system_info": {"beat": {"path": {"config": "/etc/filebeat", "data": "/var/lib/filebeat", "home": "/usr/share/filebeat", "logs": "/var/log/filebeat"}, "type": "filebeat", "uuid": "0aaa1256-9d80-49a6-883d-581e6774bbf2"}}}
2020-12-01T17:34:29.533+0100 INFO [beat] instance/beat.go:990 Build info {"system_info": {"build": {"commit": "1428d58cf2ed945441fb2ed03961cafa9e4ad3eb", "libbeat": "7.10.0", "time": "2020-11-09T19:57:04.000Z", "version": "7.10.0"}}}
2020-12-01T17:34:29.534+0100 INFO [beat] instance/beat.go:993 Go runtime info {"system_info": {"go": {"os":"linux","arch":"amd64","max_procs":4,"version":"go1.14.7"}}}
2020-12-01T17:34:29.535+0100 INFO [beat] instance/beat.go:997 Host info {"system_info": {"host": {"architecture":"x86_64","boot_time":"2020-06-09T15:03:54+02:00","containerized":false,"name":"ppr-re1","ip":["127.0.0.1/8","::1/128","172.25.2.24/16","fe80::e452:48ff:fefe:313c/64"],"kernel_version":"4.15.18-29-pve","mac":["e6:52:48:fe:31:3c"],"os":{"family":"debian","platform":"debian","name":"Debian GNU/Linux","version":"10 (buster)","major":10,"minor":0,"patch":0,"codename":"buster"},"timezone":"CET","timezone_offset_sec":3600,"id":"523d7719f3dc4d188dfc6cd84db2e4c1"}}}
...
2020-12-01T17:34:29.535+0100 INFO instance/beat.go:299 Setup Beat: filebeat; Version: 7.10.0
2020-12-01T17:34:29.535+0100 DEBUG [beat] instance/beat.go:325 Initializing output plugins
2020-12-01T17:34:29.535+0100 DEBUG [publisher] pipeline/consumer.go:148 start pipeline event consumer
2020-12-01T17:34:29.535+0100 INFO [publisher] pipeline/module.go:113 Beat name: ppr-re1
2020-12-01T17:34:29.536+0100 WARN beater/filebeat.go:178 Filebeat is unable to load the Ingest Node pipelines for the configured modules because the Elasticsearch output is not configured/enabled. If you have already loaded the Ingest Node pipelines or are using Logstash pipelines, you can ignore this warning.
2020-12-01T17:34:29.536+0100 INFO instance/beat.go:455 filebeat start running.
2020-12-01T17:34:29.536+0100 DEBUG [test] registrar/migrate.go:304 isFile(/var/lib/filebeat/registry) -> false
2020-12-01T17:34:29.536+0100 DEBUG [test] registrar/migrate.go:304 isFile() -> false
2020-12-01T17:34:29.536+0100 DEBUG [test] registrar/migrate.go:297 isDir(/var/lib/filebeat/registry/filebeat) -> true
2020-12-01T17:34:29.536+0100 DEBUG [test] registrar/migrate.go:304 isFile(/var/lib/filebeat/registry/filebeat/meta.json) -> true
2020-12-01T17:34:29.544+0100 DEBUG [registrar] registrar/migrate.go:84 Registry type '1' found
2020-12-01T17:34:29.929+0100 INFO memlog/store.go:119 Loading data file of '/var/lib/filebeat/registry/filebeat' succeeded. Active transaction id=1118112382
2020-12-01T17:34:30.428+0100 INFO memlog/store.go:124 Finished loading transaction log file for '/var/lib/filebeat/registry/filebeat'. Active transaction id=1118139172
2020-12-01T17:34:30.429+0100 WARN beater/filebeat.go:381 Filebeat is unable to load the Ingest Node pipelines for the configured modules because the Elasticsearch output is not configured/enabled. If you have already loaded the Ingest Node pipelines or are using Logstash pipelines, you can ignore this warning.
2020-12-01T17:34:30.572+0100 INFO [registrar] registrar/registrar.go:109 States Loaded from registrar: 25066
2020-12-01T17:34:30.572+0100 INFO [crawler] beater/crawler.go:71 Loading Inputs: 2
2020-12-01T17:34:30.572+0100 DEBUG [registrar] registrar/registrar.go:140 Starting Registrar
2020-12-01T17:34:30.575+0100 DEBUG [input] log/config.go:207 recursive glob enabled
2020-12-01T17:34:30.603+0100 DEBUG [input] log/input.go:169 exclude_files: []. Number of states: 25066
2020-12-01T17:34:30.603+0100 DEBUG [input] file/states.go:68 New state added for /opt/TalendRemoteEngine/TalendJobServersFiles/jobexecutions/logs/20201130013704_Ze3vb/stdOutErr_20201130013704_Ze3vb.log
2020-12-01T17:34:30.603+0100 DEBUG [acker] beater/acker.go:59 stateful ack {"count": 1}
2020-12-01T17:34:30.603+0100 DEBUG [publisher] pipeline/client.go:231 Pipeline client receives callback 'onFilteredOut' for event: {Timestamp:0001-01-01 00:00:00 +0000 UTC Meta:null Fields:null Private:{Id:native::1613673-75 PrevId: Finished:true Fileinfo:<nil> Source:/opt/TalendRemoteEngine/TalendJobServersFiles/jobexecutions/logs/20201130013704_Ze3vb/stdOutErr_20201130013704_Ze3vb.log Offset:643 Timestamp:2020-11-30 03:32:44.059841752 +0100 CET TTL:-1ns Type:log Meta:map[] FileStateOS:1613673-75 IdentifierName:native} TimeSeries:false}
2020-12-01T17:34:30.603+0100 DEBUG [input] file/states.go:68 New state added for /opt/TalendRemoteEngine/TalendJobServersFiles/jobexecutions/logs/20201130070106_1DofO/stdOutErr_20201130070106_1DofO.log
2020-12-01T17:34:30.603+0100 DEBUG [acker] beater/acker.go:59 stateful ack {"count": 1}
2020-12-01T17:34:30.603+0100 DEBUG [publisher] pipeline/client.go:231 Pipeline client receives callback 'onFilteredOut' for event: {Timestamp:0001-01-01 00:00:00 +0000 UTC Meta:null Fields:null Private:{Id:native::1701764-75 PrevId: Finished:true Fileinfo:<nil> Source:/opt/TalendRemoteEngine/TalendJobServersFiles/jobexecutions/logs/20201130070106_1DofO/stdOutErr_20201130070106_1DofO.log Offset:840 Timestamp:2020-11-30 08:35:38.973145462 +0100 CET TTL:-1ns Type:log Meta:map[] FileStateOS:1701764-75 IdentifierName:native} TimeSeries:false}
2020-12-01T17:34:30.603+0100 DEBUG [input] file/states.go:68 New state added for /opt/TalendRemoteEngine/TalendJobServersFiles/jobexecutions/logs/20201130032011_41Ax3/stdOutErr_20201130032011_41Ax3.log
2020-12-01T17:34:30.603+0100 DEBUG [acker] beater/acker.go:59 stateful ack {"count": 1}
2020-12-01T17:34:30.603+0100 DEBUG [registrar] registrar/registrar.go:264 Processing 1 events
2020-12-01T17:34:30.603+0100 DEBUG [registrar] registrar/registrar.go:231 Registrar state updates processed. Count: 1
2020-12-01T17:34:30.604+0100 DEBUG [registrar] registrar/registrar.go:254 Registrar states cleaned up. Before: 25066, After: 25066, Pending: 0
2020-12-01T17:34:30.606+0100 DEBUG [registrar] registrar/registrar.go:201 Registry file updated. 25066 active states.
2020-12-01T17:34:55.130+0100 DEBUG [registrar] registrar/registrar.go:264 Processing 1 events
2020-12-01T17:34:55.130+0100 DEBUG [registrar] registrar/registrar.go:231 Registrar state updates processed. Count: 1
2020-12-01T17:34:55.131+0100 DEBUG [publisher] pipeline/client.go:231 Pipeline client receives callback 'onFilteredOut' for event: {Timestamp:0001-01-01 00:00:00 +0000 UTC Meta:null Fields:null Private:{Id:native::1682980-75 PrevId: Finished:true Fileinfo:<nil> Source:/opt/TalendRemoteEngine/TalendJobServersFiles/jobexecutions/logs/20201130032011_41Ax3/stdOutErr_20201130032011_41Ax3.log Offset:858 Timestamp:2020-11-30 06:09:26.980717504 +0100 CET TTL:-1ns Type:log Meta:map[] FileStateOS:1682980-75 IdentifierName:native} TimeSeries:false}
2020-12-01T17:34:55.131+0100 DEBUG [input] file/states.go:68 New state added for /opt/TalendRemoteEngine/TalendJobServersFiles/jobexecutions/logs/20201129172904_AkNEp/stdOutErr_20201129172904_AkNEp.log
2020-12-01T17:34:55.131+0100 DEBUG [acker] beater/acker.go:59 stateful ack {"count": 1}
2020-12-01T17:34:55.131+0100 DEBUG [registrar] registrar/registrar.go:254 Registrar states cleaned up. Before: 25066, After: 25066, Pending: 0
2020-12-01T17:34:55.133+0100 DEBUG [registrar] registrar/registrar.go:201 Registry file updated. 25066 active states.
2020-12-01T17:35:23.867+0100 DEBUG [registrar] registrar/registrar.go:264 Processing 1 events
2020-12-01T17:35:23.867+0100 DEBUG [registrar] registrar/registrar.go:231 Registrar state updates processed. Count: 1
2020-12-01T17:35:23.867+0100 DEBUG [publisher] pipeline/client.go:231 Pipeline client receives callback 'onFilteredOut' for event: {Timestamp:0001-01-01 00:00:00 +0000 UTC Meta:null Fields:null Private:{Id:native::1475867-75 PrevId: Finished:true Fileinfo:<nil> Source:/opt/TalendRemoteEngine/TalendJobServersFiles/jobexecutions/logs/20201129172904_AkNEp/stdOutErr_20201129172904_AkNEp.log Offset:825 Timestamp:2020-11-29 18:28:39.912167866 +0100 CET TTL:-1ns Type:log Meta:map[] FileStateOS:1475867-75 IdentifierName:native} TimeSeries:false}
2020-12-01T17:35:23.867+0100 DEBUG [input] file/states.go:68 New state added for /opt/TalendRemoteEngine/TalendJobServersFiles/jobexecutions/logs/20201128221803_H641V/stdOutErr_20201128221803_H641V.log
2020-12-01T17:35:23.867+0100 DEBUG [acker] beater/acker.go:59 stateful ack {"count": 1}
2020-12-01T17:35:23.867+0100 DEBUG [registrar] registrar/registrar.go:254 Registrar states cleaned up. Before: 25066, After: 25066, Pending: 0
2020-12-01T17:35:23.868+0100 DEBUG [registrar] registrar/registrar.go:201 Registry file updated. 25066 active states.
When activating metrics, I got this kind of result:
Dec 01 18:13:13 ppr-re1 filebeat[22912]: 2020-12-01T18:13:13.398+0100 INFO [monitoring] log/log.go:145 Non-zero metrics in the last 30s {"monitoring": {"metrics": {"beat":{"cgroup":{"memory":{"mem":{"usage":{"bytes":303104}}}},"cpu":{"system":{"ticks":1190,"time":{"ms":553}},"total":{"ticks":4750,"time":{"ms":1653},"value":4750},"user":{"ticks":3560,"time":{"ms":1100}}},"handles":{"limit":{"hard":524288,"soft":1024},"open":11},"info":{"ephemeral_id":"76dde668-e7f7-4fdc-9657-2cca6c6083a5","uptime":{"ms":63174}},"memstats":{"gc_next":119859488,"memory_alloc":111096304,"memory_total":422779136,"rss":245760},"runtime":{"goroutines":20}},"filebeat":{"events":{"added":1,"done":1},"harvester":{"open_files":0,"running":0}},"libbeat":{"config":{"module":{"running":0}},"pipeline":{"clients":1,"events":{"active":1,"filtered":1,"total":1}}},"registrar":{"states":{"current":25066,"update":1},"writes":{"success":1,"total":1}},"system":{"load":{"1":4.82,"15":4.86,"5":5.15,"norm":{"1":1.205,"15":1.215,"5":1.2875}}}}}}
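For reference, metrics logging was enabled with something like the following in filebeat.yml (a minimal sketch; the 30s period is an assumption based on the reporting interval seen above):

logging.metrics.enabled: true
logging.metrics.period: 30s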
This seems weird, since it shows "harvester":{"open_files":0,"running":0}.
I can't find what is blocking filebeat.
If I delete the registry and let filebeat recreate it, it harvests all the files again and I get my logs back.
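(The reset is essentially the following, using the data path from the startup log; running filebeat under systemd is an assumption based on the journald output above.)

systemctl stop filebeat
rm -rf /var/lib/filebeat/registry
systemctl start filebeat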
Any help would be appreciated.