I'm running Filebeat over 200,000 files on a server (24 logical cores, 48 GB RAM) and want to collect logs only for the previous 24 hours.
My filebeat.yml:
filebeat.inputs:
- type: log
  enabled: true
  paths:
    - '/data/logs/A[0-9]*/*/scheduler/log.txt'
    - '/data/logs/A[0-9]*/*/tables/*.txt'
    - '/data/logs/A[0-9]*/*/server.txt'
    - '/data/logs/A[0-9]*/*/errors_fatal.txt'
    - '/data/logs/A[0-9]*/*/communicator/log.txt'
  encoding: utf-8
  multiline.pattern: '^\d{8} \d{2}:\d{2}:\d{2}\.\d{3}'
  multiline.negate: true
  multiline.match: after
  close_renamed: true
  harvester_limit: 1024
  ignore_older: 24h
output.logstash:
  hosts: ["log.ff.local:5044"]
The main problem is that after I restart Filebeat, it doesn't detect new files because it is busy with the old ones. I see the following in the debug log:
2018-12-14T12:56:49.542+0700 DEBUG [registrar] registrar/registrar.go:393 Registry file updated. 182331 states written.
2018-12-14T12:56:49.542+0700 DEBUG [registrar] registrar/registrar.go:345 Processing 1 events
2018-12-14T12:56:49.542+0700 DEBUG [registrar] registrar/registrar.go:315 Registrar state updates processed. Count: 1
2018-12-14T12:56:49.542+0700 DEBUG [acker] beater/acker.go:64 stateful ack {"count": 1}
2018-12-14T12:56:49.543+0700 DEBUG [input] file/states.go:68 New state added for /data/logs/A19/21001166 20181105 1604/tables/11_2949370.txt
2018-12-14T12:56:49.546+0700 DEBUG [registrar] registrar/registrar.go:335 Registrar states cleaned up. Before: 182331, After: 182331, Pending: 0
2018-12-14T12:56:49.572+0700 DEBUG [registrar] registrar/registrar.go:400 Write registry file: /var/lib/filebeat/registry
2018-12-14T12:56:50.257+0700 DEBUG [registrar] registrar/registrar.go:393 Registry file updated. 182331 states written.
2018-12-14T12:56:50.257+0700 DEBUG [registrar] registrar/registrar.go:345 Processing 1 events
2018-12-14T12:56:50.257+0700 DEBUG [registrar] registrar/registrar.go:315 Registrar state updates processed. Count: 1
2018-12-14T12:56:50.257+0700 DEBUG [acker] beater/acker.go:64 stateful ack {"count": 1}
2018-12-14T12:56:50.257+0700 DEBUG [input] file/states.go:68 New state added for /data/logs/A31/21001432 20181207 1827/tables/41_8323722.txt
2018-12-14T12:56:50.261+0700 DEBUG [registrar] registrar/registrar.go:335 Registrar states cleaned up. Before: 182331, After: 182331, Pending: 0
2018-12-14T12:56:50.280+0700 DEBUG [registrar] registrar/registrar.go:400 Write registry file: /var/lib/filebeat/registry
2018-12-14T12:56:51.008+0700 DEBUG [registrar] registrar/registrar.go:393 Registry file updated. 182331 states written.
2018-12-14T12:56:51.009+0700 DEBUG [registrar] registrar/registrar.go:345 Processing 1 events
2018-12-14T12:56:51.009+0700 DEBUG [registrar] registrar/registrar.go:315 Registrar state updates processed. Count: 1
2018-12-14T12:56:51.009+0700 DEBUG [acker] beater/acker.go:64 stateful ack {"count": 1}
2018-12-14T12:56:51.009+0700 DEBUG [input] file/states.go:68 New state added for /data/logs/A1019/21002965 20181117 0258/tables/4985_2556556.txt
2018-12-14T12:56:51.011+0700 DEBUG [registrar] registrar/registrar.go:335 Registrar states cleaned up. Before: 182331, After: 182331, Pending: 0
2018-12-14T12:56:51.029+0700 DEBUG [registrar] registrar/registrar.go:400 Write registry file: /var/lib/filebeat/registry
2018-12-14T12:56:51.725+0700 DEBUG [registrar] registrar/registrar.go:393 Registry file updated. 182331 states written.
So, according to the debug log, the registrar handles fewer than 2 files per second (one registry write roughly every 0.7 s), and at that rate it will take approximately 1-2 days to check all the files. That is far too long, even compared with deleting the registry file and restarting Filebeat. Is it possible to speed up the Filebeat registrar?
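As a rough check of that estimate, here is the arithmetic as a small Python sketch. It uses the four "Registry file updated" timestamps from the log above and assumes (my assumption, based only on this excerpt) that every file state costs one full registry write at the same pace:

```python
from datetime import datetime

# Timestamps of the "Registry file updated" lines in the debug log above.
stamps = ["12:56:49.542", "12:56:50.257", "12:56:51.008", "12:56:51.725"]
states = 182331  # number of states reported by the registrar

times = [datetime.strptime(s, "%H:%M:%S.%f") for s in stamps]

# Average time between consecutive registry writes, in seconds.
interval = (times[-1] - times[0]).total_seconds() / (len(times) - 1)

# Assumption: one registry write per file state, at a constant pace.
rate = 1 / interval
total_hours = states * interval / 3600

print(f"{interval:.3f} s per write, {rate:.2f} states/s, "
      f"about {total_hours:.0f} hours for {states} states")
```

Under that assumption the full pass comes out at roughly a day and a half, which matches the 1-2 day figure.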
Also, is it expected that the "182331 states written" count never changes, and that I see the same number for different files?