I created another topic about a different issue with the same use case as the one described here: https://discuss.elastic.co/t/very-long-to-load-prospectors-and-start-harvesting/86625/8
I have since run several tests, and I would prefer to consolidate everything in this topic.
On a test server, I indexed all the "old" files using the -once option and the following config (the ignore_older property is set so that only files modified after a given date are indexed):
filebeat.prospectors:
- input_type: log
  paths:
    - U:\foo\bar\*_TO_TARGET\*.csv
    - U:\foo\bar\directory1\export\*.CSV
    - U:\foo\bar\directory2\export\*.csv
  encoding: utf-8
  document_type: type_A
  scan_frequency: 30s
  ignore_older: 4080h
  close_eof: true
- input_type: log
  paths:
    - U:\foo\bar\directory3\export\*.xml
  encoding: utf-8
  document_type: type_A
  scan_frequency: 30s
  ignore_older: 4080h
  close_eof: true
  exclude_lines: [ '^<\?xml', '^<Document' ]
  multiline:
    pattern: '^[[:space:]]*<Node'
    negate: true
    match: after
    max_lines: 5000
- input_type: log
  paths:
    - U:\foo\bar\TARGET_TO_*\*.csv
  encoding: utf-8
  document_type: type_B
  scan_frequency: 30s
  ignore_older: 4080h
  close_eof: true
- input_type: log
  paths:
    - T:\bar\foo\*.qid
  encoding: utf-8
  document_type: type_C
  scan_frequency: 30s
  ignore_older: 4080h
  close_eof: true
  multiline:
    pattern: ^EXENAME
    negate: true
    match: after
output.logstash:
  hosts: ["localhost:5044"]
All the source directories are on Windows mounted drives that point to shared directories on two distinct servers.
After indexing the "old" files, I started filebeat without the -once option and with almost the same config as before, except that in each prospector this:
  ignore_older: 4080h
was replaced by this:
  ignore_older: 10m
  clean_inactive: 15m
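To make the change concrete, here is a sketch of what the first prospector looks like after the replacement. It is reconstructed from the settings listed above rather than copied from the running config, and the other prospectors are changed in the same way. As far as I understand the Filebeat documentation, clean_inactive must be greater than ignore_older + scan_frequency, which holds here (15m > 10m + 30s).

```yaml
# Sketch of the first prospector after the change (reconstructed, not copied
# from the live config). Only ignore_older and clean_inactive differ.
filebeat.prospectors:
- input_type: log
  paths:
    - U:\foo\bar\*_TO_TARGET\*.csv
    - U:\foo\bar\directory1\export\*.CSV
    - U:\foo\bar\directory2\export\*.csv
  encoding: utf-8
  document_type: type_A
  scan_frequency: 30s     # directories are rescanned every 30 seconds
  ignore_older: 10m       # was 4080h during the one-off import
  clean_inactive: 15m     # registry entries are dropped after 15 minutes of inactivity
  close_eof: true
```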
The registry file size is now 40MB.
I no longer encounter the "very long prospector loading" issue described in the other topic when restarting filebeat.
However, there is sometimes a fairly long delay between the time a new file is copied to a watched directory and the time it is received by logstash (which is installed on the same server as filebeat for testing purposes). The delay can be up to 15 minutes.
How can I reduce this delay?