Hi,
Filebeat Version: beta1
Logstash Version: alpha2
System: CentOS6
In my test scenario Filebeat is processing around 1k files, roughly 40GB a day in total. The problem is that upon every restart of Filebeat, some (but not the full) subset of files is sent again for no apparent reason.
My initial idea was that during Filebeat shutdown some events remained unacknowledged by Logstash and Filebeat simply shut down too fast without waiting for them. But that seems impossible, since the number of events that get replayed is a few orders of magnitude larger than my queue_size setting.
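One way I tried to narrow this down is to snapshot the registry file before and after a restart and compare the recorded offsets: if an entry's offset drops back to 0, or the entry disappears entirely, the resend would point at registry state rather than unacknowledged events. A rough sketch, assuming the registry is the JSON array of `{"source": ..., "offset": ...}` entries that Filebeat 5.x writes (the path and exact field names may differ per install, and the sample data below is made up):

```python
import json

def diff_registry(before_json, after_json):
    """Compare two registry snapshots (JSON strings) taken before and
    after a restart. Returns files whose recorded offset went backwards
    or whose state vanished, which would explain a resend."""
    before = {e["source"]: e["offset"] for e in json.loads(before_json)}
    after = {e["source"]: e["offset"] for e in json.loads(after_json)}
    suspicious = {}
    for source, old_offset in before.items():
        new_offset = after.get(source)
        if new_offset is None:
            # State was dropped entirely (e.g. by clean_* options).
            suspicious[source] = (old_offset, "state removed")
        elif new_offset < old_offset:
            # Offset moved backwards: file would be re-read from there.
            suspicious[source] = (old_offset, new_offset)
    return suspicious

# Illustrative snapshots, not my real registry contents:
before = '[{"source": "/var/log/a.log", "offset": 1048576},' \
         ' {"source": "/var/log/b.log", "offset": 2048}]'
after = '[{"source": "/var/log/a.log", "offset": 0}]'
print(diff_registry(before, after))
# → {'/var/log/a.log': (1048576, 0), '/var/log/b.log': (2048, 'state removed')}
```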
All files were updated in the last <24hrs. The only thing Filebeat logs about the files that get rescanned is this, upon every restart:
2016-09-29T12:16:02Z INFO Harvester started for file: /path/to/file
Prospector and other potentially related settings I have are:
clean_inactive: 48h
clean_removed: true
close_inactive: 30m
close_removed: true
ignore_older: 24h
tail_files: false
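For context, those options sit in my prospector configuration roughly like this (the path below is a placeholder, not my real config):

```yaml
filebeat.prospectors:
  - input_type: log
    paths:
      - /path/to/logs/*.log   # placeholder
    ignore_older: 24h
    close_inactive: 30m
    close_removed: true
    clean_inactive: 48h
    clean_removed: true
    tail_files: false
```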