Filebeat Version: beta1
Logstash Version: alpha2
In my test scenario Filebeat is processing around 1k files, totaling roughly 40GB a day. The problem is that on every restart of Filebeat some (but not the full) subset of files is sent again for no apparent reason.
My initial idea was that during Filebeat shutdown some events were left unacked by Logstash and Filebeat simply shut down too fast without waiting for them. But that seems impossible, since the number of events that get replayed is a few orders of magnitude bigger than my configured queue_size.
All files were updated within the last 24 hours. The only thing Filebeat logs about the files that get rescanned is this, on every restart:
2016-09-29T12:16:02Z INFO Harvester started for file: /path/to/file
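One way to narrow this down is to compare what the registry recorded at shutdown with the files actually on disk (inode and size). This is only a sketch, assuming the 5.x-style JSON registry layout; the field names and the sample entry below are illustrative, not taken from my actual registry:

```python
import json

# Illustrative sample of a Filebeat 5.x registry file: a JSON array of
# per-file states with the path, the byte offset read so far, and the
# inode/device pair Filebeat uses to identify the file.
sample_registry = '''
[
  {"source": "/path/to/file", "offset": 1024,
   "timestamp": "2016-09-29T12:00:00Z",
   "FileStateOS": {"inode": 811417, "device": 2049}}
]
'''

def summarize(registry_json):
    """Return (source, offset, inode) tuples to compare against the
    live files (e.g. `stat -c '%i %s' /path/to/file`). If the inode in
    the registry no longer matches the file on disk, Filebeat treats it
    as a new file and re-sends it from the beginning."""
    states = json.loads(registry_json)
    return [(s["source"], s["offset"], s["FileStateOS"]["inode"])
            for s in states]

for source, offset, inode in summarize(sample_registry):
    print(source, offset, inode)
```

If the offsets in the registry are smaller than the file sizes at shutdown, the registry was not flushed in time; if the inodes differ, log rotation or inode reuse is the more likely culprit.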
The prospector and other potentially related settings I have are:
clean_inactive: 48h
clean_removed: true
close_inactive: 30m
close_removed: true
ignore_older: 24h
tail_files: false
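For context, this is roughly how those options sit inside a prospector section of filebeat.yml. The `paths` entry here is a placeholder, not my real configuration; note also that `clean_inactive` (48h) is kept larger than `ignore_older` (24h) plus the scan frequency, as the settings would otherwise conflict:

```yaml
filebeat.prospectors:
  - input_type: log
    paths:
      - /path/to/*.log    # placeholder path
    clean_inactive: 48h   # drop registry state for files inactive this long
    clean_removed: true   # drop registry state for deleted files
    close_inactive: 30m   # close harvester after 30m without new lines
    close_removed: true   # close harvester when the file is removed
    ignore_older: 24h     # skip files not modified in the last 24h
    tail_files: false     # start reading new files from the beginning
```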