Filestream input sends duplicate events on restart and during operation

This is really puzzling me, because Filebeat does not truncate its log files. I reviewed our log rotation code today, and when rotation is needed, this is how it works in Beats:

  • The current file is closed
  • The new name is generated (the default format is: filebeat-20230612-16.ndjson, where the 16 is an ever-increasing counter)
  • If needed, old files are deleted
  • The new file is created/opened.

This means Filebeat will not truncate its own log files.
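
The four steps above can be sketched in Go roughly as follows. This is a minimal illustration, not the actual Beats code: the rotator type, its fields, the keep limit and the demo helper are all hypothetical names made up for this sketch. The point it shows is that the new file is opened with O_CREATE and O_EXCL and never with O_TRUNC, so an existing log file is never truncated by rotation.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// rotator is a hypothetical type for this sketch, not the Beats implementation.
type rotator struct {
	dir     string   // directory holding the log files
	prefix  string   // e.g. "filebeat"
	date    string   // e.g. "20230612"
	counter int      // ever-increasing suffix
	keep    int      // maximum number of files kept on disk (assumed setting)
	files   []string // paths created so far, oldest first
	current *os.File
}

// fileName builds names such as filebeat-20230612-16.ndjson.
func (r *rotator) fileName() string {
	return fmt.Sprintf("%s-%s-%d.ndjson", r.prefix, r.date, r.counter)
}

// rotate follows the four steps: close, name, delete old, create new.
func (r *rotator) rotate() error {
	// 1. Close the current file.
	if r.current != nil {
		if err := r.current.Close(); err != nil {
			return err
		}
		r.current = nil
	}

	// 2. Generate the new name.
	r.counter++
	path := filepath.Join(r.dir, r.fileName())

	// 3. If needed, delete the oldest files.
	for r.keep > 0 && len(r.files) >= r.keep {
		if err := os.Remove(r.files[0]); err != nil {
			return err
		}
		r.files = r.files[1:]
	}

	// 4. Create the new file. O_EXCL fails if the file already exists,
	// and O_TRUNC is never passed, so nothing is truncated here.
	f, err := os.OpenFile(path, os.O_CREATE|os.O_EXCL|os.O_WRONLY|os.O_APPEND, 0o600)
	if err != nil {
		return err
	}
	r.current = f
	r.files = append(r.files, path)
	return nil
}

// demo rotates three times in a temporary directory and returns the
// base names of the files that survive.
func demo() ([]string, error) {
	dir, err := os.MkdirTemp("", "rotation-sketch")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(dir)

	r := &rotator{dir: dir, prefix: "filebeat", date: "20230612", counter: 15, keep: 2}
	for i := 0; i < 3; i++ {
		if err := r.rotate(); err != nil {
			return nil, err
		}
	}
	if err := r.current.Close(); err != nil {
		return nil, err
	}

	names := make([]string, 0, len(r.files))
	for _, p := range r.files {
		names = append(names, filepath.Base(p))
	}
	return names, nil
}

func main() {
	names, err := demo()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(names)
}
```

With the assumed keep limit of 2 and a counter starting at 15, three rotations create files 16, 17 and 18, and file 16 is deleted before 18 is created.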

I still need to investigate more, but today I managed to reproduce this ever-increasing cursor in the registry in a very peculiar situation:

  • Filebeat harvesting its own logs
  • I truncate the logs using truncate --size=0 /path/to/file
  • Most of the time, the cursor in the registry keeps increasing, however:
    • stat path/to/file shows the new size as zero for a brief period, then it jumps back to the value that was in the registry
    • wc -l confirms that the file has been truncated and shows the correct line count: 0 right after truncation, then slowly increasing.
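
One way to get this exact pattern on Linux is a writer that keeps its file descriptor open without O_APPEND: after an external truncate, the next write still lands at the writer's old offset, so the size jumps back while the start of the file becomes a hole with no newlines. Whether this is what happens with Filebeat's own logger is not confirmed here; the sketch below only demonstrates the POSIX behaviour, and the reproduce helper, the result type and the file name are hypothetical.

```go
package main

import (
	"bytes"
	"fmt"
	"os"
	"path/filepath"
)

// result holds what stat and wc -l would report (hypothetical type).
type result struct {
	sizeAfterTruncate int64 // size right after truncating to zero
	sizeAfterWrite    int64 // size after the writer's next write
	lines             int   // newline count, like wc -l
}

// reproduce writes three lines, truncates the file from "outside",
// then writes one more line through the same descriptor.
func reproduce() (result, error) {
	var res result

	dir, err := os.MkdirTemp("", "truncate-sketch")
	if err != nil {
		return res, err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "app.ndjson")

	// Assumption for this sketch: the writer does NOT use O_APPEND.
	w, err := os.OpenFile(path, os.O_CREATE|os.O_WRONLY, 0o600)
	if err != nil {
		return res, err
	}
	defer w.Close()

	// Three lines of 7 bytes each: the writer's offset is now 21.
	for i := 1; i <= 3; i++ {
		if _, err := fmt.Fprintf(w, "line %d\n", i); err != nil {
			return res, err
		}
	}

	// Equivalent of: truncate --size=0 /path/to/file
	if err := os.Truncate(path, 0); err != nil {
		return res, err
	}
	fi, err := os.Stat(path)
	if err != nil {
		return res, err
	}
	res.sizeAfterTruncate = fi.Size()

	// The descriptor's offset was not reset, so this write lands at
	// offset 21 and the file size jumps back past the old value.
	if _, err := fmt.Fprintf(w, "line %d\n", 4); err != nil {
		return res, err
	}
	fi, err = os.Stat(path)
	if err != nil {
		return res, err
	}
	res.sizeAfterWrite = fi.Size()

	// Only the newly written line contains a newline; the hole is NUL bytes.
	data, err := os.ReadFile(path)
	if err != nil {
		return res, err
	}
	res.lines = bytes.Count(data, []byte("\n"))
	return res, nil
}

func main() {
	res, err := reproduce()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("size after truncate: %d\n", res.sizeAfterTruncate)
	fmt.Printf("size after next write: %d\n", res.sizeAfterWrite)
	fmt.Printf("lines: %d\n", res.lines)
}
```

A reader that detects truncation by checking whether the file size dropped below its stored offset would only see the drop during the short window before the next write. If the writer used O_APPEND instead, the write after truncation would land at offset zero and the size would stay small.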

I'm still investigating.