@michaelbu thanks a lot for all the information!
The good news is that I managed to reproduce a similar behaviour:
- Filestream input is reading any file
- The file gets truncated (`truncate --size 0 /path/to/file`)
- Filestream detects the truncation and logs accordingly
- The file size is not reset in the cursor (registry)
- However, `stat` shows the correct size (zero, then it starts increasing again)
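
The file-side part of the steps above can be sketched roughly as follows. It is only an illustration under some assumptions: the scratch file created by `mktemp` stands in for a file that Filestream is harvesting, the variable names are made up, and `stat -c %s` and `truncate --size` assume GNU coreutils.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical scratch file; in a real test this would be a file
# that a Filestream input is currently reading.
file="$(mktemp)"

# Write a few lines so the file has a non-zero size.
for i in 1 2 3; do
  echo "line $i" >> "$file"
done
size_before="$(stat -c %s "$file")"

# Truncate in place: the inode is kept, only the size drops to zero.
truncate --size 0 "$file"
size_truncated="$(stat -c %s "$file")"

# Keep appending, as the writing application would.
echo "line after truncation" >> "$file"
size_after="$(stat -c %s "$file")"

echo "before=$size_before truncated=$size_truncated after=$size_after"
rm -f "$file"
```

Comparing these sizes with the offset stored in the registry for the same file is what shows the mismatch described above.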
The key differences I see between my environment and yours:
- I'm running Arch Linux
- I'm using ext4 instead of xfs
Without restarting Filebeat I did not perceive any data duplication.
This is a bug in Filestream: the cursor in the registry should be updated when the file is truncated. I'll investigate further next week.
One interesting thing I noticed in your Filebeat configuration is that the input that reads Filebeat's own logs drops the events mentioning file truncation. Here is the relevant snippet:
```yaml
processors:
  - drop_event:
      when:
        or:
          - equals:
              message: "File was truncated. Reading file from offset 0. Path=/var/log/graylog-sidecar/filebeat_stderr.log"
          - and:
              - equals:
                  log.level: "info"
              - not:
                  contains:
                    message: "File was truncated. Reading file from offset "
```
Did you write this Filebeat configuration or is it "auto-generated" by GrayLog?
My current theory is that GrayLog is somehow truncating the file.
However, I'm very puzzled that neither `stat` nor your script is showing this.
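
For comparison, here is a minimal sketch of a polling check that flags a truncation whenever the size shrinks between two samples. The function names `is_truncated` and `watch_size` are hypothetical, and GNU `stat` and `date` are assumed. One caveat that might be relevant here: a check like this cannot see a truncation if the file grows back past its previous size within a single polling interval.

```shell
#!/usr/bin/env bash
# Sketch: report when a file's size shrinks between two polls.
# A shrink is treated as a truncation. A truncation followed by enough
# writes to exceed the previous size within one interval is NOT detected.

# Succeeds (exit status 0) when the current size is smaller than the previous one.
is_truncated() {
  local prev="$1" cur="$2"
  [ "$cur" -lt "$prev" ]
}

# watch_size FILE INTERVAL_SECONDS COUNT: sample FILE's size COUNT times.
watch_size() {
  local file="$1" interval="$2" count="$3"
  local prev cur i
  prev="$(stat -c %s "$file")"
  for ((i = 0; i < count; i++)); do
    sleep "$interval"
    cur="$(stat -c %s "$file")"
    if is_truncated "$prev" "$cur"; then
      echo "$(date -Iseconds) truncated: $prev -> $cur bytes"
    fi
    prev="$cur"
  done
}
```

It could be run as, for example, `watch_size /var/log/graylog-sidecar/filebeat_stderr.log 1 600` to sample once per second for ten minutes.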
Could you share your monitoring script? I would like to run some tests using the same script.