Filestream input sends duplicate events on restart and during operation

@michaelbu thanks a lot for all the information!

The good news is that I managed to reproduce a similar behaviour:

  • Filestream input is reading any file
  • The file gets truncated (`truncate --size 0 /path/to/file`)
  • Filestream detects the truncation and logs accordingly
  • The file size is not reset in the cursor (registry)
  • However, stat shows the correct size (zero, then it starts increasing again)
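
The file-side part of these steps can be sketched in Python. This is only an illustration of what stat reports around a truncation; the function name `reproduce_truncation` and the file name are made up for the example, and it does not involve Filebeat or its registry at all.

```python
import os
import tempfile


def reproduce_truncation(path):
    """Append to a file, truncate it, append again.

    Returns the sizes reported by stat after each step.
    """
    sizes = []

    # Binary mode so the byte counts are the same on every platform.
    with open(path, "ab") as f:
        f.write(b"line 1\nline 2\n")
    sizes.append(os.stat(path).st_size)

    # Same effect as `truncate --size 0 /path/to/file`.
    os.truncate(path, 0)
    sizes.append(os.stat(path).st_size)

    # The writer keeps appending after the truncation.
    with open(path, "ab") as f:
        f.write(b"line 3\n")
    sizes.append(os.stat(path).st_size)

    return sizes


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(reproduce_truncation(os.path.join(tmp, "example.log")))
```

The size drops to zero right after the truncation and grows again with the next write, which matches what stat showed in my test.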

The key differences I see between my environment and yours:

  • I'm running Arch Linux
  • I'm using ext4 instead of xfs

Without restarting Filebeat I did not perceive any data duplication.

This is a bug in Filestream: the cursor in the registry should be updated when the file is truncated. I'll investigate further next week.
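
To make the expected behaviour concrete, here is a minimal sketch of offset tracking that resets the stored offset when the file shrinks. It is a simplified model, not Filebeat's actual code: the `Cursor` class and `next_read_range` function are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass


@dataclass
class Cursor:
    """Hypothetical stand-in for a registry entry (not Filebeat's real type)."""
    offset: int = 0


def next_read_range(cursor, file_size):
    """Return the (start, end) byte range to read and update the cursor.

    If the file is now smaller than the stored offset, treat that as a
    truncation and reset the offset to 0 before reading.
    """
    if file_size < cursor.offset:
        cursor.offset = 0  # truncation detected: the stored offset is stale

    start = cursor.offset
    cursor.offset = file_size  # everything up to file_size is now consumed
    return start, file_size


if __name__ == "__main__":
    cursor = Cursor()
    print(next_read_range(cursor, 100))  # initial read
    print(next_read_range(cursor, 0))    # file was truncated
    print(next_read_range(cursor, 30))   # file grew again
```

In this model the stored offset never stays above the real file size. If the persisted value were left at the old size instead, whatever reads it back later would be comparing against a stale number, which is the kind of inconsistency I observed in the registry.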

One interesting thing I noticed in your Filebeat configuration is that the input that reads Filebeat's own logs drops events mentioning file truncation. Here is the relevant snippet:

  processors:
  - drop_event:
      when:
        or:
          - equals:
              message: "File was truncated. Reading file from offset 0. Path=/var/log/graylog-sidecar/filebeat_stderr.log"
          - and:
            - equals:
                log.level: "info"
            - not:
                contains:
                  message: "File was truncated. Reading file from offset "
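
For anyone following along, the boolean logic of that condition can be written out in plain Python. This is only a reading of the `or`/`and`/`not` structure in the snippet, assuming events are flat dictionaries and that a missing field behaves like an empty string; it is not how Filebeat evaluates conditions internally, and `is_dropped` is a name made up for the example.

```python
TRUNCATED_STDERR = (
    "File was truncated. Reading file from offset 0. "
    "Path=/var/log/graylog-sidecar/filebeat_stderr.log"
)
TRUNCATED_PREFIX = "File was truncated. Reading file from offset "


def is_dropped(event):
    """Return True if the drop_event condition in the snippet matches."""
    message = event.get("message", "")
    level = event.get("log.level")

    # First branch of `or`: the exact truncation message for filebeat_stderr.log.
    exact_match = message == TRUNCATED_STDERR

    # Second branch: info-level events that do NOT mention truncation.
    info_without_truncation = level == "info" and TRUNCATED_PREFIX not in message

    return exact_match or info_without_truncation


if __name__ == "__main__":
    samples = [
        {"log.level": "info", "message": TRUNCATED_STDERR},
        {"log.level": "info", "message": TRUNCATED_PREFIX + "0. Path=/var/log/other.log"},
        {"log.level": "info", "message": "Harvester started"},
        {"log.level": "error", "message": "Something failed"},
    ]
    for event in samples:
        print(is_dropped(event), event["message"][:60])
```

Read this way, the truncation message for `filebeat_stderr.log` is always dropped, other info-level events are dropped unless they mention truncation, and non-info events are kept.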

Did you write this Filebeat configuration or is it "auto-generated" by GrayLog?

My current theory is that GrayLog is somehow truncating the file.

However, I'm very puzzled that neither stat nor your script is showing this.

Could you share your monitoring script? I would like to run some tests using the same script.
