Filebeat offset semantics

Hi

The Filebeat offset value is a pointer to how much of the current log file has been read.

My understanding of what happens in the following scenarios is:

  1. A log file is rotated (zipped) and archived -> offset 0
  2. A log file is truncated -> offset 0
  3. A log file is partially edited, someone removes a couple of lines or restores an older log file -> offset ???

In the last scenario I want to find out what the Filebeat client will do: will it detect this as someone tampering with the file, or will it silently decrease the offset pointer to the maximum offset available in the file?

/Magnus

There are 2 cases for the offset if it is the same file:

  • offsetNew >= offsetOld: continue reading
  • offsetNew < offsetOld: the file is treated as truncated, and reading starts again from the beginning

Filebeat has no knowledge of the content of the file, so it cannot tell whether the file was truncated and some lines were then added, or whether a line somewhere in the file was shortened.
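The two cases above can be sketched as a small function. This is only an illustration of the comparison described in this reply, not Filebeat's actual code: the name `next_read_offset` is hypothetical, and it assumes that "offsetNew" means the current size of the file and "offsetOld" means the offset stored from the last read.

```python
def next_read_offset(stored_offset: int, current_size: int) -> int:
    """Decide where to resume reading, using sizes only (no content check)."""
    if current_size >= stored_offset:
        # The file is at least as long as what was already read: continue.
        return stored_offset
    # The file is shorter than what was already read: treat it as truncated.
    return 0


# A file that grew from 100 to 150 bytes is read from byte 100 onwards.
resume_grown = next_read_offset(stored_offset=100, current_size=150)

# A file that shrank to 40 bytes is re-read from the start, whether it was
# truncated and rewritten or someone deleted a few lines by hand.
resume_shrunk = next_read_offset(stored_offset=100, current_size=40)

# An edit that leaves the file at 100 bytes or more is not noticed at all.
resume_edited = next_read_offset(stored_offset=100, current_size=100)
```

Because only sizes are compared, a manual edit either looks like a truncation (the whole file is sent again) or goes unnoticed, which matches the limitation described above.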

OK thanks. That makes it clear.

By the way, what about the corner case where a log file is rotated and the new file contains exactly the same number of bytes as the previous file did the last time Filebeat read from it?

Am I correct in assuming that in this special case Filebeat would only report data beyond the previously reported byte offset, and that it does not analyze the actual content or last-modified date in any way?

Filebeat detects file rotation and knows it is the same file. So nothing happens.
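A rough way to see how a "same file" check can survive a rename is to compare device and inode numbers. The sketch below assumes identity is tracked that way, as is typical on Linux; `file_identity` is a hypothetical helper, not a Filebeat function.

```python
import os
import tempfile


def file_identity(path):
    """Return (device, inode), which stays the same when a file is renamed."""
    st = os.stat(path)
    return (st.st_dev, st.st_ino)


with tempfile.TemporaryDirectory() as d:
    live = os.path.join(d, "app.log")
    rotated = os.path.join(d, "app.log.1")

    with open(live, "w") as f:
        f.write("line 1\n")
    before = file_identity(live)

    # Rotation by rename: the old content moves to app.log.1 ...
    os.rename(live, rotated)
    # ... and a new file of the same size is created under the old name.
    with open(live, "w") as f:
        f.write("line A\n")

    same_file_after_rename = file_identity(rotated) == before
    new_file_is_different = file_identity(live) != before
```

Under this assumption, the new file is recognised as a different file even though its name and size match the old one, so its stored offset would not be carried over.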

There can be some edge cases where it would not read the content again, but I haven't heard of this happening so far. We are working on the idea of using fingerprints for files, which would detect such changes.

The problem with the ModTime is: what should be done if the ModTime changed but the offset did not? Was the file perhaps only touched?
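To illustrate the fingerprint idea, here is a minimal sketch that hashes the first bytes of a file. It is an assumption about how such a fingerprint could work, not a description of what Filebeat does; the function name `fingerprint` and the 1024-byte length are made up for the example.

```python
import hashlib
import os
import tempfile


def fingerprint(path, length=1024):
    """Hash the first `length` bytes of the file (hypothetical scheme)."""
    with open(path, "rb") as f:
        head = f.read(length)
    return hashlib.sha256(head).hexdigest()


with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "app.log")

    with open(path, "w") as f:
        f.write("2016-11-01 first line\n")
    original = fingerprint(path)
    original_size = os.path.getsize(path)

    # Replace the content with different text of exactly the same length.
    with open(path, "w") as f:
        f.write("2016-11-02 other line\n")
    replaced = fingerprint(path)
    replaced_size = os.path.getsize(path)

    size_unchanged = original_size == replaced_size
    content_change_detected = original != replaced
```

A size-only check sees no difference between the two versions, while the fingerprint does. A fingerprint over the start of the file would still miss an edit that only touches later lines.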

Good that it detects file rotation. The other things are really edge cases, as you say. It is good to know where the limitations are, so that we can define policies on what is and is not allowed when people work with log files by hand.

Hi~
I have a small question about how the registry file is updated when the monitored log file has not changed for a long time.
Usually the registry file is updated when the monitored log file changes, but I noticed that the registry file is also updated when the monitored file has not changed. I am a little confused about how the registry is updated.

Thanks.

@thatscode It also changes if there are file rotations. And there are subtle differences between 1.x and 5.x. Which one are you referring to?

Thanks for your reply.
The version I am using is filebeat-5.0.0-1.x86_64. I installed Filebeat from the rpm package.

My Platform information is below:
CentOS release 6.5~6.8

kernel versions:
2.6.32-431.el6.x86_64
2.6.32-504.el6.x86_64
2.6.32-573.el6.x86_64
2.6.32-642.el6.x86_64

Thanks.

In 5.0 the registry file is updated if any file is updated or renamed. Do you see different behaviour? The best approach is to have a look at the Filebeat log files. If things are happening there, it means things are changing and states are potentially being updated.
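One way to see what actually changed is to copy the registry file at two points in time and compare the copies. The sketch below assumes the 5.x registry is a JSON array of per-file states, each with a `source` field; the sample entries and their values are invented for illustration, and `changed_states` is a hypothetical helper.

```python
import json


def changed_states(old_json, new_json):
    """Map each source path to the list of fields that differ between snapshots."""
    old = {s["source"]: s for s in json.loads(old_json)}
    new = {s["source"]: s for s in json.loads(new_json)}
    changes = {}
    for source, state in new.items():
        prev = old.get(source)
        if prev is None:
            changes[source] = ["<new entry>"]
            continue
        fields = sorted(
            k for k in state.keys() | prev.keys() if state.get(k) != prev.get(k)
        )
        if fields:
            changes[source] = fields
    return changes


# Invented sample snapshots; real registry entries may carry other fields.
snapshot_1 = '[{"source": "/var/log/app.log", "offset": 120, "timestamp": "2016-11-01T10:00:00Z"}]'
snapshot_2 = '[{"source": "/var/log/app.log", "offset": 120, "timestamp": "2016-11-01T10:05:00Z"}]'

diff = changed_states(snapshot_1, snapshot_2)
```

If the offset is unchanged and only other fields differ, the update was caused by state bookkeeping rather than by new log lines.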