The Filebeat offset value is a pointer to how much of the current log file has been read.
My understanding of what happens in the following scenarios is:
A log file is rotated (zipped) and archived -> offset 0
A log file is truncated -> offset 0
A log file is partially edited (someone removes a couple of lines or restores an older log file) -> offset ???
In the last scenario I want to find out what the Filebeat client will do: will it detect this as someone tampering with the file, or will it silently decrease the offset pointer to the maximum offset available in the file?
There are 2 cases for the offset if it is the same file:
offsetNew >= offsetOld: continue reading
offsetNew < offsetOld: file was truncated, start reading from the beginning
Filebeat has no knowledge of the file's content, so it cannot tell whether the file was truncated and then had some lines appended, or whether a line somewhere in the file was shortened.
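The two cases above can be sketched in a few lines of Python. This is only an illustration of the rule as described in this thread, not Filebeat's actual code. The function name `next_read_offset` is hypothetical, and I am assuming that offsetNew means the current file size and offsetOld means the offset stored in the registry.

```python
def next_read_offset(stored_offset: int, current_size: int) -> int:
    """Return the byte offset to resume reading from.

    Assumption: stored_offset is the offset saved for this file and
    current_size is the file's size in bytes right now.
    """
    if current_size >= stored_offset:
        # File is at least as long as what was already read: continue.
        return stored_offset
    # File is shorter than what was already read: treat as truncated.
    return 0


# A file that only grew is read from the stored offset.
print(next_read_offset(stored_offset=100, current_size=150))
# A file that shrank is read again from the beginning.
print(next_read_offset(stored_offset=100, current_size=40))
# A file truncated and then refilled past the old offset looks like growth,
# which is the limitation described above.
print(next_read_offset(stored_offset=100, current_size=120))
```

Because only sizes are compared, the third call cannot distinguish "truncated and refilled" from "appended to".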
By the way, what about the corner case where a log file is rotated and the new file contains exactly the same number of bytes as the previous file did the last time the Filebeat client read from it?
Am I correct in assuming that in this special case the Filebeat client would only report data beyond the previously reported byte offset, and that it does not analyze the actual content or last-modified date in any way?
Filebeat detects file rotation and knows it is the same file. So nothing happens.
There can be some edge cases where it would not read the content again, but I haven't heard of this happening so far. We are working on the idea of using fingerprints for files, which would detect such changes.
The problem with ModTime is: what should be done if the ModTime changed but the offset did not? Was the file perhaps only touched?
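To illustrate the fingerprint idea mentioned above, here is a minimal sketch that hashes the first bytes of a file and compares the result with a previously stored value. This is not how Filebeat implements it (the feature is only described here as an idea). The function names, the 1024-byte length and the choice of SHA-256 are all my assumptions.

```python
import hashlib


def fingerprint(path: str, length: int = 1024) -> str:
    """Hash the first `length` bytes of the file (hypothetical scheme)."""
    with open(path, "rb") as f:
        head = f.read(length)
    return hashlib.sha256(head).hexdigest()


def same_content_start(path: str, stored_fingerprint: str, length: int = 1024) -> bool:
    """True if the file still begins with the bytes seen when the
    fingerprint was stored, regardless of its size or ModTime."""
    return fingerprint(path, length) == stored_fingerprint
```

A check like this would catch a file that was replaced by different content of the same size, as long as the difference falls within the hashed bytes. It would not notice an edit further into the file, and a file that was only touched would still match.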
Good that it detects file rotation. The other things are really edge cases, as you say. It is good to know where the limitations are, so that we can implement policies about what is and is not allowed when humans work with log files.
Hi~
I have a small question about how the registry file is updated when the monitored log file has not changed for a long time.
Usually the registry file is updated when the monitored log file changes, but I noticed that the registry file is updated even when the monitored file has not changed. So I am a little confused about how the registry is updated.
In 5.0, the registry file is updated if any file is updated or renamed. Do you see different behaviour? The best approach is to have a look at the log files. If there are things happening, it means things are changing and states are potentially being updated.