Is filebeat relying on the modification time of the file (Windows) or on its internal mod time when computing ignore_older?
I really think this could be the point: as you say, it is not reliable on Windows, and in my case I found many files with a modtime older than their real content.
For ignore_older, it relies on the file mod_time. For all the close_* config options it relies on the internal state. The issue with the mod time on Windows is the reason we switched from comparing the mod time to comparing the size when reading a file. So in case a file gets additional content but modtime is not updated, the new content will still be harvested by filebeat.
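To illustrate why comparing sizes is more robust than comparing mod times, here is a minimal Python sketch. It is not filebeat's actual code (filebeat is written in Go); the function names are hypothetical, and the stale mod time is simulated with `os.utime` rather than produced by Windows itself.

```python
import os
import tempfile


def changed_by_mtime(path, last_seen_mtime_ns):
    """Detect new content by mod time; misses writes if the mod time is stale."""
    return os.stat(path).st_mtime_ns > last_seen_mtime_ns


def changed_by_size(path, last_offset):
    """Detect new content by comparing the file size with the offset already read."""
    return os.stat(path).st_size > last_offset


def simulate_stale_mtime():
    """Append to a file, then restore the old mod time to mimic the Windows behaviour."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "app.log")
        with open(path, "w") as f:
            f.write("first line\n")
        st = os.stat(path)
        last_mtime_ns, last_offset = st.st_mtime_ns, st.st_size

        with open(path, "a") as f:
            f.write("second line\n")
        # Assumption: resetting the mod time stands in for metadata not being flushed.
        os.utime(path, ns=(st.st_atime_ns, last_mtime_ns))

        return (changed_by_mtime(path, last_mtime_ns),
                changed_by_size(path, last_offset))
```

In this simulation the mod-time check reports no change while the size check still sees the appended line, which is the situation described above.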
I was hoping we would not face this issue with ignore_older, as I assumed the updated metadata is only flushed with a delay, but it seems it is only flushed when the file is closed. There is not really anything we can do for logs written by other applications, but I think we should at least have a solution for our own log file to make sure its metadata is updated frequently (like every 5 minutes). I need to dig deeper into what kind of write options exist for Windows. @andrewkroh Any ideas?
That's interesting, but it seems kind of a bug to me.
Why use two metrics? What about using the same metric for close and ignore_older?
It seems to me that having two different metrics will always be a problem: ignore_older must always be greater than close_*, otherwise an error happens, but they may not be comparable!
The filebeat file was created on the 21st, as soon as filebeat.1 was closed.
As you can see from the log, it is growing (last line at 16.30), but the modified date does not change.
The difference is on purpose. All the close_* options are checked while the file is open and need the state information for comparison. The ignore_older check happens before a file handle is opened and does not rely on the internal state. ignore_older can also be applied if there is no existing state, which means no state has to be stored for old files.
Normally ignore_older should be an order of magnitude larger than close_inactive, which is the case for you, so this error would never show up. Unfortunately there is this issue with Windows.
I see your point that the error message can be confusing. The problem with the above suggestion is that people will configure ignore_older to be 24h and will start complaining if the file doesn't get ignored after 24h. Now there is at least an error message that tells them that there is something "wrong".
The reason we log an error is that I think this is also kind of a "bug" in the Windows file system. Modtime should be when a file was modified, not when it was closed after being modified. Obviously there will be lots of people disagreeing here.
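The two kinds of checks can be sketched roughly as follows. This is an illustrative Python model under stated assumptions, not filebeat's implementation: the function names and the simulated timestamps are hypothetical, and only the option names ignore_older and close_inactive come from the discussion.

```python
import os
import tempfile
import time


def ignored_before_open(path, ignore_older_s, now):
    """Pre-open check: uses only the file system mod time, so no stored state is needed."""
    return now - os.stat(path).st_mtime > ignore_older_s


def should_close(last_read_time, close_inactive_s, now):
    """Open-file check: uses the reader's own record of when it last read a line."""
    return now - last_read_time > close_inactive_s


def demo():
    """File with a stale mod time: close_inactive of 5 minutes, ignore_older of 24h."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "app.log")
        with open(path, "w") as f:
            f.write("line\n")
        now = time.time()
        # Assumption: the mod time was never refreshed and is 48h in the past.
        stale = now - 48 * 3600
        os.utime(path, (stale, stale))
        # Assumption: the reader last read a line 10 minutes ago.
        last_read = now - 10 * 60
        closed = should_close(last_read, 5 * 60, now)
        ignored = ignored_before_open(path, 24 * 3600, now)
        return closed, ignored
```

Here the file is closed because of the internal state and then falls under ignore_older because of the stale mod time, even though it was being read only minutes earlier. That mismatch between the two clocks is the case this thread is about.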
I'm thinking of making it an Info message instead of an Err message. This makes it more obvious that in most cases no direct actions are needed.
I don't think people will complain: you are just working around an OS bug (maybe only on Windows) with a correct solution. This would give the expected file close and ignore behaviour after all.
If a file contains logs from right now, I don't want it to be ignored at all, or I will lose lines. The important thing is to state that ignore_older relies on the last written line, not on the OS time.
Nobody will expect files that are being written right now to be ignored. I would complain if you tried to ignore them, because I might be losing lines.
Just assume I have a close policy of 5 minutes and an ignore_older of 24h on a bugged file with no moddate update: if the logger stops writing, the file may be ignored after 5 minutes of waiting, thus losing the next lines.
However, I think it is good practice for any software to be as agnostic and independent as possible from the external environment, especially when that environment is known to be full of errors (without naming names).
As a software developer, I don't want an error in the external environment to create an error in my own software. I feel responsible for what my software does.
Sorry for the late answer. It would be great if we could figure out when the last line was written. Unfortunately, the internal state we have is when the last line was read by us.
A first step I took for a future release is to "downgrade" the message from Error to Info, as otherwise it raises the expectation that something is really wrong. But this is not the case, as in the end filebeat is doing "the right thing" and finishes reading the file.
I will definitely keep an eye on this one to see if we can find some improvements, so we can rely on the timestamp from Windows and are not required to do some additional Windows magic (we already have enough of it in filebeat).