Need a logical end of file definition for filebeat

I have a problem with a log that marks its end of file with a string. This logical end of file can be in the middle of the file, so filebeat needs to stop reading at that line instead of reading all the way down to the physical end of file.

For example:

2016-02-04 11:42:06.279-05:00 i360imsa UUMSyncWebService Verbose 0 Successfully
2016-02-04 11:42:06.279-05:00 i360imsa UUMSyncWebService Verbose 0 Process Sync Message - 2016-02-04 11:42:06.328-05:00 i360imsa UUMSyncWebService Verbose 0 GetAllPBXIDDuplicates was called
2016-02-04 11:42:06.447-05:00 i360imsa UUMSyncWebService Verbose 0 Successfully returnning
2016-02-04 11:42:06.447-05:00 i360imsa UUMSyncWebService Informational 0 Process Sync Messages
Logger Information: EOF
2016-02-01 11:42:06.447-05:00 i360imsa UUMSyncWebService Informational 0 Process Sync Messages

Logger Information: EOF marks the end of the log. New log lines are inserted before this line. When the log file is full, the logger overwrites the log from the beginning.

Could you please help me work around this case? Currently, filebeat just reads all the lines and cannot know that new lines are inserted in the middle of the log file.

Interesting. Is there some content after Logger Information: EOF in the file? Is this line overwritten when new lines are added and then written again at the new end of the log? Or are there other Logger Information messages?

This use-case is not supported by beats yet, but feel free to add an enhancement request with some more details to github.com/elastic/beats.

Can you share some more information on what kind of logging system this is?

Yes, there are a bunch of lines after Logger Information: EOF in the file. When new lines are added, Logger Information: EOF moves down. All new lines are inserted before this line, and the lines after it are overwritten.
When the file is full, i.e. it reaches the maximum size (for example 200 MB), the logger starts over from the beginning of the file, but all the old lines are still there after Logger Information: EOF.

Simply put: it recycles the file, but does not clear the content when the file is full.

When the file is full and the Logger Information: EOF line is at the end of the file, the logger starts over from the beginning of the file. So this line acts like a marker that tells the logger where to write new log lines, instead of writing at the physical end of the file.

@tringuyen It would be interesting to know what kind of application creates this kind of log file. It sounds like it is going in the direction of unifiedbeat or circular log files: https://github.com/cleesmith/unifiedbeat

Thank you for asking @ruflin.

This is our company's own logger framework. I looked at unifiedbeat, but it is not the same. Ours is a circular log file: the system creates just one log file and cycles through it.

This can be the algorithm:

  • Add a new attribute to the YAML config file, say EOF: a string. By default it is empty, and filebeat reads the log to the physical end of the file.
    - If a string is defined for EOF, filebeat reads the file until it reaches this string.
    - Circular case: if filebeat reaches the physical end of the file without finding the string, it goes back to the beginning of the file and reads until it reaches the string.
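
The steps above can be sketched roughly as follows. This is only an illustration in Python, not filebeat code: the helper name `read_until_marker` is hypothetical, the marker is assumed to sit on a line of its own, and the caller is assumed to persist the returned offset between reads.

```python
MARKER = b"Logger Information: EOF"  # marker line from the example log above


def read_until_marker(path, offset, marker=MARKER):
    """Read log lines from `offset` up to the logical end-of-file marker.

    Returns (lines, new_offset), where new_offset is the byte position of the
    marker line, i.e. the place where the logger will write its next line.
    Hypothetical helper for illustration; not part of filebeat.
    """
    lines = []
    wrapped = False
    with open(path, "rb") as f:
        f.seek(offset)
        while True:
            pos = f.tell()
            if wrapped and pos >= offset:
                # Back at the starting point without seeing the marker: stop.
                return lines, offset
            line = f.readline()
            if not line:
                if wrapped:
                    return lines, offset
                # Physical end of file without a marker: assume the logger
                # wrapped around and continue from the beginning (circular case).
                wrapped = True
                f.seek(0)
                continue
            text = line.rstrip(b"\r\n")
            if text == marker:
                return lines, pos
            lines.append(text)
```

Returning the marker position as the new offset means the next call resumes exactly where the logger writes next. The sketch does not address a restart after the file has wrapped, which needs extra care.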

That's not enough. On filebeat restart we also have to figure out where to continue from. The file may have rotated since the last read, and there is no metadata in the file that reveals whether it was rotated or whether the old offset is still valid starting from X. So it would be hard to create a simple, generic and robust config dealing with this "file format".

Thanks @steffens.

Yes, you are right. We need to handle the filebeat restart case too. But I think this handling is the same for all kinds of logs. The file has been rotated if filebeat reads from the last point to the end of the file and does not see the EOF string.

@tringuyen Are there any other log file readers that support this kind of format (including rotation)? It would be interesting to have a look at how they do it.

Unfortunately, I am still looking for one, @ruflin.

Let me know if you find one that does.

The problem is that there is not enough metadata in the file to find a good restarting point. You basically have to scan the file for the EOF marker and check whether the old offset is still correct. If the file was rotated more than once between filebeat restarts, the offset is always wrong (but there is no way to detect this).

When it restarts, it should read from the point where it left off. If the file rotated more than once, we lose data; there is no way around that. So the file should be big enough and filebeat's downtime short enough.
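
As a rough illustration of the restart check discussed above, the sketch below scans the file for the marker and compares its position with the saved offset. The names `find_marker` and `plan_restart` are hypothetical, and it assumes exactly one marker line exists in the file. As noted in the thread, it cannot tell whether the file wrapped more than once; it simply assumes at most one wrap.

```python
import os

MARKER = b"Logger Information: EOF"  # marker line from the example log


def find_marker(path, marker=MARKER):
    """Return the byte offset of the first line equal to `marker`, or None."""
    with open(path, "rb") as f:
        while True:
            pos = f.tell()
            line = f.readline()
            if not line:
                return None
            if line.rstrip(b"\r\n") == marker:
                return pos


def plan_restart(path, saved_offset, marker=MARKER):
    """Return the byte ranges [(start, end), ...] to read after a restart.

    Hypothetical helper. Returns None when the saved state is unusable
    (no marker found, or the offset lies beyond the end of the file).
    """
    size = os.path.getsize(path)
    marker_pos = find_marker(path, marker)
    if marker_pos is None or saved_offset > size:
        return None
    if marker_pos >= saved_offset:
        # Marker is at or after the saved offset: assume no wrap happened.
        # (A full extra cycle would look identical and goes undetected.)
        return [(saved_offset, marker_pos)]
    # Marker is before the saved offset: assume exactly one wrap, so read the
    # tail of the file first and then the head up to the marker.
    return [(saved_offset, size), (0, marker_pos)]
```

The ranges only say which bytes to read and in what order; whether the data in them is complete still depends on the file being large enough and the downtime short enough.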

Thank you so much for all your replies.