Read file from beginning always

I have a file that gets overwritten every 15 minutes; the filename stays the same. The scan frequency is set to 10s, but Filebeat currently reads the file only about once an hour and misses the updates in between.

How can I make Filebeat read the file from the beginning every time? I can update the scan frequency to 10 minutes, and I can afford to have the file read from the beginning each time.

Tailing and truncating logs is not the best practice. It's subject to races and potential data loss.

Filebeat checks the file size. If the new size is smaller, it treats the file as truncated and starts processing from the beginning. If the size is the same or larger by the time Filebeat picks the file up, it assumes it is still the old file. By overwriting, I assume you replace all contents at once, so Filebeat never gets a chance to start reading in the middle of a write.
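To make the size check above concrete, here is a simplified model of it. It is only a sketch, not Filebeat's actual code; the function name `next_read_offset` and its parameters are hypothetical.

```python
def next_read_offset(stored_offset: int, current_size: int) -> int:
    """Return the offset a size-based tailer would resume reading from.

    Simplified model (an assumption, not Filebeat's real implementation):
    the reader remembers how many bytes it has consumed and restarts
    from 0 only when the file has become smaller than that.
    """
    if current_size < stored_offset:
        # File shrank: treat it as truncated and start over.
        return 0
    # Same size or bigger: assume it is the old file and keep the offset.
    return stored_offset


# "key:100\n" (8 bytes) was fully read, then the file was overwritten in place.
shorter = next_read_offset(8, 7)  # e.g. "key:99\n": reread from the start
same = next_read_offset(8, 8)     # e.g. "key:200\n": nothing new is read
longer = next_read_offset(8, 9)   # e.g. "key:1000\n": only the last byte is read
print(shorter, same, longer)
```

In this model, an overwrite with a value of the same length is never noticed, and a longer value yields only the trailing bytes.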

Yes, it's basically a single-line file containing a key-value pair, which gets overwritten every 15 minutes. There will be no additional updates.

I was looking at the Filebeat prospector option close_eof. Can I use this option? But my concern is that, since the filename does not change, it might skip reading the file?

But my concern is that, since the filename does not change, it might skip reading the file?

If the file identity doesn't change at all, then Filebeat might not pick it up. The file identity is defined by the file's inode and device. You can reuse a path if the inode changes.
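As an illustration of what "identity" means here, the following sketch compares the (device, inode) pair after an in-place overwrite and after a replace-by-rename. It assumes a POSIX file system; the helper names `file_identity`, `overwrite_in_place` and `replace_by_rename` are hypothetical and not part of Filebeat.

```python
import os
import tempfile


def file_identity(path):
    """Return (device, inode), the pair described above as the file identity."""
    st = os.stat(path)
    return (st.st_dev, st.st_ino)


def overwrite_in_place(path, text):
    """Rewrite the contents of the file at `path`; an existing inode is reused."""
    with open(path, "w") as f:
        f.write(text)


def replace_by_rename(path, text):
    """Write a new file next to `path`, then rename it over `path`."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        f.write(text)
    # The path now points at the new file, which has its own inode.
    os.replace(tmp_path, path)
```

Calling `file_identity` before and after `overwrite_in_place` should return the same pair, while after `replace_by_rename` the inode differs even though the path is unchanged.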

If the file identity does not change, it's unlikely Filebeat will pick up the changed file. By chance you might get the new contents (if they are smaller than the original line), or only a few bytes that make no sense (if the new contents are bigger).

Filebeat currently does not fingerprint the file contents. Therefore it cannot deal with this use case.

In general, you don't want to read a file while it is being truncated. This use case is too riddled with races.

Thanks. Yes, the changes are negligible in bytes, e.g. key:value, where value is a number that changes. In this case, I am thinking I should probably delete the file and recreate it with every change so ELK can read it when it changes?

I'd recommend not deleting the original file, but replacing it. A common sequence is:

  1. create temporary file
  2. write to temporary file
  3. flush + fsync (FlushFileBuffers on windows)
  4. close file
  5. replace original file by renaming temporary file
  6. optional (depends on OS + FS, not sure about Windows): open and fsync the parent directory (not supported by every FS, but sometimes required to update directory metadata)

This sequence guarantees your file is not corrupted between restarts or on power failures (well, best effort).
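The steps above could be sketched in Python roughly as follows. This is a minimal sketch under assumptions, not a definitive implementation: the function name `atomic_write` is hypothetical, the temporary file is created in the same directory so that the rename stays on one file system, and the directory fsync is attempted only where the platform exposes `os.O_DIRECTORY`.

```python
import os
import tempfile


def atomic_write(path, data: bytes):
    """Replace `path` with `data` via a temporary file and a rename."""
    directory = os.path.dirname(os.path.abspath(path))
    # 1. create a temporary file next to the target
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)          # 2. write to the temporary file
            f.flush()              # 3. flush user-space buffers ...
            os.fsync(f.fileno())   #    ... and ask the OS to persist them
        # 4. the file is closed when the `with` block exits
        os.replace(tmp_path, path)  # 5. rename over the original
    except BaseException:
        # Clean up the temporary file if anything failed before the rename.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
    # 6. optional, best effort: fsync the parent directory
    if hasattr(os, "O_DIRECTORY"):
        try:
            dir_fd = os.open(directory, os.O_RDONLY | os.O_DIRECTORY)
        except OSError:
            return
        try:
            os.fsync(dir_fd)
        except OSError:
            pass  # not supported by every file system
        finally:
            os.close(dir_fd)
```

A call such as `atomic_write("metric.txt", b"key:42\n")` leaves either the old or the new contents at the path, never a partial write, and the new file gets a new inode. Note that `tempfile.mkstemp` creates the file readable by its owner only, so an `os.chmod` before the rename may be needed if Filebeat runs as a different user.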

