I am running Filebeat 7.10 on Windows Server 2016, sending log events to Logstash. Because Filebeat does not clean up the log files it has read, I need to write my own cleanup script, and for that to work I need to know when harvesting is done for each file.
I found that the Filebeat log contains the line below. Does this mean that the entire file has been harvested and can be deleted? Or is there another safe method for getting this information?
2020-12-08T21:52:35.024+0100 DEBUG [harvester] log/log.go:107 End of file reached: D:\filebeatlogfolder\audit.20200821T202349.log; Backoff now.
That is one way.
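You should also look at settings such as close_inactive and clean_inactive. close_inactive closes the file handle after a period of inactivity, and clean_inactive removes the file's state from the Filebeat registry, after which you can safely delete the files.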
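As a rough illustration of the time-based approach, a cleanup script could list only files that have been idle for longer than the configured clean_inactive period. This is a minimal sketch, not something Filebeat provides: the function name deletion_candidates, the glob pattern, and the threshold are all hypothetical, and it assumes the file's modification time is a good enough proxy for "no longer being written".

```python
import time
from pathlib import Path


def deletion_candidates(folder, pattern, min_idle_seconds, now=None):
    """Return files matching `pattern` that were last modified at least
    `min_idle_seconds` ago. Nothing is deleted here; the caller decides."""
    now = time.time() if now is None else now
    candidates = []
    for path in sorted(Path(folder).glob(pattern)):
        # st_mtime is the last modification time in seconds since the epoch.
        if path.is_file() and now - path.stat().st_mtime >= min_idle_seconds:
            candidates.append(path)
    return candidates
```

For example, deletion_candidates(r"D:\filebeatlogfolder", "audit.*.log", 2 * 60 * 60) would list audit files untouched for two hours. The threshold should be chosen to be longer than clean_inactive so that Filebeat has already dropped the file's state.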
The log output for one log file is below. The last line appears to be "harvester cleanup finished for file:". Is this line a better indicator that a file has been fully harvested and can be deleted?
2020-12-11T09:33:54.607+0100 DEBUG [input] log/input.go:439 Check file for harvesting: D:\filebeatlogfolder\audit.20201210T023449.log
2020-12-11T09:33:54.608+0100 DEBUG [input] log/input.go:530 Update existing file for harvesting: D:\filebeatlogfolder\audit.20201210T023449.log, offset: 52430048
2020-12-11T09:33:54.608+0100 DEBUG [input] log/input.go:539 Resuming harvesting of file: D:\filebeatlogfolder\audit.20201210T023449.log, offset: 52430048, new size: 52430049
2020-12-11T09:33:54.608+0100 DEBUG [harvester] log/harvester.go:578 Set previous offset for file: D:\filebeatlogfolder\audit.20201210T023449.log. Offset: 52430048
2020-12-11T09:33:54.608+0100 DEBUG [harvester] log/harvester.go:569 Setting offset for file: D:\filebeatlogfolder\audit.20201210T023449.log. Offset: 52430048
2020-12-11T09:33:54.614+0100 DEBUG [harvester] log/harvester.go:488 Update state: D:\filebeatlogfolder\audit.20201210T023449.log, offset: 52430048
2020-12-11T09:33:54.614+0100 INFO log/harvester.go:302 Harvester started for file: D:\filebeatlogfolder\audit.20201210T023449.log
2020-12-11T09:33:54.614+0100 INFO log/harvester.go:331 End of file reached: D:\filebeatlogfolder\audit.20201210T023449.log. Closing because close_eof is enabled.
2020-12-11T09:33:54.614+0100 DEBUG [harvester] log/harvester.go:604 Stopping harvester for file: D:\filebeatlogfolder\audit.20201210T023449.log
2020-12-11T09:33:54.614+0100 DEBUG [harvester] log/harvester.go:614 Closing file: D:\filebeatlogfolder\audit.20201210T023449.log
2020-12-11T09:33:54.627+0100 DEBUG [harvester] log/harvester.go:488 Update state: D:\filebeatlogfolder\audit.20201210T023449.log, offset: 52430048
2020-12-11T09:33:54.627+0100 DEBUG [harvester] log/harvester.go:625 harvester cleanup finished for file: D:\filebeatlogfolder\audit.20201210T023449.log
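If that line were used as the signal, a script could pull the file paths out of the Filebeat log along the following lines. This is only a sketch: the function name finished_files is hypothetical, the message text is taken from the 7.10 debug output above and may differ in other versions, and the message only shows that a harvester was closed. It does not prove that every byte was read or that Logstash acknowledged the events.

```python
import re

# Matches the debug message shown in the log output above (case-insensitive,
# because the message is lower-case in the log).
CLEANUP_RE = re.compile(
    r"harvester cleanup finished for file: (?P<path>.+)$", re.IGNORECASE
)


def finished_files(log_lines):
    """Return the distinct file paths for which a cleanup message was logged,
    in order of first appearance."""
    seen = []
    for line in log_lines:
        match = CLEANUP_RE.search(line.rstrip("\r\n"))
        if match and match.group("path") not in seen:
            seen.append(match.group("path"))
    return seen
```

The function accepts any iterable of lines, so an open Filebeat log file can be passed to it directly.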
Looking at the log a bit more, I found that Filebeat saves a final offset that is 1 byte less than the actual file size.
The file audit.20201210T115155.log in the log output below is 52429144 bytes on disk, but Filebeat saves the offset as 52429143. When Filebeat later checks the file for harvesting, it compares the recorded offset with the actual file size, finds that they do not match, and starts harvesting the file again. It continues like this forever.
I am running Filebeat 7.10 on Windows.
2020-12-11T11:08:34.250+0100 DEBUG [input] log/input.go:439 Check file for harvesting: D:\filebeatlogfolder\audit.20201210T115155.log
2020-12-11T11:08:34.250+0100 DEBUG [input] log/input.go:530 Update existing file for harvesting: D:\filebeatlogfolder\audit.20201210T115155.log, offset: 52429143
2020-12-11T11:08:34.250+0100 DEBUG [input] log/input.go:539 Resuming harvesting of file: D:\filebeatlogfolder\audit.20201210T115155.log, offset: 52429143, new size: 52429144
2020-12-11T11:08:34.250+0100 DEBUG [harvester] log/harvester.go:578 Set previous offset for file: D:\filebeatlogfolder\audit.20201210T115155.log. Offset: 52429143
2020-12-11T11:08:34.250+0100 DEBUG [harvester] log/harvester.go:569 Setting offset for file: D:\filebeatlogfolder\audit.20201210T115155.log. Offset: 52429143
2020-12-11T11:08:34.250+0100 DEBUG [harvester] log/harvester.go:488 Update state: D:\filebeatlogfolder\audit.20201210T115155.log, offset: 52429143
2020-12-11T11:08:34.250+0100 INFO log/harvester.go:302 Harvester started for file: D:\filebeatlogfolder\audit.20201210T115155.log
2020-12-11T11:08:34.250+0100 INFO log/harvester.go:331 End of file reached: D:\filebeatlogfolder\audit.20201210T115155.log. Closing because close_eof is enabled.
2020-12-11T11:08:34.250+0100 DEBUG [harvester] log/harvester.go:604 Stopping harvester for file: D:\filebeatlogfolder\audit.20201210T115155.log
2020-12-11T11:08:34.251+0100 DEBUG [harvester] log/harvester.go:614 Closing file: D:\filebeatlogfolder\audit.20201210T115155.log
2020-12-11T11:08:34.251+0100 DEBUG [harvester] log/harvester.go:488 Update state: D:\filebeatlogfolder\audit.20201210T115155.log, offset: 52429143
2020-12-11T11:08:34.251+0100 DEBUG [harvester] log/harvester.go:625 harvester cleanup finished for file: D:\filebeatlogfolder\audit.20201210T115155.log
2020-12-11T11:10:34.477+0100 DEBUG [input] log/input.go:439 Check file for harvesting: D:\filebeatlogfolder\audit.20201210T115155.log
2020-12-11T11:10:34.478+0100 DEBUG [input] log/input.go:530 Update existing file for harvesting: D:\filebeatlogfolder\audit.20201210T115155.log, offset: 52429143
2020-12-11T11:10:34.478+0100 DEBUG [input] log/input.go:539 Resuming harvesting of file: D:\filebeatlogfolder\audit.20201210T115155.log, offset: 52429143, new size: 52429144
2020-12-11T11:10:34.478+0100 DEBUG [harvester] log/harvester.go:578 Set previous offset for file: D:\filebeatlogfolder\audit.20201210T115155.log. Offset: 52429143
2020-12-11T11:10:34.478+0100 DEBUG [harvester] log/harvester.go:569 Setting offset for file: D:\filebeatlogfolder\audit.20201210T115155.log. Offset: 52429143
2020-12-11T11:10:34.478+0100 DEBUG [harvester] log/harvester.go:488 Update state: D:\filebeatlogfolder\audit.20201210T115155.log, offset: 52429143
2020-12-11T11:10:34.478+0100 INFO log/harvester.go:302 Harvester started for file: D:\filebeatlogfolder\audit.20201210T115155.log
2020-12-11T11:10:34.478+0100 INFO log/harvester.go:331 End of file reached: D:\filebeatlogfolder\audit.20201210T115155.log. Closing because close_eof is enabled.
2020-12-11T11:10:34.478+0100 DEBUG [harvester] log/harvester.go:604 Stopping harvester for file: D:\filebeatlogfolder\audit.20201210T115155.log
2020-12-11T11:10:34.478+0100 DEBUG [harvester] log/harvester.go:614 Closing file: D:\filebeatlogfolder\audit.20201210T115155.log
2020-12-11T11:10:34.478+0100 DEBUG [harvester] log/harvester.go:488 Update state: D:\filebeatlogfolder\audit.20201210T115155.log, offset: 52429143
2020-12-11T11:10:34.478+0100 DEBUG [harvester] log/harvester.go:625 harvester cleanup finished for file: D:\filebeatlogfolder\audit.20201210T115155.log
2020-12-11T11:12:34.544+0100 DEBUG [input] log/input.go:439 Check file for harvesting: D:\filebeatlogfolder\audit.20201210T115155.log
2020-12-11T11:12:34.544+0100 DEBUG [input] log/input.go:530 Update existing file for harvesting: D:\filebeatlogfolder\audit.20201210T115155.log, offset: 52429143
2020-12-11T11:12:34.544+0100 DEBUG [input] log/input.go:539 Resuming harvesting of file: D:\filebeatlogfolder\audit.20201210T115155.log, offset: 52429143, new size: 52429144
2020-12-11T11:12:34.545+0100 DEBUG [harvester] log/harvester.go:578 Set previous offset for file: D:\filebeatlogfolder\audit.20201210T115155.log. Offset: 52429143
2020-12-11T11:12:34.545+0100 DEBUG [harvester] log/harvester.go:569 Setting offset for file: D:\filebeatlogfolder\audit.20201210T115155.log. Offset: 52429143
2020-12-11T11:12:34.546+0100 DEBUG [harvester] log/harvester.go:488 Update state: D:\filebeatlogfolder\audit.20201210T115155.log, offset: 52429143
2020-12-11T11:12:34.546+0100 INFO log/harvester.go:302 Harvester started for file: D:\filebeatlogfolder\audit.20201210T115155.log
2020-12-11T11:12:34.546+0100 INFO log/harvester.go:331 End of file reached: D:\filebeatlogfolder\audit.20201210T115155.log. Closing because close_eof is enabled.
2020-12-11T11:12:34.546+0100 DEBUG [harvester] log/harvester.go:604 Stopping harvester for file: D:\filebeatlogfolder\audit.20201210T115155.log
2020-12-11T11:12:34.546+0100 DEBUG [harvester] log/harvester.go:614 Closing file: D:\filebeatlogfolder\audit.20201210T115155.log
2020-12-11T11:12:34.546+0100 DEBUG [harvester] log/harvester.go:488 Update state: D:\filebeatlogfolder\audit.20201210T115155.log, offset: 52429143
2020-12-11T11:12:34.546+0100 DEBUG [harvester] log/harvester.go:625 harvester cleanup finished for file: D:\filebeatlogfolder\audit.20201210T115155.log
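The loop in the output above can be detected mechanically by counting how often the same file is resumed with an identical offset and size. The following is a minimal sketch based only on the "Resuming harvesting of file" message format shown in these logs. The function name stuck_files and the min_repeats parameter are hypothetical.

```python
import re
from collections import defaultdict

# Matches e.g. "Resuming harvesting of file: <path>, offset: 1, new size: 2"
RESUME_RE = re.compile(
    r"Resuming harvesting of file: (?P<path>.+), "
    r"offset: (?P<offset>\d+), new size: (?P<size>\d+)\s*$"
)


def stuck_files(log_lines, min_repeats=2):
    """Map each path to (offset, size, count) when the same offset/size pair
    was logged at least `min_repeats` times, i.e. the offset never advanced."""
    counts = defaultdict(int)
    for line in log_lines:
        match = RESUME_RE.search(line)
        if match:
            key = (
                match.group("path"),
                int(match.group("offset")),
                int(match.group("size")),
            )
            counts[key] += 1
    return {
        path: (offset, size, n)
        for (path, offset, size), n in counts.items()
        if n >= min_repeats
    }
```

A file reported here has unread bytes that Filebeat keeps retrying, so a cleanup script that waits for the offset to equal the file size would never delete it.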