Filebeat re-opens a harvester for a inactive file

We are using filebeat on windows 7 to send logs to kafka. Filebeat version used is 5.2.2.

Filebeat.yml has config similar to following

- 
      paths:
        - somepath\*.txt
      input_type: log
      document_type: abc
      ignore_older: 12h
      close_inactive: 1m

In the filebeat logs, there is a line for closing a file, which is immediately followed by line saying that a harvester is started for the same file. Following set of logs are repeated until the time file is deleted. Basically a harvester is repeatedly opened and closed for the inactive file.

2017-05-08T14:09:21+05:30 INFO No non-zero metrics in the last 30s
2017-05-08T14:09:37+05:30 INFO File is inactive: FilePath\FileName. Closing because close_inactive of 1m0s reached.
2017-05-08T14:09:42+05:30 INFO Harvester started for file: FilePath\FileName
2017-05-08T14:09:51+05:30 INFO Non-zero metrics in the last 30s: registrar.states.update=2 filebeat.harvester.started=1 publish.events=2 filebeat.harvester.closed=1 registrar.writes=2

There is only file in the directory "FilePath", this file is not updated (I can not see any new lines in the file, also the file modified timestamp has not changed). Where as publish.events=2 suggests that this file is updated with two new records (this is my understanding of filebeat logs, could be totally wrong).

Registry file contains only one entry, similar to following

[{"source":"FilePath\FileName","offset":5087128,"FileStateOS":{"idxhi":167313408,"idxlo":90368,"vol":1023355418},"timestamp":"2017-05-08T14:15:18.8266703+05:30","ttl":-1}]

This file contains 5087232 character. 5087128 is the end of second last line.
Last line in this file does not contains line end character (CR-LF).

My understanding is that once the file is marked as inactive and no new changes are made to the file, no new harvester should be started. Am I missing something in prospector configuration or my understanding needs an update? Or absence of line end character in the last line is messing with filebeat?

The missing line ending is probably messing with filebeat. It is required otherwise the last event is never sent and it thinks there is still something to read. Any reason your logging system does not add the newline char?

Application writing these logs into file writes them in chunks of 8 KB. So any time while this application is running there is a possibility of having the last line without line end char. But in normal course at the end of a application run all the logs should have the line end char.
Also, this application runs in a time bound manner. So whenever due to some reason it overshoots the end time it is terminated (not gracefully sometimes), which may cause it to leave the lines incomplete without line end char.

So in case the file has a line ending, does filebeat work as intended?

TBH the above sounds more like also an issue in your logging system? Haven't encountered a similar system yet. What kind of system is that? Curious to hear more.

I agree with you that writing incomplete logs is a issue with our system. Basically we are running JMeter to benchmark system performance. On terminating JMeter midway we see such behavior.
With proper line ending filebeat works fine.
I became curious when saw this behavior and wanted to understand the logic behind this.

I actually never thought of this before that such a case could mean filebeat reopens files constantly. But it makes sense from the Filebeat perspective as it has to check if there is a now a more complete change. As you have ignore_older I assume it is not really an issue in your case as the only thing filebeat does it opening the file but nothing else.

I have a similar problem.
But i've configured multiline mode with timeout option:
multiline:
pattern: "^\["
negate: true
match: after
timeout: 15s

As I understand, filebeat should consume last line in 15s after last file modification, and never check this file again.
But I see tons of "File is inactive"/"Harvester started" for same files which was not modified.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.