Filebeat does not see file updates after a while


#1

Hello,

I am facing a strange issue with Filebeat on Windows:

I use it to consume log files generated with log4net. These files are configured to rotate every day. Filebeat is only watching the current file.

Everything works well (new lines are forwarded to Logstash) until the early hours of the following day.

Every day between midnight and 2 or 3 a.m., Filebeat stops seeing file updates.

I do not think it is due to file rotation, because the first lines of the new day are correctly forwarded to Logstash.

But after a while (20 minutes to 2 hours), nothing is forwarded anymore.

And if I restart the Windows service, all skipped events are forwarded and everything works again until the following morning.

I am using Filebeat RC1, and in the logs I can see the last event published, followed by these lines repeated until the end:

2015-11-18T02:02:07+01:00 DBG  Start next scan
2015-11-18T02:02:07+01:00 DBG  scan path d:/logs/application.log
2015-11-18T02:02:07+01:00 DBG  Check file for harvesting: d:/logs/application.log
2015-11-18T02:02:07+01:00 DBG  Update existing file for harvesting: d:/logs/application.log
2015-11-18T02:02:07+01:00 DBG  Not harvesting, file didn't change: d:/logs/application.log

There is no error in the log file.

I know that I do not have much info for troubleshooting this issue, but if you have any idea about it or suggestions to collect more info, do not hesitate!

Some info about my Filebeat config:

filebeat:
  prospectors:
    -
      paths:
      - d:/logs/application.log
      encoding: iso-8859-1
      input_type: log
      ignore_older: 5m

  registry_file: d:/filebeat/filebeat.application.registry
  spool_size: 1
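
For clarity, here is the same config annotated with what each option does as I understand it (the comments are my own notes, not part of the original file):

```yaml
filebeat:
  prospectors:
    -
      paths:
        - d:/logs/application.log    # only the current (non-rotated) file is watched
      encoding: iso-8859-1           # log4net writes Latin-1 in my setup
      input_type: log
      ignore_older: 5m               # stop harvesting a file not modified for 5 minutes

  registry_file: d:/filebeat/filebeat.application.registry
  spool_size: 1                      # flush every single event to the output immediately
```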

(ruflin) #2

@ogauchard Thanks for reporting this. It seems like this could also be related to: ERR File reading error. Stopping harvester. Error: EOF

So in your case even though additional lines are added to application.log, filebeat tells you that the file didn't change, correct?

So far I assumed the issue could be related to a race condition with fast file rotation, but that does not seem to apply in your case.

Some questions that could help nail down the problem:

  • Are there any specific characters when it stops reading?
  • Is the rotate file deleted when the reading stops?
  • Are you using a shared drive under windows?
  • What is the total number of events sent to ES when the logging stops?
  • Do you see the above lines (Start next scan, ...) repeated multiple times, i.e. until you kill Filebeat?

#3

Thanks for answering so quickly!

Yes, exactly: additional lines are added to application.log, but Filebeat says the file didn't change.

In answer to your questions:

  • Nothing special. This is a standard message that Filebeat correctly handles at other times.
  • I cannot answer. I have to check the log4net source code. I think it is the same mechanism as log4j.
  • No, it is not a shared drive.
  • It depends. Sometimes about 40, sometimes 300.
  • Yes. These lines are repeated until I stop the Windows service.


#4

I just had a look at the log4net code. It seems to rename the current file (using a move operation) and then create a new current file.


(ruflin) #5

@ogauchard Thanks a lot for the details. That helps. I will let you know when I have more details.


(ruflin) #6

@ogauchard We found a potential issue and fixed it. It would be nice if you could try our most recent build here to see if it fixed your issue: https://beats-nightlies.s3.amazonaws.com/index.html?prefix=filebeat/


#7

@ruflin I've just installed the latest build and started the service. I'll give you feedback on Monday after a few days of processing. Thanks for your efficiency.


(ruflin) #8

Thanks.


#9

Well, it's half good news. I installed the latest build and restarted Filebeat last Friday morning. It ran well and detected all changes until this morning at about 1 a.m. So it is much better, but not totally fixed. As usual, I restarted the Windows service this morning and all events from 1 a.m. until now were then detected.

Here are the logs of the last "good" scan:

2015-11-23T00:42:37+01:00 DBG Start next scan
2015-11-23T00:42:37+01:00 DBG scan path d:/logs/application.log
2015-11-23T00:42:37+01:00 DBG Check file for harvesting: d:/logs/application.log
2015-11-23T00:42:37+01:00 DBG Update existing file for harvesting: d:/logs/application.log
2015-11-23T00:42:37+01:00 DBG Not harvesting, file didn't change: d:/logs/application.log
2015-11-23T00:42:38+01:00 DBG Flushing spooler because of timemout. Events flushed: 0
2015-11-23T00:42:43+01:00 DBG Flushing spooler because of timemout. Events flushed: 0
2015-11-23T00:42:46+01:00 DBG Flushing spooler because spooler full. Events flushed: 1
2015-11-23T00:42:46+01:00 DBG Flushing spooler because spooler full. Events flushed: 1
2015-11-23T00:42:46+01:00 DBG send event
2015-11-23T00:42:46+01:00 DBG Start Preprocessing
2015-11-23T00:42:46+01:00 DBG Flushing spooler because spooler full. Events flushed: 1
2015-11-23T00:42:46+01:00 DBG Publish: {
...
}
2015-11-23T00:42:46+01:00 DBG Forward preprocessed events
2015-11-23T00:42:46+01:00 DBG output worker: publish 1 events
2015-11-23T00:42:46+01:00 DBG Try to publish %!s(int=1) events to logstash with window size %!s(int=1024)
2015-11-23T00:42:46+01:00 DBG %!s(int=1) events out of %!s(int=1) events sent to logstash. Continue sending ...
2015-11-23T00:42:46+01:00 INFO Events sent: 1
2015-11-23T00:42:46+01:00 DBG send event
2015-11-23T00:42:46+01:00 DBG Start Preprocessing
2015-11-23T00:42:46+01:00 DBG Processing 1 events
2015-11-23T00:42:46+01:00 DBG Flushing spooler because spooler full. Events flushed: 1
2015-11-23T00:42:46+01:00 DBG Write registry file: d:\Applis\ELK\data\filebeat\filebeat.histovds.webservice.registry
2015-11-23T00:42:46+01:00 DBG Publish: {
...
}

...

2015-11-23T00:42:46+01:00 DBG Forward preprocessed events
2015-11-23T00:42:46+01:00 DBG output worker: publish 1 events
2015-11-23T00:42:46+01:00 DBG Try to publish %!s(int=1) events to logstash with window size %!s(int=1024)
2015-11-23T00:42:46+01:00 DBG Processing 1 events
2015-11-23T00:42:46+01:00 DBG Write registry file: d:\Applis\ELK\data\filebeat\filebeat.histovds.webservice.registry
2015-11-23T00:42:46+01:00 INFO Registry file updated. 1 states written.
2015-11-23T00:42:46+01:00 DBG %!s(int=1) events out of %!s(int=1) events sent to logstash. Continue sending ...
2015-11-23T00:42:46+01:00 INFO Events sent: 1
2015-11-23T00:42:46+01:00 DBG Processing 1 events
2015-11-23T00:42:46+01:00 DBG Write registry file: d:\Applis\ELK\data\filebeat\filebeat.histovds.webservice.registry
2015-11-23T00:42:46+01:00 INFO Registry file updated. 1 states written.

And afterwards the following sequence is repeated until I stop the service:

2015-11-23T00:42:47+01:00 DBG Start next scan
2015-11-23T00:42:47+01:00 DBG scan path d:/logs/application.log
2015-11-23T00:42:47+01:00 DBG Check file for harvesting: d:/logs/application.log
2015-11-23T00:42:47+01:00 DBG Update existing file for harvesting: d:/logs/application.log
2015-11-23T00:42:47+01:00 DBG Not harvesting, file didn't change: d:/logs/application.log
2015-11-23T00:42:53+01:00 DBG Flushing spooler because of timemout. Events flushed: 0

And there is also this line (which is expected):

2015-11-23T00:47:51+01:00 ERR File reading error. Stopping harvester. Error: Stop harvesting as file is older then ignore_older: d:/logs/application.log; Last change was: 5m5.011712s

Our servers are VMs on a private cloud. I have to check whether any maintenance operation occurred this morning that could have prevented Filebeat from running correctly.


(ruflin) #10

If I understand it right, the file which is not picked up again is the application.log file. It seems to happen after the first time the harvester closes the file because of ignore_older. Afterwards the file changes again, but filebeat still decides that it hasn't changed, correct?


#11

This is unfortunately not so simple. I have checked the logs and I can see that the harvester had already been closed on previous days due to ignore_older, yet subsequent changes were still detected.

And I cannot see any abnormal activity in the cloud cluster.


(ruflin) #12

Strange. The next time the harvester stops picking up the file, could you check the file stats (creation date, last modified)?


#13

OK

I will see tomorrow morning whether everything is fine. If it is not, I will give you that information.

It is very strange. Everything was fine on Saturday and Sunday, but the behaviour I encountered before the last build occurred again this morning.

Just for information, what did you fix in the last build?


(ruflin) #14

@ogauchard Which two builds are you comparing? rc2 and nightly from friday?


#15

@ruflin rc2 and the build from Thursday at 15:33


(ruflin) #16

@ogauchard For the moment there is probably a quite simple solution: increase ignore_older to 24h, or at least to something higher than the longest update interval of your files. Can you try that out?
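
To make that concrete, here is a sketch of the adjusted prospector (only ignore_older changes; the other values are taken from the config posted above):

```yaml
filebeat:
  prospectors:
    -
      paths:
        - d:/logs/application.log
      encoding: iso-8859-1
      input_type: log
      ignore_older: 24h   # longer than the daily rotation interval, so the
                          # current file is not dropped between updates

  registry_file: d:/filebeat/filebeat.application.registry
  spool_size: 1
```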


(ruflin) #17

@ogauchard There were two bugs we fixed. The description is in the changelog: https://github.com/elastic/filebeat/blob/master/CHANGELOG.md The full code changes can be seen here: https://github.com/elastic/filebeat/compare/1.0.0-rc2...1.0.0


#18

@ruflin Won't a large ignore_older value be a problem with file rotation?


(ruflin) #19

@ogauchard As long as you don't have a very high file rotation rate with thousands of files every 24 hours, it shouldn't be.


#20

@ruflin OK, I'll leave ignore_older at 5 minutes to see what happens tomorrow morning. If it does not work, I'll set it to 24h.