Registry 60MB and increasing causes high cpu and stops sending events

Zfs · June 29, 2016, 3:35pm

Hi,

I have many servers running filebeat 1.2.3 on Windows. Filebeat monitors about 16 directories with rotating logs in each. Some logs are rotated within a few seconds; others take longer. The issue I face is that the filebeat process starts to consume 100% CPU but doesn't actually ship any new files.

close_older: 30m
ignore_older: 1h

Running with -v -e -d "*", it appears to be stuck in the prospector, checking files over and over without actually shipping anything.

I have to delete the registry to return everything to normal.
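
Before deleting the registry, it can be useful to record how large it is and how many states it holds, so the growth can be compared across servers. The following is a minimal sketch, assuming the registry is a single JSON document (an object keyed by file path, or an array of state entries). The function name and the path passed on the command line are hypothetical and not taken from this thread.

```python
import json
import os
import sys


def registry_summary(path):
    """Return (size_in_bytes, number_of_states) for a filebeat registry file.

    Assumption: the registry is one JSON document, either an object keyed by
    file path or an array of state entries.
    """
    size = os.path.getsize(path)
    with open(path, "r", encoding="utf-8") as handle:
        states = json.load(handle)
    return size, len(states)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python registry_summary.py <path-to-registry>
    size, count = registry_summary(sys.argv[1])
    print(f"{size / (1024 * 1024):.1f} MiB, {count} states")
```

Stopping filebeat before reading the file avoids catching it in the middle of a registry update.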

ruflin · July 4, 2016, 7:04am

How many files do you have in the directories in total?

Zfs · July 4, 2016, 1:46pm

Hi,

984 files currently; older files are removed after 48 hours. In a single day it can create 800 new log files.
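
These numbers give a rough feel for how quickly the registry can grow when nothing removes old entries. The following is a back-of-the-envelope sketch, assuming one registry state per log file ever seen and no cleanup at all. Both are assumptions, not confirmed in this thread, and the function name is hypothetical.

```python
def states_without_cleanup(new_files_per_day, days, existing_states=0):
    """Estimate registry states if every new file adds one state and none are removed."""
    return existing_states + new_files_per_day * days


# 800 new files per day is the figure from the post above.
for days in (30, 90):
    print(days, "days:", states_without_cleanup(800, days), "states")
```

Under those assumptions, 800 files a day adds up to about 24,000 states after a month, even though fewer than 1,000 files exist on disk at any time.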

ruflin · July 5, 2016, 10:37am

We are currently introducing some new options to clean up the registry file so it doesn't grow over time. I would assume clean_older could help in your case; see https://github.com/elastic/beats/issues/1600#issuecomment-228670068 for details. Any chance you could try out the nightly builds and see if it helps?

Zfs · July 5, 2016, 10:54am

Great, I'll test it on a few servers.

Thanks

Zfs · July 7, 2016, 7:44pm

The registry file seems a lot better. However, I've noticed that the filebeat process just dies after a day.

This is logged in the Windows Event Viewer:

The filebeat service terminated unexpectedly.  It has done this 2 time(s).

The log file ends with:

2016-07-07T13:00:42-04:00 INFO Run prospector
2016-07-07T13:00:46-04:00 INFO Events sent: 40
2016-07-07T13:00:46-04:00 INFO Registry file updated. 892 states written.

I currently have filebeat sending to both logstash and kafka. I'll remove the kafka output and see if the problem persists.

ruflin · July 11, 2016, 9:48am

@Zfs That is strange. As there are no "shut down" messages or panics, it seems like filebeat got killed. Thanks for trying without kafka.

@steffens Pinging you in case kafka could be the issue

steffens · July 11, 2016, 3:17pm

Yeah, super strange. Can you run filebeat with debug logging enabled?

Zfs · July 11, 2016, 5:50pm

With kafka turned off, the problem reoccurred. I'll enable debugging.

I have the following extra parameters for file cleaning etc:

ignore_older: 1h
close_older: 30m
clean_older: 36h
clean_removed: 24h

The following is from my last log. I'll see if debug logging shows more info.

2016-07-11T13:51:32-04:00 INFO Read line error: No more bytes
2016-07-11T13:51:32-04:00 INFO Crawler stopped
2016-07-11T13:51:32-04:00 INFO Stopping spooler
2016-07-11T13:51:32-04:00 INFO Stopping Registrar
2016-07-11T13:51:32-04:00 INFO Ending Registrar
2016-07-11T13:51:32-04:00 INFO Registry file updated. 4410 states written.
2016-07-11T13:51:32-04:00 INFO filebeat cleanup

ruflin · July 12, 2016, 9:56am

@Zfs The above log message is somewhat strange, as it seems to show a normal filebeat shutdown. Could it be that you have a security tool on your machine that kills long-running processes? Is there a pattern to when this happens?

Zfs · July 12, 2016, 10:22am

Unfortunately no. I had filebeat 1.2 running for much longer on this same server without a restart. I still have 1.2 running on about 50 servers without any issue other than having to clean up the registry manually when it becomes too big.

I'll continue to change my config and see if anything changes.

As a workaround, I could have the service restart itself on all failures, but I'll hold off on that for now.
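
For reference, one way to set up that kind of automatic restart is the Windows `sc failure` command, which configures a service's recovery actions. The following is a minimal sketch that only builds the command. The service name `filebeat` matches the Event Viewer message earlier in the thread, while the 60-second delay, the one-day reset period, the helper name, and the `--apply` switch are illustrative assumptions.

```python
import subprocess
import sys


def restart_on_failure_command(service="filebeat", delay_ms=60000, reset_seconds=86400):
    """Build an `sc failure` command that restarts the service after each failure.

    sc.exe expects each option name (ending in '=') and its value as separate
    arguments, hence the split list entries.
    """
    # Same action for the first, second, and subsequent failures.
    actions = "/".join(f"restart/{delay_ms}" for _ in range(3))
    return ["sc", "failure", service, "reset=", str(reset_seconds), "actions=", actions]


if __name__ == "__main__" and sys.platform == "win32" and "--apply" in sys.argv:
    # Needs an elevated prompt; only runs when --apply is passed explicitly.
    subprocess.run(restart_on_failure_command(), check=True)
```

This only masks the problem: the service comes back, but the reason it stopped still has to be found in the logs.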

Zfs · July 12, 2016, 12:21pm

Not much info from the debug logs; they just end. Now I'm wondering if I actually uploaded the correct log yesterday. I'll test a few other options.

2016-07-12T02:19:16-04:00 DBG  Prospector states cleaned up.
2016-07-12T02:19:16-04:00 DBG  End of file reached: C:\Progra~1\xxxxx-3.07.12.2016-02.19.09.log; Backoff now.
2016-07-12T02:19:16-04:00 DBG  End of file reached: C:\Progra~1\xxxxx-22.07.12.2016-02.18.47.log; Backoff now.
2016-07-12T02:19:16-04:00 DBG  try read: 6
2016-07-12T02:19:16-04:00 DBG  End of file reached: C:\Progra~1\xxxx-10.07.12.2016-01.26.38.log; Backoff now.
2016-07-12T02:19:16-04:00 DBG  End of file reached: C:\Progra~1\xxxx-15.07.12.2016-01.35.02.log; Backoff now.
2016-07-12T02:19:16-04:00 DBG  End of file reached: C:\Progra~1\xxxx-18.07.12.2016-02.19.07.log; Backoff now.
2016-07-12T02:19:16-04:00 DBG  Drop line as it does match one of the exclude patterns07/12/16 02:19:05.941 INFO: [xxxxx] 
2016-07-12T02:19:16-04:00 DBG  End of file reached: C:\Progra~1\xxxx-9.07.12.2016-02.18.22.log; Backoff now.
2016-07-12T02:19:17-04:00 DBG  End of file reached: C:\Progra~1\xxxx-3.07.12.2016-02.19.09.log; Backoff now.
2016-07-12T02:19:17-04:00 DBG  End of file reached: C:\Progra~1\xxxx-20.07.12.2016-01.31.38.log; Backoff now.
2016-07-12T02:19:17-04:00 DBG  Drop line as it does match one of the exclude patterns07/12/16 02:19:11.764 INFO: [xxxx] 
2016-07-12T02:19:17-04:00 DBG  End of file reached: C:\Progra~1\xxx-17.07.12.2016-02.19.02.log; Backoff now.
2016-07-12T02:19:17-04:00 DBG  try read: 6
2016-07-12T02:19:17-04:00 DBG  End of file reached: C:\Progra~1\x-27.07.12.2016-01.31.19.log; Backoff now.
2016-07-12T02:19:18-04:00 DBG  End of file reached: C:\Progra~1\xxx-12.07.12.2016-01.50.46.log; Backoff now.
2016-07-12T02:19:18-04:00 DBG  End of file reached: C:\Progra~1\x-22.07.12.2016-02.18.47.log; Backoff now.
2016-07-12T02:19:18-04:00 DBG  End of file reached: C:\Progra~1\x-6.07.12.2016-01.19.30.log; Backoff now.
2016-07-12T02:19:18-04:00 DBG  End of file reached: C:\Progra~1\x-5.07.12.2016-02.18.02.log; Backoff now.
2016-07-12T02:19:18-04:00 DBG  End of file reached: C:\Progra~1\x-16.07.12.2016-01.31.02.log; Backoff now.
2016-07-12T02:19:18-04:00 DBG  End of file reached: C:\Progra~1\x-18.07.12.2016-02.19.07.log; Backoff now.
2016-07-12T02:19:18-04:00 DBG  End of file reached: C:\Progra~1\x-24.07.12.2016-01.35.14.log; Backoff now.
2016-07-12T02:19:18-04:00 DBG  End of file reached: C:\Progra~1\x-17.07.12.2016-02.19.02.log; Backoff now.
2016-07-12T02:19:18-04:00 DBG  End of file reached: C:\Progra~1\x-19.07.11.2016-16.13.45.log; Backoff now.
2016-07-12T02:19:18-04:00 DBG  try read: 6
2016-07-12T02:19:18-04:00 DBG  End of file reached: C:\Progra~1\x-3.07.12.2016-01.39.39.log; Backoff now.
2016-07-12T02:19:18-04:00 DBG  Flushing spooler because of timeout. Events flushed: 658
2016-07-12T02:19:19-04:00 DBG  End of file reached: C:\Progra~1\x-3.07.12.2016-02.19.09.log; Backoff now.
2016-07-12T02:19:19-04:00 DBG  End of file reached: C:\Progra~1\x-11.07.12.2016-01.27.05.log; Backoff now.
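
Debug output like this is hard to read by eye because most of it is repeated "End of file reached" messages. The following is a small sketch that tallies lines per log level and per file, assuming each line follows the `timestamp LEVEL message` layout seen above. The function name and regular expressions are illustrative and are not part of filebeat.

```python
import re
from collections import Counter

# Layout assumed from the log excerpt: "<timestamp> <LEVEL> <message>".
LINE = re.compile(r"^(?P<ts>\S+)\s+(?P<level>[A-Z]+)\s+(?P<msg>.*)$")
EOF = re.compile(r"^End of file reached: (?P<path>.+); Backoff now\.$")


def summarize(lines):
    """Count log lines per level and 'End of file reached' messages per file."""
    levels = Counter()
    eof_per_file = Counter()
    for line in lines:
        match = LINE.match(line.strip())
        if not match:
            continue  # skip blank or unrecognised lines
        levels[match["level"]] += 1
        eof = EOF.match(match["msg"])
        if eof:
            eof_per_file[eof["path"]] += 1
    return levels, eof_per_file


# Three lines copied from the excerpt above.
sample = [
    r"2016-07-12T02:19:16-04:00 DBG  End of file reached: C:\Progra~1\xxxxx-3.07.12.2016-02.19.09.log; Backoff now.",
    r"2016-07-12T02:19:16-04:00 DBG  try read: 6",
    r"2016-07-12T02:19:18-04:00 DBG  Flushing spooler because of timeout. Events flushed: 658",
]
levels, eof_per_file = summarize(sample)
print(levels.most_common())
print(eof_per_file.most_common(5))
```

Running it over the full log file (for example with `summarize(open(path))`) shows which files are polled most often and what the last non-repetitive messages were before the log stops.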

system · July 20, 2016, 3:35pm

This topic was automatically closed after 21 days. New replies are no longer allowed.
