I used the harvester_limit: 1 option along with the close_inactive option to close a file after 1 minute. This works for my use case. However, when there are multiple older log files to be harvested, Filebeat picks up the files in a random order instead of going for the oldest file first.
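For reference, a minimal sketch of the configuration described above (the path is a placeholder; the exact top-level key depends on the Filebeat version — older releases use `filebeat.prospectors` with `input_type: log` instead of `filebeat.inputs`):

```yaml
filebeat.inputs:
  - type: log
    paths:
      - /var/log/legacy/*.log   # placeholder path
    # Only one file is read at a time, so files are consumed sequentially.
    harvester_limit: 1
    # Close the harvester after 1 minute without new data,
    # freeing the slot for the next file.
    close_inactive: 1m
```

With `harvester_limit: 1`, only one harvester runs at a time, but which pending file gets the freed slot next is not defined.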
I'm not sure we provide explicit assurances around the order of processing.
It might be worth raising this as a feature request if it's important to you.
Indeed, we don't have any guarantees on the ordering. We discussed it in the past but didn't see the need for it, as it would add scheduling complexity (everyone wants a different order). Can you elaborate in more detail on why you need this ordering together with harvester_limit: 1? Understanding the use cases helps a lot.
We have a legacy system that produces a lot of logs which roll over frequently. Our setup ships logs to Logstash, from where we use the http output to send them on to another legacy monitoring service. The system has wireless connectivity issues, which means we can see gaps during which no logs get shipped to Logstash. When the connection comes back, we'd like the log files to be sent out in the same order they were created. We can tolerate some amount of re-ordering in the legacy monitoring service, but with Filebeat the scan order is totally random, which doesn't help at all.
@ruflin I have another question: how do I ensure that, when cleaning up older log files, their offsets aren't still tracked in the registry? I assume that when close_inactive is reached, the file gets removed from the registry?