Filebeat configuration's specific value

_kyllr · July 27, 2018, 11:26am

Can someone please help me understand the flow of this specific filebeat configuration?

harvester_limit: 15000
scan_frequency: 1h
close_inactive: 3h

Thank you!

shaunak · July 27, 2018, 12:48pm

Hi @_kyllr,

These settings are documented in the filebeat.reference.yml file. You can find this file in your installation as well as online: https://www.elastic.co/guide/en/beats/filebeat/current/filebeat-reference-yml.html.

Below, I've copied the documentation for each of the three settings you posted and added a bit of explanation of my own, in case it helps.

# Max number of harvesters that are started in parallel.
# Default is 0 which means unlimited
harvester_limit: 15000

In Filebeat you can specify one or more inputs. Each input may start one or more harvesters, depending on how many sources (log file paths, or something else, depending on the type of input) it has to watch for new logs. The harvester_limit setting caps the number of harvesters that a single input is allowed to run in parallel at any given time.
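To make the per-input scope concrete, here is a minimal sketch. It assumes the log input type and the filebeat.inputs key (older releases call it filebeat.prospectors), and the paths are hypothetical placeholders, not taken from your setup.

```yaml
# Sketch only: the paths below are hypothetical placeholders.
filebeat.inputs:
  - type: log
    paths:
      - /var/log/app-a/*.log
    # At most 15000 harvesters run in parallel for this input.
    harvester_limit: 15000
    # Close idle files so their harvester slots become available again.
    close_inactive: 3h

  - type: log
    paths:
      - /var/log/app-b/*.log
    # harvester_limit is not set here, so the default of 0 (unlimited)
    # applies to this input.
```

When a limit is set, not every matching file is necessarily open at the same time, which is why it is usually combined with one of the close_* options as in the first input.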

# How often the input checks for new files in the paths that are specified
# for harvesting. Specify 1s to scan the directory as frequently as possible
# without causing Filebeat to scan too frequently. Default: 10s.
scan_frequency: 1h

As mentioned above, Filebeat starts up one or more harvesters for each input, depending on the sources (e.g. log file paths) for that input. But first Filebeat has to discover the sources for an input and also check if a previously-discovered source is still available. The scan_frequency setting determines how often Filebeat tries to discover new sources for the input and checks the availability of previously-discovered sources.

# Close inactive closes the file handler after the predefined period.
# The period starts when the last line of the file was read, not the file ModTime.
# Time strings like 2h (2 hours), 5m (5 minutes) can be used.
close_inactive: 3h

When Filebeat discovers a source for an input and creates a harvester for it, it also has to allocate some system resources (such as a file handle if the source is a log file). To avoid running out of these resources over time, Filebeat looks for opportunities to free them up. For example, if a log file is no longer being appended to, Filebeat can close the harvester and the underlying system resource (file handle) for that file. But how can Filebeat decide when a file is no longer being appended to? How long should it wait for new data to appear in the source before deciding that the source is finished? The close_inactive setting answers this "how long" question for Filebeat.
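Putting the three settings together, the overall flow might look like the sketch below. The input type and the path are assumptions for illustration; only the three values come from your post.

```yaml
filebeat.inputs:
  - type: log
    paths:
      - /var/log/myapp/*.log   # hypothetical path
    # 1. Once an hour, check the paths above for new or changed files.
    scan_frequency: 1h
    # 2. Start a harvester for each file found, with at most 15000
    #    harvesters open in parallel for this input.
    harvester_limit: 15000
    # 3. Close a file's harvester once 3 hours have passed since the
    #    last line was read from it.
    close_inactive: 3h
```

As far as I understand the documentation, a file that was closed and later receives new lines is only picked up again on the next scan, so with these values that can take up to about an hour.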

_kyllr · July 30, 2018, 6:58am

Hi @shaunak

Thanks for your response. But what about close_eof?

system · August 27, 2018, 6:59am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
