Harvester_limit over a directory with more than 1M files

hello guys,
I have a directory with around 1M files, so I tried adding a harvester_limit in filebeat.yml:

filebeat.prospectors:
- type: log
  enabled: true
  paths:
    - /logs/**
  harvester_limit: 400000

But Filebeat tries to harvest all the files in the directory, so I get errors once the harvester count exceeds 400,000:

ERROR log/prospector.go:437 Harvester could not be started on new file:file-1.log, Err: Harvester limit reached

So, I have some questions:

1- Is this really an error?
2- What happens to the files that hit the harvester limit error? Will Filebeat try to harvest them again?

Thank you.

From the harvester's point of view it's an error, because it cannot be started due to the configured limit. From the user's point of view I would say it's more of a warning. You can choose to act on it and increase harvester_limit if required. But if the limit is intentional, it might be a bit annoying to see Filebeat emit these error messages.

Files that cannot get a harvester due to the limit are not read. By specifying a limit you tell Filebeat to read at most 400,000 files in parallel. When one of those files is read completely and closed, a new harvester can be started for a new file. Filebeat scans the directory for unread files periodically. (So the answer to your second question is yes.) The frequency can be set using scan_frequency. See more on this option: https://www.elastic.co/guide/en/beats/filebeat/current/configuration-filebeat-options.html#scan-frequency
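As a minimal sketch (the value below is just an illustration, not a recommendation; 10s is the default):

filebeat.prospectors:
- type: log
  enabled: true
  paths:
    - /logs/**
  harvester_limit: 400000
  # How often Filebeat checks the configured paths for new or unread files.
  # A lower value picks up freed harvester slots sooner, at the cost of more scanning.
  scan_frequency: 10s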
In theory I can imagine a situation where you have 400,000 log files that are updated all the time, Filebeat cannot keep up, and those files are never closed, so the other 600,000 log files can never be read. But in real life I don't think the log flow is that fast.
Also, to avoid keeping log files open for too long, you can set close_inactive. See more on this option here: https://www.elastic.co/guide/en/beats/filebeat/current/configuration-filebeat-options.html#close-inactive
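For example, assuming the same prospector as above (the 2m value is only an illustration; the default is 5m):

filebeat.prospectors:
- type: log
  enabled: true
  paths:
    - /logs/**
  harvester_limit: 400000
  # Close a harvester if its file has not received new data for this long,
  # freeing the slot for one of the other files. Default is 5m.
  close_inactive: 2m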


Understood, thanks.
