From the harvester's point of view it's an error, because the harvester cannot be started due to the configured limit. From the user's point of view I would say it's rather a warning: you can choose to act on it and increase harvester_limit if needed. But if the limit is intentional, it might be a bit annoying to see Filebeat log these error messages.
Files which cannot get a harvester due to the limit are not read. By specifying the limit you tell Filebeat to read at most 400,000 files in parallel. If one of those files is read completely and closed, a new harvester can be started for a new file. Filebeat scans the directory for unread files periodically, so the answer to your second question is yes. The frequency can be set using scan_frequency. See more on this option here: https://www.elastic.co/guide/en/beats/filebeat/current/configuration-filebeat-options.html#scan-frequency
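For reference, a minimal filebeat.yml sketch with both options set might look like this (the paths value is just a placeholder; on older Filebeat versions the section is called filebeat.prospectors instead of filebeat.inputs):

```yaml
filebeat.inputs:
- type: log
  paths:
    - /var/log/app/*.log   # placeholder, adjust to your setup
  # Start at most 400,000 harvesters for this input in parallel.
  # 0 (the default) means no limit.
  harvester_limit: 400000
  # How often Filebeat re-scans the paths for new or unread files
  # (default: 10s).
  scan_frequency: 10s
```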
In theory I can imagine a situation where you have 400,000 log files that are updated all the time, Filebeat cannot keep up, and those files are never closed, so the other 600,000 log files never get read. But in real life I don't think the log flow is that fast.
Also, to avoid keeping log files open for too long, you can set close_inactive. See more on this option here: https://www.elastic.co/guide/en/beats/filebeat/current/configuration-filebeat-options.html#close-inactive
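For example, adding something like this to the same input section would close a harvester after five minutes without new lines, freeing a slot for one of the waiting files (5m happens to be the default, shown here just to make the behaviour explicit):

```yaml
  # Close the harvester if the file has not been updated for 5 minutes,
  # so a waiting file can get a harvester instead.
  close_inactive: 5m
```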