This is a warning, not an error, and the frequency at which it is logged can be controlled with an environment variable called FILEWATCH_MAX_FILES_WARN_INTERVAL - the default is 20 seconds. You can raise this considerably once you understand the rest of this reply.
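For example, to log the warning at most once every five minutes (the 300 and the pipeline file name are illustrative):

```
FILEWATCH_MAX_FILES_WARN_INTERVAL=300 bin/logstash -f pipeline.conf
```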
The limit itself can be increased by setting the max_open_files option in the plugin config.
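A minimal sketch of raising the limit (the glob path and the value are illustrative):

```
input {
  file {
    path           => "/var/log/app/*.log"
    max_open_files => 8192   # raise the window; read the caveats below first
  }
}
```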
You should not do this without understanding the internal operation, because the input uses this limit to protect itself against opening and reading too many files at the same time.
Before this protection was introduced, if a glob pattern yielded, say, 1 million files, the input would try to open all 1 million of them, run out of file handles, and leave the other filters and outputs unable to open files or network connections.
You should see the max_open_files value as a sliding window of currently open files:

1. Files are "discovered" via the glob pattern. At discovery they are not read yet; instead, an internal entry is created, and the file's inode and size are checked against the sincedb (a file-based record of previous activity on each inode). If the file is new, or its size has changed (up or down), the entry is marked for reading.
2. Based on max_open_files, up to N files are added to the set of files to be read (the window).
3. The input reads a chunk of up to 32 KB from each file in the window and gives the lines from that chunk to the file input's line-processing stages. When the chunk has been processed, the sincedb is updated.
4. When there are no more chunks and the close_older time for a file is reached, the file is closed and removed from the window set (but not from the discovered set). This frees up slots in the window and enables the waiting files to be read.
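All of the knobs mentioned above are exposed in the plugin config; a sketch of where they live (paths and values are illustrative):

```
input {
  file {
    path           => "/var/log/app/*.log"              # discovery glob
    sincedb_path   => "/var/lib/logstash/sincedb-app"   # per-inode record of read positions
    max_open_files => 4095                              # size of the window of open files
    close_older    => 3600                              # idle time before a file leaves the window
                                                        # (seconds; newer plugin versions also
                                                        # accept durations such as "1 hour")
  }
}
```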
Whether this is a problem depends on whether you are tailing files that are actively being written to, or attempting to ingest complete files (where repeated stat calls on the file return the same size as the first).
Tailing:
Tailed files usually grow in smallish jumps, and to process these increases responsively, all of the tailed files should be in the window set. This means you should increase max_open_files to be big enough for all the files you are tailing. It also means you should consider partitioning the files so that different Logstash instances can process them in parallel, as sketched below.
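A sketch of the partitioning idea, with each instance given a disjoint glob (paths and values are hypothetical):

```
# instance A
input {
  file {
    path           => "/var/log/app/[a-m]*.log"
    max_open_files => 10000   # large enough to keep every tailed file in the window
  }
}

# instance B
input {
  file {
    path           => "/var/log/app/[n-z]*.log"
    max_open_files => 10000
  }
}
```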
Read once (not tailing):
Here a file, not seen before, is now seen to have grown in one large (and sometimes very large) jump of megabytes. These files are put in the window set and must be read in full before they become eligible for removal (via close_older) from the window set. Note: eligibility is checked at the end of the window-set processing loop. This means that for very large files, a small max_open_files (window) and a small close_older setting are advised. It makes the file input more responsive in moving through the set of discovered files and, importantly, to the shutdown signal.
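For example, a read-once style configuration along these lines (values are illustrative and should be tuned to your file sizes):

```
input {
  file {
    path           => "/data/dumps/*.log"
    start_position => "beginning"   # read complete files from the start
    max_open_files => 8             # small window: only a few large files open at once
    close_older    => 5             # seconds: free the slot soon after a file is fully read
  }
}
```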
Further, if you are using the multiline codec, a smaller window translates into a smaller working set of multiline buffers, i.e. lower memory usage.
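A sketch combining the small window with the multiline codec (the pattern is illustrative):

```
input {
  file {
    path           => "/data/dumps/*.log"
    max_open_files => 8
    codec => multiline {
      pattern => "^\s"        # indented lines belong to the previous event
      what    => "previous"
    }
  }
}
```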