Hey, first time posting here and looking for some understanding.
I have a Logstash config that is trying to read a directory containing over 100,000 files. I've run trace logs, and even with `sincedb_path` set to `/dev/null`, none of the files get processed. After every file I see `sincedbcollection - associate: unmatched`.
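In case it matters, I enabled the trace logs with a generic setting like the one below (nothing environment-specific in it):

```
# logstash.yml -- raise the log level so filewatch/sincedb activity shows up
log.level: trace
```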
Is there a limit to how many files Logstash can handle without a sincedb?
Unfortunately I cannot share any configs or logs, as they are work related.
If you cannot share the configs, it is hard to help you. Can you anonymize the fields that contain paths or other data that might point to your company or is otherwise confidential?
Are the files completely written, or are they still being written to?
Thanks for the reply. The current `mode` is set to `read`, `start_position` is `beginning`, and I've tried `ignore_older`, but it does not help. I know the path is correct, since it works with no issues on a smaller subset of the data. I am also using an XML filter; not sure if that could be the bottleneck?
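Here is an anonymized sketch of roughly what the pipeline looks like; the path, field names, and output are placeholders, but the options match what I described:

```
input {
  file {
    path => "/data/incoming/*.xml"   # placeholder; the real directory holds 100k+ files
    mode => "read"
    start_position => "beginning"
    sincedb_path => "/dev/null"      # no sincedb persistence
    ignore_older => 86400            # tried with and without this; value is a placeholder
  }
}

filter {
  xml {
    source => "message"
    target => "doc"                  # placeholder target field
  }
}

output {
  stdout { codec => dots }           # placeholder output used while testing
}
```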
Do you have monitoring enabled for Logstash? If so, you can check the throughput in Kibana under Stack Monitoring -> Pipelines -> your pipeline.
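If it is not enabled yet, one way (depending on your version) is the legacy self-monitoring collector in logstash.yml; the Elasticsearch host below is just a placeholder:

```
# logstash.yml -- legacy self-monitoring; ships pipeline metrics to Elasticsearch
xpack.monitoring.enabled: true
xpack.monitoring.elasticsearch.hosts: ["http://localhost:9200"]
```

Once metrics are flowing, the Pipelines page shows events in/out per stage, so you can see whether the XML filter is the slow part.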