I have a Logstash configuration that ingests a large number of JSON files from a directory.
The ingestion works perfectly fine, but only up to a certain number of JSON files.
For example, when I point the input at a folder with 400,000 JSON files, everything gets ingested without a problem. But when I point the input at a folder with 1,000,000 JSON files, nothing is ingested.
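A minimal sketch of the kind of configuration in question (the paths, codec, and read mode here are assumptions, not the actual config):

```conf
input {
  file {
    path => "/data/json_files/*.json"          # placeholder directory
    mode => "read"                             # assumption: read each file once
    codec => "json"                            # assumption: one JSON document per file
    file_completed_action => "log"
    file_completed_log_path => "/tmp/completed.log"
    sincedb_path => "/data/sincedb"
  }
}
output {
  elasticsearch { hosts => ["localhost:9200"] } # placeholder output
}
```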
I believe 7.10.2 would ship with a 4.2.x file input plugin. You can verify using
cd /usr/share/logstash; bin/logstash-plugin list --verbose logstash-input-file
I suggest you point Logstash at a folder with a million JSON files, wait a while, then get a thread dump (kill -3, or jstack, or whatever tool you prefer) and see what the runnable threads are doing. If you are unable to interpret the thread dump, then post it in a gist or put it on some other site where I can view the text, and I will see if I can tell what it is doing.
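The steps above look roughly like this (the process-lookup pattern and output path are assumptions; adjust them for your system):

```conf
# Find the Logstash JVM PID (assumes the process command line contains "logstash")
#   LS_PID=$(pgrep -f logstash | head -n 1)
#
# Option 1: ask the JVM to print a thread dump to Logstash's stdout/log
#   kill -3 "$LS_PID"
#
# Option 2: capture the dump to a file with jstack (ships with the JDK)
#   jstack "$LS_PID" > /tmp/logstash-threads.txt
#
# The RUNNABLE threads are the interesting ones
#   grep -B 1 -A 10 'RUNNABLE' /tmp/logstash-threads.txt
```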