I have a large, aging log-gathering system based on daily cron jobs and HDFS.
This system is not suitable for real-time analysis, so I decided to introduce Logstash.
My system produces 850 GB of logs each day.
That works out to almost 100K lines per second on average, and it exceeds 300K during working hours.
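The average-rate figure can be sanity-checked with some quick arithmetic; the ~100 bytes per line is a hypothetical average, not a measured value from my system:

```python
SECONDS_PER_DAY = 24 * 60 * 60   # 86,400
daily_bytes = 850 * 10**9        # 850 GB/day (decimal GB, an assumption)

avg_line_bytes = 100             # hypothetical average line size
lines_per_day = daily_bytes / avg_line_bytes
avg_eps = lines_per_day / SECONDS_PER_DAY

print(f"{avg_eps:,.0f} lines/sec")  # roughly 98K lines/sec on average
```

At ~100 bytes per line, 850 GB/day lands right around the 100K lines/sec average stated above.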
I built a test bench with decent hardware and measured how much log traffic it can handle (Filebeat -> Logstash -> WebHDFS).
After a lot of tuning and retries, 50K events per second (eps) was the best result I achieved in the test environment.
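For context, these are the kinds of `logstash.yml` knobs I experimented with; the values below are illustrative examples, not my exact configuration:

```yaml
# logstash.yml -- example tuning values, not a recommendation
pipeline.workers: 16        # typically set to the number of CPU cores
pipeline.batch.size: 2048   # events each worker collects before running filters/outputs
pipeline.batch.delay: 50    # ms to wait before flushing a partial batch
```

Raising workers and batch size helped up to a point, but throughput plateaued around 50K eps regardless.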
I found it reaches nearly 100K eps with the Generator input plugin and the File output plugin, but that is not a real-world scenario.
Are there any documented use cases of Logstash sustaining 300K+ eps?
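For reference, the synthetic benchmark that reached ~100K eps looked roughly like this (the message and count are placeholders, not my real log format):

```conf
# Synthetic pipeline: generator input -> file output, bypassing Filebeat and WebHDFS
input {
  generator {
    count   => 10000000                      # stop after 10M events
    message => "sample log line placeholder"  # hypothetical payload
  }
}
output {
  file {
    path => "/tmp/logstash-bench.log"
  }
}
```

Since this pipeline removes the network, Filebeat, and WebHDFS from the path, it mostly measures Logstash's internal throughput ceiling on this hardware.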