I'm trying to improve the performance of Logstash when ingesting about 22,500 Netflow records/sec.
I have read the general performance tuning tips here.
Is there any way to turn off the geoip plugin? Of the 22,500 flows/sec hitting Logstash, I only see about 7,000-8,000 reach Elasticsearch. I assume the rest are being dropped, since CPU usage (as reported by dstat) sits at about 95% and Java is the top CPU consumer.
I also thought about scaling Logstash horizontally, so I reviewed the information here, but it doesn't mention how that might be achieved. HAProxy seems to be out of the question as it doesn't support UDP. It seems like iptables could do it. Is there a best practice from the Elastic community anyone could point me to?
I'd also appreciate some general guidance on CPU/memory requirements for Logstash to process 25,000 flows/sec. My current VM seems to run out of puff at 8,000 flows/sec with 6 vCPUs and 16GB memory. On that basis, do I really need roughly 3.1x the current CPUs, i.e. about 19 vCPUs, to handle this load?!
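For reference, here is the back-of-the-envelope arithmetic behind that estimate, as a small Python sketch. It assumes throughput scales linearly with vCPU count, which may well not hold in practice, and the function name `vcpus_needed` is just something I made up for illustration.

```python
import math


def vcpus_needed(target_rate, observed_rate, current_vcpus):
    """Estimate the vCPUs needed to reach target_rate.

    Assumption: throughput scales linearly with vCPU count.
    Returns (scale_factor, vcpus_rounded_up).
    """
    scale = target_rate / observed_rate
    return scale, math.ceil(current_vcpus * scale)


# Figures from this post: 25,000 flows/sec target,
# ~8,000 flows/sec observed ceiling on 6 vCPUs.
scale, vcpus = vcpus_needed(25_000, 8_000, 6)
print(f"scale factor: {scale:.3f}x, estimated vCPUs: {vcpus}")
```

If the bottleneck turns out to be a single stage (for example the UDP input or the geoip lookup) rather than overall CPU, this linear estimate would not apply.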