Approach for receiving a large number of messages per second

Hey,

I am trying to set up Logstash to receive a large number of UDP messages per second, possibly up to 50k. Currently, for my test setup, Logstash and Elasticsearch are installed on my local machine (i5-4590S, 16 GB RAM), which receives messages over UDP from 3 different VMs on 3 different ports.

Each UDP message contains one of 6 different log message types, and each type is parsed by one of 6 config files using grok patterns.

I start Logstash with -w 16 and -b 1564. However, the rate of messages arriving in Elasticsearch caps out at 5.6k messages/second without any filters applied. With filters, it caps out at about 4.8k. CPU usage on all cores is around 50 to 80%, and memory usage is only 4.3 GB out of my 16 GB.
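To tell whether the cap comes from the senders or from the Logstash/Elasticsearch side, it can help to push a known number of datagrams at the UDP input and compare that with what arrives in Elasticsearch. Below is a minimal sketch of such a sender in Python; the function names, the payload and the address used in the usage note are placeholders of mine, not part of the setup described above.

```python
import socket
import time


def send_burst(host, port, count, payload=b"test log line"):
    """Send `count` UDP datagrams as fast as possible.

    Returns (datagrams_sent, seconds_elapsed). UDP gives no delivery
    guarantee, so this only measures the send side.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        start = time.perf_counter()
        for _ in range(count):
            sock.sendto(payload, (host, port))
        elapsed = time.perf_counter() - start
    finally:
        sock.close()
    return count, elapsed


def messages_per_second(count, seconds):
    """Average rate; returns 0.0 when no measurable time has elapsed."""
    return count / seconds if seconds > 0 else 0.0
```

Calling send_burst("127.0.0.1", 5000, 100000) (hypothetical address and port) and comparing the sent count with the number of documents indexed would show how many messages are lost between the socket and Elasticsearch.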

What do you guys think would be the best approach to handle that many messages/second with ELK?

You're only talking about the Logstash part of the equation here. How many data nodes do you have in your Elasticsearch cluster? Elasticsearch was meant to scale horizontally. More nodes = faster data ingest. You may just be saturating the ingest rate of your existing n-node cluster. Ways to improve this without adding nodes would be to increase disk I/O by using SSDs, or SSDs in RAID-0. At the end of the day, though, nothing will replace horizontal scaling.
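As a rough back-of-the-envelope figure, the numbers from the question can be turned into a node count. This is only a sketch: it assumes ingest scales linearly with the number of data nodes and that each node sits on its own hardware, which is optimistic. The function name is made up for illustration.

```python
import math


def nodes_needed(target_rate, per_node_rate):
    """Estimate data nodes required, ASSUMING perfectly linear scaling."""
    if per_node_rate <= 0:
        raise ValueError("per_node_rate must be positive")
    return math.ceil(target_rate / per_node_rate)


# 50k msg/s target against the observed single-node rates from the question.
print(nodes_needed(50_000, 5_600))  # without filters -> 9
print(nodes_needed(50_000, 4_800))  # with filters    -> 11
```

Real clusters rarely scale perfectly, so treat these as lower bounds rather than a sizing recommendation.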

I usually run a single node in my ES instance, to which I gave up to 8 GB of RAM. I have also tried up to 4 additional nodes and varying memory allocations, but the best performance so far was with that single node and 8 GB.
I will test horizontal scaling soon, but as I currently only work on my local machine, I thought it would be nice to see how far it could get and how best to approach that target.

According to the kopf plugin, the load average is at the limit, with a 1-minute average of 4.55. CPU, disk, and heap usage in kopf are pretty low, though.
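To put the 4.55 in context, a load average is usually read relative to the number of cores. A small sketch, assuming the i5-4590S exposes 4 logical cores (that count is my assumption, as are the function names); a value above 1.0 per core means more runnable tasks than cores on average.

```python
import os


def load_per_core(load_avg, cores):
    """Normalise a load average by the number of logical cores."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return load_avg / cores


def current_load_per_core():
    """1-minute load per core on this machine (Unix only: os.getloadavg)."""
    one_minute, _, _ = os.getloadavg()
    return load_per_core(one_minute, os.cpu_count() or 1)


# Figure reported by kopf, with the ASSUMED 4 logical cores.
print(round(load_per_core(4.55, 4), 2))  # 1.14
```

So on 4 cores the machine would be slightly oversubscribed, even though the CPU percentage shown in kopf looks low.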