Hello,
I have been trying to get Logstash to process as many IPFIX (NetFlow v10) packets per second as possible. I have seen cases where Logstash users easily reached 40k events per second, or even up to 90k.
I have tried many combinations of settings: flush_size, workers (input workers), queue_size, options in logstash.yml, and sysctl.conf parameters. My current Logstash configuration is as follows; with it I can process approximately 5k events per second.
input {
  udp {
    port => 9995
    codec => netflow {
      versions => [10]
      target => "ipfix"
    }
    type => "ipfix"
    queue_size => 15000
    workers => 4
  }
}

filter {
  metrics {
    meter => "events"
    add_tag => "metric"
  }
}

output {
  if "metric" in [tags] {
    file {
      path => "/var/log/logstash/metrics.log"
      codec => line {
        format => "rate: %{[events][rate_1m]}"
      }
    }
  }
  elasticsearch {
    hosts => ["host1:9200", "host2:9200", "host3:9200"]
    index => "ipfix-%{+YYYY.MM.dd}"
    flush_size => 500
  }
}
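For completeness, the logstash.yml pipeline options I have been varying look roughly like this (the values shown are examples I have tried, not a recommendation):

```yaml
# logstash.yml (excerpt) – pipeline tuning knobs
pipeline.workers: 8       # defaults to the number of CPU cores
pipeline.batch.size: 250  # events each worker collects before filtering/output
pipeline.batch.delay: 5   # ms to wait before flushing an undersized batch
```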
I run the Logstash instances as virtual machines on VMware ESX hosts, each running Ubuntu Server 16.10 with 8 cores and 8 GB RAM. The Logstash heap size is set to 2 GB min and 4 GB max. The virtual machines are connected over 10 Gbit/s fiber using VMXNET3 adapters.
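The heap limits are set in Logstash's jvm.options file (a minimal excerpt; the exact path may differ depending on how Logstash was installed):

```
# /etc/logstash/jvm.options (excerpt) – initial and maximum heap
-Xms2g
-Xmx4g
```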
When I increase the number of flows per second, I start to notice packet drops:
root@logstash:/var/log/logstash# netstat -su | grep errors
85729629 packet receive errors
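The kernel-level UDP buffer settings I have been experimenting with are along these lines (example values only, not my exact settings and not a recommendation):

```
# /etc/sysctl.d/90-udp-buffers.conf (example values)
net.core.rmem_max = 33554432          # max receive buffer a socket may request (32 MB)
net.core.rmem_default = 33554432      # default socket receive buffer size
net.core.netdev_max_backlog = 10000   # packets queued off the NIC before the kernel drops
```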
I have done a lot of searching and tuning of kernel parameters and ethtool settings, but I cannot get rid of the drops. I have tried many things over a long period of time. What could I possibly be doing wrong?
Thanks a lot in advance!
-Gijs