Almost 80% of the data is lost when I use the UDP input plugin for NetFlow data. Below is my configuration file.
input {
  udp {
    queue_size => 50000
    port => 9993
    type => "netflow"
    workers => 4
    codec => netflow { versions => [5] }
  }
}
I am not sure where I am going wrong. Previously I was using the default values for all of the plugin properties; after increasing some of the buffer sizes I saw some improvement.
Logstash is running on a 4-core machine and all 4 cores are showing 80-90% usage.
Is the Kafka output slowing down and causing packets to be dropped from the buffers, or is the UDP input configuration wrong? I am using Logstash 1.5.0.
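As an aside (an assumption on my part, not something covered in the thread): with a single UDP socket like this, drops usually happen when the kernel's socket receive buffer fills faster than the UDP workers can drain it, so it may be worth raising the OS limits alongside the plugin's queue_size. A minimal Linux sketch with purely illustrative values; whether the larger default actually takes effect depends on whether the plugin sets its own SO_RCVBUF:

# Illustrative only: allow and use a larger UDP socket receive buffer
sysctl -w net.core.rmem_max=16777216
sysctl -w net.core.rmem_default=16777216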
Thanks warkolm. We ran tcpdump to capture the UDP traffic on that machine, and when we compared the tcpdump capture with the data Logstash collected we discovered this data loss. We used the Logstash file output for this testing. Is there any other way to identify where exactly we are losing the data?
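One way to see whether the datagrams are being dropped before they ever reach Logstash (a suggestion, not something tried in this thread) is to watch the kernel's own UDP counters while traffic is flowing:

# Kernel-wide UDP stats: climbing "packet receive errors" / "receive buffer errors"
# means the socket buffer is overflowing before Logstash reads the packets.
netstat -su
# Per-socket drop counter: 9993 is 0x2709 in hex, which is how /proc/net/udp
# lists the local port; the last column is the drop count.
grep :2709 /proc/net/udp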
That is around 240k messages per second, which sounds like a lot for a single Logstash instance to handle. If you are only successfully capturing 20% of these events, you will likely need to spread the load across a larger number of Logstash instances.
Yep Christian, I thought the same, but we are currently listening on a single port for the UDP traffic.
How can I share it between two different Logstash instances?
I am not sure you can have multiple Logstash instances listening on the same port on a single host, but even if you could, you might be limited by the resources of the server. What does resource usage look like on the host when you are collecting traffic? Is there anything limiting throughput, e.g. CPU?
You might be able to scale out to multiple instances by using a load balancer able to handle UDP, or possibly even by setting up DNS round robin.
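To make the load balancer option concrete, here is a rough sketch of what a UDP-capable proxy in front of two Logstash instances could look like; it assumes nginx with the stream module, and the upstream addresses are placeholders rather than anything from this thread:

stream {
  upstream logstash_netflow {
    server 10.0.0.11:9993;   # placeholder Logstash instance 1
    server 10.0.0.12:9993;   # placeholder Logstash instance 2
  }
  server {
    listen 9993 udp;
    proxy_pass logstash_netflow;
    proxy_responses 0;       # NetFlow is one-way; no reply datagrams expected
  }
}

One trade-off to keep in mind is that Logstash will then see the proxy, not the original exporter, as the packet source unless the proxy is set up transparently.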
If it uses that amount of CPU for processing 20% of the traffic, you will need to get a host with more CPU (as that seems to be the limiting factor) or scale out.
It is quite possible that the Kafka output plugin is limiting throughput to some extent, but I am not sure exchanging it for some other output plugin would improve performance. Given the gap between the current throughput level and what is required, you will need to scale up and/or out.
Set your Kafka workers to 1 and see if that helps; you aren't going to get any more performance by having it larger than 1, due to the way Logstash handles parallelism. Using async mode will definitely drop messages if the buffer backs up. I'd also set queue.buffering.max.ms much higher, like 5000, as the current value is going to chew through CPU and could affect your throughput (too many small batches going out). Set your batch.num.messages to roughly 1/50th of your max, so 1000, to balance out the higher queue buffer max ms.
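As a rough illustration of how that advice might map onto the Logstash 1.5-era kafka output settings (the broker and topic values are placeholders, and the option names assume the old producer-based plugin of that period):

output {
  kafka {
    broker_list => "kafka01:9092"     # placeholder broker address
    topic_id => "netflow"             # placeholder topic name
    producer_type => "async"          # batching producer; can drop events if its queue backs up
    workers => 1                      # per the advice above
    queue_buffering_max_ms => 5000    # flush less often, so larger batches and less CPU
    batch_num_messages => 1000        # roughly 1/50th of the 50,000 queue, per the advice
  }
}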