Netflow High CPU utilization and poor throughput


(Jason) #1

I'm having an issue with Logstash 5.5.1 on a RHEL 7 VM (8 vCPU, 2 GB memory) running on vSphere 6 hosts with Xeon E5-2698 v4 CPUs.

I've installed the translate filter plugin and the Netflow codec plugin. The basic gist is that I'm ingesting Netflow and using the filter to enrich each record with one additional piece of information before sending it to Elasticsearch 5.x: instead of just having a number for IPPROTOCOL, I add the protocol name as well.

I've got one firewall sending Netflow to it (about 6.5 Mbit/s of Netflow traffic), and despite my best efforts Logstash is continually dropping UDP packets and CPU usage is very high. None of the settings I've adjusted has made things much better. Below are some configuration snippets and command output. I should note that Elasticsearch is practically idle, at maybe 23% CPU usage on a single vCPU (that machine has 8 vCPUs).

Any assistance is greatly appreciated.

netstat -suna

Udp:
267034 packets received
147276 packets to unknown port received.
254987 packet receive errors
130 packets sent
0 receive buffer errors
0 send buffer errors
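
To put the counters above in proportion, here is a minimal sketch that turns `netstat -su` output into a drop percentage. The function name `udp_drop_pct` is hypothetical, and it assumes the usual Linux meaning of the counters: "packets received" are datagrams delivered to a socket, and "packet receive errors" are datagrams that arrived for an open port but were not delivered.

```shell
# Hypothetical helper (not part of netstat): reads `netstat -su` text on stdin
# and prints the share of UDP datagrams for open ports that was not delivered.
udp_drop_pct() {
  awk '
    /^[A-Za-z]+:/ { in_udp = ($1 == "Udp:") }        # track the protocol section
    in_udp && /packets received$/      { ok  = $1 }  # delivered to a socket
    in_udp && /packet receive errors$/ { err = $1 }  # arrived but not delivered
    END {
      if (ok + err > 0) printf "%.1f\n", 100 * err / (ok + err)
      else print "n/a"
    }'
}

# Counters copied from the output above.
printf 'Udp:\n267034 packets received\n254987 packet receive errors\n' | udp_drop_pct
# prints 48.8
```

On a live host the same function can be fed directly: `netstat -su | udp_drop_pct`. Under the stated assumption, roughly half of the Netflow datagrams are being lost before Logstash reads them.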

netstat -neopa | grep udp

udp 202368 0 0.0.0.0:2055 0.0.0.0:* 995 35399 2813/java off (0.00/0/0)
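
The second column of that line is Recv-Q, the number of bytes waiting in the socket's receive queue. The sketch below pulls that value out and compares it with a buffer limit. Both function names are hypothetical, and the limit of 212992 bytes is an assumption (a common Linux default for `net.core.rmem_default`), not a value taken from this host.

```shell
# Hypothetical helper: print local address and Recv-Q for UDP sockets
# from `netstat -neopa`-style lines on stdin.
udp_recvq() {
  awk '$1 ~ /^udp/ { print $4, $2 }'
}

# Hypothetical helper: queue fill as a percentage of a buffer limit.
recvq_fill_pct() {
  awk -v q="$1" -v limit="$2" 'BEGIN { printf "%.0f\n", 100 * q / limit }'
}

echo 'udp 202368 0 0.0.0.0:2055 0.0.0.0:* 995 35399 2813/java off (0.00/0/0)' | udp_recvq
# prints: 0.0.0.0:2055 202368

# 212992 is an ASSUMED limit; read the real one with:
#   sysctl -n net.core.rmem_default
recvq_fill_pct 202368 212992
# prints: 95
```

If the real limit is close to the assumed one, the queue is nearly full, which would suggest the reader thread is not draining the socket fast enough.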

curl -XGET http://localhost:9600/_node/hot_threads?pretty=true&threads=4

[1] 3144

{
  "host" : "netflow1.blah.com",
  "version" : "5.5.1",
  "http_address" : "127.0.0.1:9600",
  "id" : "1fc0627e-1a59-4e72-9bde-2e7aed695a03",
  "name" : "netflow1.blah.com",
  "hot_threads" : {
    "time" : "2017-07-27T13:54:50-05:00",
    "busiest_threads" : 3,
    "threads" : [ {
      "name" : "Ruby-0-Thread-3",
      "percent_of_cpu_time" : 0.01,
      "state" : "timed_waiting",
      "path" : "/usr/share/logstash/vendor/bundle/jruby/1.9/gems/stud-0.0.22/lib/stud/task.rb:22",
      "traces" : [ "java.lang.Object.wait(Native Method)", "org.jruby.RubyThread.sleep(RubyThread.java:1002)", "org.jruby.RubyKernel.sleep(RubyKernel.java:803)" ]
    }, {
      "name" : "<udp.0",
      "percent_of_cpu_time" : 71.66,
      "state" : "runnable",
      "traces" : [ "java.lang.Throwable.getStackTraceElement(Native Method)", "java.lang.Throwable.getOurStackTrace(Throwable.java:827)", "java.lang.Throwable.getStackTrace(Throwable.java:816)" ]
    }, {
      "name" : "Ruby-0-Thread-25",
      "percent_of_cpu_time" : 0.0,
      "state" : "runnable",
      "path" : "/usr/share/logstash/vendor/bundle/jruby/1.9/gems/puma-2.16.0-java/lib/puma/reactor.rb:136",
      "traces" : [ "sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)", "sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)", "sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93)" ]
    } ]
  }
}
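
A note on the curl command that produced this output: the `[1] 3144` line is the shell reporting a background job, which means the unquoted `&` ended the command early and `threads=4` probably never reached Logstash. That would fit the reply reporting `busiest_threads` as 3 rather than 4. A minimal sketch of a quoted version follows; the function name `hot_threads_url` is hypothetical.

```shell
# Hypothetical helper: build the hot threads URL for a given thread count.
hot_threads_url() {
  printf 'http://localhost:9600/_node/hot_threads?pretty=true&threads=%s\n' "${1:-4}"
}

hot_threads_url 4
# prints: http://localhost:9600/_node/hot_threads?pretty=true&threads=4

# Quoting keeps the shell from treating "&" as "run in background":
#   curl -s -XGET "$(hot_threads_url 4)"
```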

cat /etc/logstash/jvm.options

-Xms1024m
-Xmx1536m
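
The Logstash documentation recommends setting the minimum and maximum heap to the same value so the heap is not resized at runtime. As a rough check, the sketch below extracts both flags from a jvm.options file and reports whether they match. The function name `heap_flags` is hypothetical.

```shell
# Hypothetical helper: reads jvm.options text on stdin and reports
# the -Xms and -Xmx values and whether they are equal.
heap_flags() {
  awk '
    /^-Xms/ { xms = substr($0, 5) }   # text after "-Xms"
    /^-Xmx/ { xmx = substr($0, 5) }   # text after "-Xmx"
    END { printf "xms=%s xmx=%s match=%s\n", xms, xmx, (xms == xmx ? "yes" : "no") }'
}

# Values copied from the file above.
printf '%s\n' '-Xms1024m' '-Xmx1536m' | heap_flags
# prints: xms=1024m xmx=1536m match=no
```

On the host itself: `heap_flags < /etc/logstash/jvm.options`. Note also that a 1536 MB maximum heap leaves little headroom on a VM with about 1.8 GB of RAM.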

cat /etc/logstash/conf.d/output.conf

output {
  elasticsearch {
    hosts => ["elasticsearch.blah.com"]
    index => "%{type}-%{+YYYY.MM.dd}"
  }
}

cat /etc/logstash/logstash.yml

pipeline.batch.size: 50000
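
For scale: Logstash keeps up to `pipeline.workers` times `pipeline.batch.size` events in flight at once, and the default batch size is 125. The sketch below does that arithmetic. It assumes 8 pipeline workers, which is not shown in this file but would be the default on an 8 vCPU machine. The function name `inflight_events` is hypothetical.

```shell
# Hypothetical helper: upper bound on in-flight events for a pipeline,
# given the number of workers and the batch size.
inflight_events() {
  echo $(( $1 * $2 ))
}

inflight_events 8 50000   # batch size from the file above; prints 400000
inflight_events 8 125     # default batch size; prints 1000
```

With a maximum heap of 1536 MB, several hundred thousand in-flight events may put real pressure on memory, so a much smaller batch size could be worth testing.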


(Jason) #2

cat /etc/logstash/conf.d/netflow.conf

input {
  udp {
    host => "0.0.0.0"
    port => 2055
    codec => netflow {
      versions => [5, 9, 10]
      netflow_definitions => "/etc/logstash/netflow.yaml"
    }
    type => "netflow"
    workers => 8
    queue_size => 8000
  }
}

filter {
  translate {
    field => "[netflow][protocol]"
    destination => "[netflow][protocolname]"
    dictionary => [
      "0","HOPOPT",
      "1","ICMP",
      "2","IGMP",
      "3","GGP",
      "4","IP-in-IP",
      "5","ST",
      "6","TCP",
      "7","CBT",
      "8","EGP",
      "9","IGP",
      "10","BBN-RCC-MON",
      "11","NVP-II",
      "12","PUP",
      "13","ARGUS",
      "14","EMCON",
      "15","XNET",
      "16","CHAOS",
      "17","UDP",
      "18","MUX",
      "19","DCN-MEAS",
      "20","HMP",
      "21","PRM",
      "22","XNS-IDP",
      "23","TRUNK-1",
      "24","TRUNK-2",
      "25","LEAF-1",
      "26","LEAF-2",
      "27","RDP",
      "28","IRTP",
      "29","ISO-TP4",
      "30","NETBLT",
      "31","MFE-NSP",
      "32","MERIT-INP",
      "33","DCCP",
      "34","3PC",
      "35","IDPR",
      "36","XTP",
      "37","DDP",
      "38","IDPR-CMTP",
      "39","TP++",
      "40","IL",
      "41","IPv6",
      "42","SDRP",
      "43","IPv6-Route",
      "44","IPv6-Frag",
      "45","IDRP",
      "46","RSVP",
      "47","GRE",
      "48","DSR",
      "49","BNA",
      "50","ESP",
      "51","AH",
      "52","I-NLSP",
      "53","SWIPE",
      "54","NARP",
      "55","MOBILE",
      "56","TLSP",
      "57","SKIP",
      "58","IPv6-ICMP",
      "59","IPv6-NoNxt",
      "60","IPv6-Opts",
      "62","CFTP",
      "64","SAT-EXPAK",
      "65","KRYPTOLAN",
      "66","RVD",
      "67","IPPC",
      "69","SAT-MON",
      "70","VISA",
      "71","IPCU",
      "72","CPNX",
      "73","CPHB",
      "74","WSN",
      "75","PVP",
      "76","BR-SAT-MON",
      "77","SUN-ND",
      "78","WB-MON",
      "79","WB-EXPAK",
      "80","ISO-IP",
      "81","VMTP",
      "82","SECURE-VMTP",
      "83","VINES",
      "84","TTP",
      "85","IPTM",
      "86","NSFNET-IGP",
      "87","DGP",
      "88","TCF",
      "89","EIGRP",
      "90","OSPF",
      "91","Sprite-RPC",
      "92","LARP",
      "93","MTP",
      "94","AX.25",
      "95","OS",
      "96","MICP",
      "97","SCC-SP",
      "98","ETHERIP",
      "99","ENCAP",
      "101","GMTP",
      "102","IFMP",
      "103","PNNI",
      "104","PIM",
      "105","ARIS",
      "106","SCPS",
      "107","QNX",
      "108","A/N",
      "109","IPComp",
      "110","SNP",
      "111","Compaq-Peer",
      "112","IPX-in-IP",
      "113","VRRP",
      "114","PGM",
      "116","L2TP",
      "117","DDX",
      "118","IATP",
      "119","STP",
      "120","SRP",
      "121","UTI",
      "122","SMP",
      "123","SM",
      "124","PTP",
      "125","IS-IS over IPv4",
      "126","FIRE",
      "127","CRTP",
      "128","CRUDP",
      "129","SSCOPMCE",
      "130","IPLT",
      "131","SPS",
      "132","PIPE",
      "133","SCTP",
      "134","FC",
      "135","RSVP-E2E-IGNORE",
      "136","Mobility Header",
      "137","UDPLite",
      "138","MPLS-in-IP",
      "139","manet",
      "140","HIP",
      "141","Shim6",
      "142","WESP",
      "143","ROHC"
    ]
  }
}


(Jason) #3

top - 14:06:10 up 33 min, 1 user, load average: 8.16, 7.96, 6.81
Tasks: 193 total, 1 running, 192 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.0 us, 0.4 sy, 78.7 ni, 20.9 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 1867280 total, 71224 free, 1406760 used, 389296 buff/cache
KiB Swap: 1047548 total, 1047548 free, 0 used. 256428 avail Mem

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
2813 logstash 39 19 7298132 1.172g 15884 S 631.9 65.8 157:29.29 java


(Jorrit Folmer CISSP) #4

This issue is tracked here: https://github.com/logstash-plugins/logstash-codec-netflow/issues/85

