Use case: DNS traffic is sent to Logstash and then Elasticsearch / Kibana.
Packetbeat is installed as a service on a dedicated server.
I am using tcpreplay to send 1 GB of DNS traffic (for testing purposes).
While the data is being sent, Packetbeat uses about 70% of memory (the VM has 2 GB of RAM).
Once the traffic has been sent, memory consumption does not decrease.
Sometimes the Packetbeat service even gets killed by the OS for using too much memory:
Here is an extract from the system log (/var/log/messages):
Dec 15 16:15:44 v-dns5 kernel: packetbeat invoked oom-killer: gfp_mask=0x200da, order=0, oom_adj=0, oom_score_adj=0
Dec 15 16:15:44 v-dns5 kernel: packetbeat cpuset=/ mems_allowed=0
Dec 15 16:15:44 v-dns5 kernel: Pid: 28466, comm: packetbeat Not tainted 2.6.32-358.6.2.el6.x86_64 #1
..
..
Dec 15 16:15:44 v-dns5 kernel: Out of memory: Kill process 28429 (packetbeat) score 768 or sacrifice child
Dec 15 16:15:44 v-dns5 kernel: Killed process 28429, UID 0, (packetbeat) total-vm:2898568kB, anon-rss:1500980kB, file-rss:4912kB
Dec 15 16:15:44 v-dns5 kernel: device eth1 left promiscuous mode
I am far from being an expert in these domains, so do not hesitate to ask me for more details; just give me the associated command lines if needed.
2 GB of RAM for 1 Gbps of DNS traffic doesn't sound that unreasonable to me, so you might just need more RAM for your use case.
For some background: Packetbeat stores each request until it sees the response for the same transaction. Because you used tcpreplay, it's likely that there were some packet drops or packet reordering on the network, which makes things worse: if we miss the response, the request is kept for 10 s. This timeout threshold is configurable (see here), so you could try reducing it to see if that helps.
Please also check the interface statistics (simply with ifconfig) to see whether the kernel reports any dropped packets.
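If ifconfig's output is hard to scan, the same RX drop counters can be read directly from /proc/net/dev on Linux; a quick one-liner (the column layout below assumes the standard /proc/net/dev format):

```shell
# Print the RX "drop" counter for every interface.
# In /proc/net/dev, the 4th receive column after the interface name is "drop".
awk -F'[: ]+' 'NR > 2 {print $2, "rx_dropped=" $6}' /proc/net/dev
```

A steadily growing drop counter on the capture interface while you replay traffic would explain requests whose responses Packetbeat never sees.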
The fact that Packetbeat doesn't release memory after you stop the traffic is normal: I think Go returns memory to the OS only under memory pressure, if even then.
1 GB/s of DNS is quite some data. How big is the pcap? Did you replay with '-t' (topspeed)? We parse each message and generate JSON text from it, which requires more memory than the original DNS message. I wonder whether indexing into Elasticsearch generates some back pressure here, forcing the internal queues/buffers to fill up. See this github ticket.
Have you tried reducing the transaction timeout? E.g. for testing, set it to 1 second.
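For reference, a minimal sketch of what that could look like in packetbeat.yml; the exact key layout varies between Packetbeat versions, so check the protocol configuration docs for yours:

```yaml
packetbeat.protocols.dns:
  ports: [53]
  # Drop unanswered requests after 1s instead of the default 10s,
  # so missed responses don't pin request state in memory for long.
  transaction_timeout: 1s
```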