Hi, everyone.
I'm having some trouble trying to set up Packetbeat on a Linux VM running RHEL 6.7 (Linux 2.6.32).
If I use the following settings:
packetbeat.interfaces.device: any
packetbeat.interfaces.type: af_packet
then Packetbeat crashes after not being able to allocate enough memory:
2017/07/26 11:47:11.648739 beat.go:339: CRIT Exiting: Initializing sniffer failed: Error creating sniffer: setsockopt packet_rx_ring: cannot allocate memory
Exiting: Initializing sniffer failed: Error creating sniffer: setsockopt packet_rx_ring: cannot allocate memory
As far as I can tell from what strace shows me, in this case it tries to allocate 3 contiguous blocks, 8MB each:
[pid 22966] 14:47:28 setsockopt(4, SOL_PACKET, PACKET_RX_RING, {block_size=8388608, block_nr=3, frame_size=65536, frame_nr=384}, 16) = -1 ENOMEM (Cannot allocate memory)
The system where I'm trying to run Packetbeat definitely has enough free memory and the fragmentation level is not that high (there are many free 4MB chunks according to /proc/buddyinfo).
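For reference, here is a minimal sketch of the arithmetic behind the setsockopt call above. It only restates the numbers strace printed; the 4 KiB page size is an assumption (not verified on this VM), and the function name ring_geometry is made up for illustration.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages, not verified on this VM


def ring_geometry(block_size, block_nr, frame_size, frame_nr):
    """Summarise a PACKET_RX_RING request (fields as printed by strace)."""
    pages_per_block = -(-block_size // PAGE_SIZE)  # round up to whole pages
    frames_per_block = block_size // frame_size
    return {
        "total_bytes": block_size * block_nr,
        "frames_per_block": frames_per_block,
        # sanity check: frames per block times block count should equal frame_nr
        "frame_count_consistent": frames_per_block * block_nr == frame_nr,
        "pages_per_block": pages_per_block,
        # smallest order n such that 2**n pages cover one block
        "order": (pages_per_block - 1).bit_length(),
    }


if __name__ == "__main__":
    # Values copied from the strace line
    print(ring_geometry(8388608, 3, 65536, 384))
```

Under the page-size assumption, each requested block is 2048 pages (order 11), which is twice the size of the 4MB chunks that /proc/buddyinfo reports as free, so having many of those chunks does not by itself show that an 8MB contiguous block is available.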
I've tried stopping the service I'm trying to monitor, dropping caches (echo 3 >/proc/sys/vm/drop_caches), compacting memory (echo 1 > /proc/sys/vm/compact_memory) and then running Packetbeat again. This led to the same error. Tuning packetbeat.interfaces.buffer_size_mb didn't help either.
Now, if I set packetbeat.interfaces.snaplen to 1514 (the network interface MTU is 1500), Packetbeat launches and starts capturing and processing traffic, but I'm seeing a lot of dropped TCP connections:
2017/07/26 11:46:23.885889 metrics.go:39: INFO Non-zero metrics in the last 5s: http.unmatched_responses=6 libbeat.publisher.published_events=16 tcp.dropped_because_of_gaps=70
2017/07/26 11:46:28.885942 metrics.go:39: INFO Non-zero metrics in the last 5s: http.unmatched_responses=4 libbeat.publisher.published_events=14 tcp.dropped_because_of_gaps=194
2017/07/26 11:46:33.886157 metrics.go:39: INFO Non-zero metrics in the last 5s: http.unmatched_responses=8 libbeat.publisher.published_events=15 tcp.dropped_because_of_gaps=122
2017/07/26 11:46:38.886054 metrics.go:39: INFO Non-zero metrics in the last 5s: http.unmatched_responses=6 libbeat.publisher.published_events=14 tcp.dropped_because_of_gaps=134
2017/07/26 11:46:43.885981 metrics.go:39: INFO Non-zero metrics in the last 5s: http.unmatched_responses=9 libbeat.publisher.published_events=25 tcp.dropped_because_of_gaps=117
This happens even with low CPU usage and large buffer_size_mb values.
I assume this has something to do with TCP segmentation offload.
Setting the snaplen parameter to 32767 helps, but it still drops connections because of gaps from time to time. Setting it any higher leads to Packetbeat refusing to start.
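A rough sketch of why 32767 might be the cutoff. Everything here is a guess inferred from the strace output, not taken from Packetbeat's source: it assumes 4 KiB pages, that the frame size is the snaplen rounded up to a page multiple, that a block always holds 128 frames (the ratio seen in strace), and that 65535 is the default snaplen (chosen because it reproduces the frame_size=65536 shown above). The name assumed_block_size is hypothetical.

```python
PAGE_SIZE = 4096        # assumption: 4 KiB pages
FRAMES_PER_BLOCK = 128  # ratio seen in the strace output (8388608 / 65536)


def assumed_block_size(snaplen):
    """Guess the ring block size in bytes for a snaplen of at least one page.

    Assumes frame size = snaplen rounded up to a page multiple. This rule is
    inferred from the strace output, not from Packetbeat's source.
    """
    frame_size = -(-snaplen // PAGE_SIZE) * PAGE_SIZE
    return frame_size * FRAMES_PER_BLOCK


if __name__ == "__main__":
    for snaplen in (65535, 32767):
        mib = assumed_block_size(snaplen) // (1024 * 1024)
        print(snaplen, mib, "MiB per block")
```

If these assumptions hold, a snaplen of 32767 yields 4 MiB blocks, the same size as the largest chunks I see free in /proc/buddyinfo, while the default yields the 8 MiB blocks that fail to allocate. That would also be consistent with buffer_size_mb having no effect, since in this model it would change only the number of blocks and not their size.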
I have another machine with CentOS 7 (Linux 3.10.0) and Packetbeat runs fine there with the default capture length.
Has anybody had any success running Packetbeat with af_packet and the default capture length on 2.4/2.6 kernels?