Packetbeat uses too much memory when monitoring Cassandra


(Ymyjohnny) #1

When I use Packetbeat to monitor Cassandra, the Packetbeat process uses too much memory.

After a while, the Packetbeat process dies.

The version is 5.1.2.

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
32614 root 20 0 27.0g 25g 10m S 16.0 20.4 3:02.82 /usr/share/packetbeat/bin/packetbeat -c /etc/packetbeat/packetbeat.yml -path.home /usr/share/packetbeat -path.config /etc/packetbeat -path.data

My config:

packetbeat.interfaces.device: any
packetbeat.interfaces.buffer_size_mb: 300
packetbeat.flows:
  timeout: 30s
  period: 10s
packetbeat.protocols.icmp:
  enabled: true
packetbeat.protocols.cassandra:
  ports: [9042]
output.elasticsearch:
  hosts: ["192.168.32.206"]


(Steffen Siering) #2

Do you have some Packetbeat logs and/or a pcap file for testing? I have no idea whether this is a potential leak or incomplete transactions hanging in memory due to packet loss (have you tried the af_packet sniffer?).


(Ymyjohnny) #3

My log:

2017-01-24T11:31:05+08:00 WARN Response from unknown transaction. Ignoring. map[stream:19776 op:RESULT length:4 version:3 flags:Default]
2017-01-24T11:31:05+08:00 WARN Response from unknown transaction. Ignoring. map[version:3 flags:Default stream:7296 op:RESULT length:248]
2017-01-24T11:31:05+08:00 WARN Response from unknown transaction. Ignoring. map[op:RESULT length:4 version:3 flags:Default stream:4992]
2017-01-24T11:31:05+08:00 WARN Response from unknown transaction. Ignoring. map[version:3 flags:Default stream:23360 op:RESULT length:4]
2017-01-24T11:31:05+08:00 WARN Response from unknown transaction. Ignoring. map[version:3 flags:Default stream:32064 op:RESULT length:4]
2017-01-24T11:31:05+08:00 WARN Response from unknown transaction. Ignoring. map[stream:23040 op:RESULT length:4 version:3 flags:Default]
2017-01-24T11:31:05+08:00 WARN Response from unknown transaction. Ignoring. map[version:3 flags:Default stream:19840 op:RESULT length:4]
2017-01-24T11:31:05+08:00 WARN Response from unknown transaction. Ignoring. map[version:3 flags:Default stream:22208 op:RESULT length:4]
2017-01-24T11:31:05+08:00 WARN Response from unknown transaction. Ignoring. map[flags:Default stream:29376 op:RESULT length:4 version:3]
2017-01-24T11:31:05+08:00 WARN Response from unknown transaction. Ignoring. map[version:3 flags:Default stream:22144 op:RESULT length:4]
2017-01-24T11:31:05+08:00 WARN Response from unknown transaction. Ignoring. map[length:4 version:3 flags:Default stream:30464 op:RESULT]
2017-01-24T11:31:05+08:00 WARN Response from unknown transaction. Ignoring. map[op:RESULT length:4 version:3 flags:Default stream:4480]
2017-01-24T11:31:05+08:00 WARN Response from unknown transaction. Ignoring. map[version:3 flags:Default stream:8576 op:RESULT length:4]
2017-01-24T11:31:05+08:00 WARN Response from unknown transaction. Ignoring. map[length:4 version:3 flags:Default stream:25792 op:RESULT]
2017-01-24T11:31:13+08:00 INFO Non-zero metrics in the last 30s: libbeat.publisher.messages_in_worker_queues=8251 libbeat.publisher.published_events=17254 libbeat.es.published_and_acked_events=17234 libbeat.es.call_count.PublishEvents=352 libbeat.es.publish.read_bytes=176526 libbeat.es.publish.write_bytes=10844063 tcp.dropped_because_of_gaps=2029
2017-01-24T11:31:41+08:00 INFO Non-zero metrics in the last 30s: libbeat.es.published_and_acked_events=27 libbeat.es.publish.read_bytes=429 libbeat.es.call_count.PublishEvents=1 libbeat.es.publish.write_bytes=18371 libbeat.publisher.messages_in_worker_queues=4 libbeat.publisher.published_events=2051
2017-01-24T11:31:41+08:00 ERR Failed to perform any bulk index operations: Post http://192.168.32.206:9200/_bulk: write tcp 192.168.32.177:60562->192.168.32.206:9200: write: connection reset by peer
2017-01-24T11:31:41+08:00 INFO Error publishing events (retrying): Post http://192.168.32.206:9200/_bulk: write tcp 192.168.32.177:60562->192.168.32.206:9200: write: connection reset by peer
2017-01-24T11:31:42+08:00 INFO Connected to Elasticsearch version 5.1.2
2017-01-24T11:31:42+08:00 INFO Trying to load template for client: http://192.168.32.206:9200
2017-01-24T11:31:42+08:00 INFO Template already exists and will not be overwritten.
2017-01-24T11:31:56+08:00 ERR Failed to perform any bulk index operations: Post http://192.168.32.206:9200/_bulk: write tcp 192.168.32.177:60719->192.168.32.206:9200: write: connection reset by peer
2017-01-24T11:31:56+08:00 INFO Error publishing events (retrying): Post http://192.168.32.206:9200/_bulk: write tcp 192.168.32.177:60719->192.168.32.206:9200: write: connection reset by peer
2017-01-24T11:31:58+08:00 INFO Connected to Elasticsearch version 5.1.2
2017-01-24T11:31:58+08:00 INFO Trying to load template for client: http://192.168.32.206:9200
2017-01-24T11:31:58+08:00 INFO Template already exists and will not be overwritten.
2017-01-24T11:32:10+08:00 ERR Failed to perform any bulk index operations: Post http://192.168.32.206:9200/_bulk: write tcp 192.168.32.177:60720->192.168.32.206:9200: write: connection reset by peer
2017-01-24T11:32:10+08:00 INFO Error publishing events (retrying): Post http://192.168.32.206:9200/_bulk: write tcp 192.168.32.177:60720->192.168.32.206:9200: write: connection reset by peer
2017-01-24T11:32:11+08:00 INFO Non-zero metrics in the last 30s: libbeat.es.publish.write_errors=3 libbeat.es.publish.write_bytes=448308 libbeat.es.call_count.PublishEvents=2 libbeat.es.publish.read_bytes=1096


(Steffen Siering) #4

There are a number of TCP connections being dropped and incomplete due to packet loss:

tcp.dropped_because_of_gaps=2029

Please try the af_packet sniffer, as it is somewhat more performant than the default sniffer.

Also, it seems you're having some problems with the output connection getting closed by ES (or a proxy). If the output gets stuck, a number of transactions are kept in queues.
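
To track the packet-loss counter over a longer log, the periodic metrics lines can be parsed. The following is only a rough sketch, assuming the "Non-zero metrics" line format shown in the log above; the function name parse_metrics is made up for illustration and is not part of Packetbeat.

```python
import re

# Assumption: metrics lines contain this marker, followed by
# space-separated name=value pairs, as in the log excerpt above.
METRICS_MARKER = "Non-zero metrics"


def parse_metrics(line):
    """Return {metric name: int value} for one metrics log line.

    Lines without the marker yield an empty dict.
    """
    _, found, tail = line.partition(METRICS_MARKER)
    if not found:
        return {}
    # Metric names consist of word characters and dots; values are integers.
    return {name: int(value) for name, value in re.findall(r"([\w.]+)=(\d+)", tail)}


if __name__ == "__main__":
    sample = (
        "2017-01-24T11:31:13+08:00 INFO Non-zero metrics in the last 30s: "
        "libbeat.publisher.published_events=17254 tcp.dropped_because_of_gaps=2029"
    )
    metrics = parse_metrics(sample)
    print(metrics.get("tcp.dropped_because_of_gaps", 0))
```

Summing tcp.dropped_because_of_gaps across all metrics lines gives a quick before/after comparison when switching sniffers.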


(Ymyjohnny) #5

I have changed the config, but memory usage is now 35G.

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
77389 root 20 0 39.3g 35g 30m S 99.9 14.4 2:27.86 /usr/share/packetbeat/bin/packetbeat -c /etc/packetbeat/packetbeat.yml -path.home /usr/share/packetbeat -path.config /etc/packetbeat -path.dat

packetbeat.interfaces.device: any
packetbeat.interfaces.snaplen: 1514
packetbeat.interfaces.type: af_packet
packetbeat.interfaces.buffer_size_mb: 100


(Steffen Siering) #6

Have you tried to monitor memory usage over time? How long does it take until RSS becomes big? (Ignore virtual memory: the runtime just reserves a huge virtual address space on startup, which is not a real allocation.)
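
One way to record RSS over time on Linux is to read the VmRSS field from /proc/<pid>/status at a fixed interval. This is a minimal sketch under that assumption (Linux only); the helper names parse_vm_rss_kb and sample_rss are hypothetical, and the PID in the usage note is simply the one from the top output earlier in this thread.

```python
import time


def parse_vm_rss_kb(status_text):
    """Extract the VmRSS value (in kB) from the text of /proc/<pid>/status.

    Returns None if the field is absent (e.g. for kernel threads).
    """
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            # Line looks like: "VmRSS:     123456 kB"
            return int(line.split()[1])
    return None


def sample_rss(pid, samples, interval_s, sleep=time.sleep):
    """Return a list of (unix_time, rss_kb) readings for the given PID."""
    readings = []
    for i in range(samples):
        with open(f"/proc/{pid}/status") as f:
            readings.append((time.time(), parse_vm_rss_kb(f.read())))
        if i < samples - 1:
            sleep(interval_s)
    return readings
```

For example, sample_rss(77389, samples=60, interval_s=10) would take one reading every 10 seconds for about 10 minutes, which should be enough to see how quickly RSS grows after startup.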

Have you checked your logs to see whether you're still having problems with Elasticsearch?

You can enable profiling output in Packetbeat by starting it with -httpprof localhost:6060 (binding to localhost ensures the endpoint is not reachable from outside; use -httpprof :6060 if you want to enable remote debugging). With the profiling endpoint enabled, you can collect a sample of memory allocations via curl http://localhost:6060/debug/pprof/heap?debug=1. This call does not add much overhead to Packetbeat. If you collect a few traces over time, it lets us see how memory usage develops.
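
Collecting the heap samples can be automated. The sketch below assumes Packetbeat was started with -httpprof localhost:6060 as described above and simply saves the output of the heap endpoint to numbered files; the function name collect_heap_profiles, the output file naming, and the interval are illustrative choices, not anything Packetbeat prescribes.

```python
import time
import urllib.request
from pathlib import Path

# Endpoint from the post above; only reachable if -httpprof is enabled.
HEAP_URL = "http://localhost:6060/debug/pprof/heap?debug=1"


def fetch_url(url):
    """Download the body of a URL as bytes."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return resp.read()


def collect_heap_profiles(out_dir, count, interval_s, url=HEAP_URL,
                          fetch=fetch_url, sleep=time.sleep):
    """Save `count` heap dumps to out_dir, waiting interval_s between them.

    Returns the list of written file paths (heap-000.txt, heap-001.txt, ...).
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for i in range(count):
        path = out / f"heap-{i:03d}.txt"
        path.write_bytes(fetch(url))
        paths.append(path)
        if i < count - 1:
            sleep(interval_s)
    return paths
```

Calling collect_heap_profiles("heap-dumps", count=5, interval_s=60) would produce five dumps one minute apart, which can then be attached to the thread for comparison.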

