Debug Unmatched responses

Hi, I'm noticing a lot of http requests that aren't being sent through to logstash via packetbeat. When I stop packetbeat I notice a ton of http.unmatched_responses and tcp.dropped_because_of_gaps:

INFO Total non-zero values:  beat.info.uptime.ms=687480148 beat.memstats.gc_next=171292832 
beat.memstats.memory_alloc=115671728 beat.memstats.memory_total=62745787971712 
http.unmatched_responses=17974428 libbeat.config.module.running=0 
libbeat.output.events.acked=632178 libbeat.output.events.batches=224225 
libbeat.output.events.failed=341 libbeat.output.events.total=632519 
libbeat.output.read.bytes=1343814 libbeat.output.type=logstash 
libbeat.output.write.bytes=297294233 libbeat.output.write.errors=256 libbeat.pipeline.clients=13 
libbeat.pipeline.events.active=0 libbeat.pipeline.events.filtered=93530152 
libbeat.pipeline.events.published=632178 libbeat.pipeline.events.retry=722 
libbeat.pipeline.events.total=94162330 libbeat.pipeline.queue.acked=632178 
tcp.dropped_because_of_gaps=9308590
2018/02/14 16:51:51.724844 metrics.go:52: INFO Uptime: 190h58m0.486932728s

Configs are pretty close to the default. Any ideas on what could be causing this or how to approach debugging this issue?

You'll probably need to do some tuning if you are on the defaults. In particular, switch from the default pcap sniffer to af_packet. See https://www.elastic.co/guide/en/beats/packetbeat/6.2/configuration-interfaces.html
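A minimal sketch of what that switch might look like in packetbeat.yml (the device name and buffer size here are assumptions; adjust for your host and traffic volume):

packetbeat.interfaces.device: eth0        # assumption: your capture interface
packetbeat.interfaces.type: af_packet
packetbeat.interfaces.buffer_size_mb: 100 # illustrative value; raise it if you still see drops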

There was some good discussion in this topic and the ticket it links to: Performance of Packetbeat?

Thanks. I read through that, but don't think my situation has anything to do with an outdated kernel.

Our requests are generally larger than what you would probably typically expect (anywhere from 0-30mb) and responses could take multiple seconds. Would increasing buffer_size_mb directly help our situation? Or potentially tweaking our internal queue memory to flush more frequently? Is there anything else that could be tuned to not lose packets?

Did you switch to af_packet? I would try that with an increased buffer_size_mb value. I would expect this to help with tcp.dropped_because_of_gaps.

However, if an individual request or response is larger than 10 MB, Packetbeat will drop the transaction. You can increase the limit by setting max_message_size.
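Given request bodies up to ~30 MB, a sketch of that setting under the http protocol section (the exact value here is an assumption; size it to your largest expected message):

packetbeat.protocols:
- type: http
  ports: [9200]
  max_message_size: 31457280  # ~30 MiB; the default is 10485760 (10 MiB)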

If you enable debug for http you should see a log message anytime this happens.
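To turn that on, something like the following in packetbeat.yml should work (the selector limits debug output to the http module so the log stays readable):

logging.level: debug
logging.selectors: ["http"]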

Yep, I switched to af_packet and set the buffer_size_mb to 4028. I think I've discovered the issue to be the include_body_for...

I currently have this set under http protocol:

packetbeat.protocols:
- type: http
  # Configure the ports where to listen for HTTP traffic. You can disable
  # the HTTP protocol by commenting out the list of ports.
  ports: [9200]
  send_request: true
  send_response: false
  include_body_for: ["application/json", "x-www-form-urlencoded"]
  transaction_timeout: 5s

When I remove the include_body_for, I can see all traffic coming through with no issue. But as soon as I enable it, my performance tanks. I believe our response bodies are so large that Packetbeat can't keep up. Is there a way to include only the request body (I just want to see the incoming query for the Elasticsearch node where Packetbeat is installed)?

AFAICT there isn't a way to include only one or the other. That seems like a reasonable feature to have if you want to open an enhancement request.

I was also wondering if you have disabled flows, since you only want to log ES queries. Have you set packetbeat.flows.enabled: false? That should allow Packetbeat to set a narrower BPF filter.

I currently have flows enabled. This is mostly to handle the case where a really long-running query comes in. I don't want it to be stuck in the buffer for very long, so I have a fairly small timeout (3 s right now). Instead, I just want to log it and cap it at X seconds (behaving somewhat like a slow query log).

Is this the best way to do that? Or is there somewhere else to do this that would be more efficient?

If you are only interested in traffic on 9200 then I suggest using a custom BPF filter while still having flows enabled.

packetbeat.interfaces.bpf_filter: tcp port 9200

This will stop any other traffic from being processed by Packetbeat. This might help some.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.