Http responsetime unrealistic results

jonatanzafar59 · January 27, 2016, 6:35pm

Hi,

I am recording data from a number of Elasticsearch clusters. For some queries, I get huge, unrealistic response times (around 3 minutes).
When I run the same queries on the clusters directly, they return in about 100 ms.

Note: I didn't use the Packetbeat template file.
Can you please explain why this can happen? How does Packetbeat calculate the response time (is it the "took" field from the Elasticsearch response)?

tudor · January 27, 2016, 7:31pm

Packetbeat doesn't use the "took" field. Instead, it looks at the timestamp of the request and the timestamp of the response. At the moment it doesn't look into the payload at all, so it doesn't know that you have Elasticsearch running, just that there's an application using HTTP.

If you have the feeling that the values are not realistic, it could be that the request is matched with the response from another request. We call this a correlation problem. Causes could be packet drops or parsing errors.
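
To illustrate how a dropped packet can turn into a huge response time, here is a minimal sketch. It assumes a simplified model in which each response on a connection is paired with the oldest unanswered request; the function name `correlate` and the event format are made up for the example, and this is not Packetbeat's actual implementation.

```python
from collections import deque


def correlate(events):
    """Pair each response with the oldest unanswered request (FIFO).

    events: list of (timestamp_ms, kind) tuples seen on one connection,
    where kind is "req" or "resp".
    Returns a list of (request_ts, response_ts, responsetime_ms).
    """
    pending = deque()
    transactions = []
    for ts, kind in events:
        if kind == "req":
            pending.append(ts)
        elif kind == "resp" and pending:
            req_ts = pending.popleft()
            # Response time is the difference between the two timestamps.
            transactions.append((req_ts, ts, ts - req_ts))
    return transactions


# All packets captured: two requests, each answered after 100 ms.
complete = [(0, "req"), (100, "resp"), (180_000, "req"), (180_100, "resp")]

# Same traffic, but the sniffer missed the first response.
dropped = [(0, "req"), (180_000, "req"), (180_100, "resp")]

print(correlate(complete))
print(correlate(dropped))
```

In the second case the only response seen is paired with the first request, so a 100 ms query is reported as taking about 3 minutes, and the request and response bodies no longer belong together.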

One way to check for correlation issues is to configure packetbeat to store both the full request and the response (send_request: true, send_response: false and include_body_for: ["application/json"]), then for the transactions that have unrealistic times, check if the two seem to match.

jonatanzafar59 · January 28, 2016, 11:33am

Did you mean send_request: true, send_response: true?
I have this configured already, and include_body_for too.
I do see now that the request and the response don't match (I get a different response when running the query on Elasticsearch directly).

How can I solve this correlation problem?

If this problem has no instant solution, can I use Logstash or some other tool to parse the response from Elasticsearch and add the "took" field and other fields to the JSON object?

jonatanzafar59 · February 4, 2016, 11:31am

Tudor, can you please help me fix this bug?
I can't use this tool because of the above.

tudor · February 4, 2016, 11:53am

The next step would be to identify why the miscorrelation is happening. The most likely reason is that Packetbeat drops messages when reading from the network interface.

Some questions:

How often does the miscorrelation happen? Maybe try putting the affected transactions on a graph in Kibana (all transactions with responsetime > 30s) and see if they are clustered in some periods or randomly distributed.
What sniffer type do you use in Packetbeat? The default pcap, or af_packet? It would be good to post the full config.
How many requests per second is Packetbeat seeing?
If you run ifconfig eth0, where eth0 is the interface it is sniffing on, do you see any drops or errors?
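
For the last question, the same counters that ifconfig prints can be read from /proc/net/dev on Linux. The following is a rough sketch of a parser; the function names and the sample text are made up for the example. Note that these are interface-level counters, which may not be the same thing as drops inside the sniffer itself.

```python
def parse_net_dev(text):
    """Return {interface: {"rx_errs": int, "rx_drop": int}} from /proc/net/dev text."""
    stats = {}
    for line in text.splitlines()[2:]:  # the first two lines are headers
        if ":" not in line:
            continue
        name, data = line.split(":", 1)
        fields = data.split()
        # Receive columns: bytes packets errs drop fifo frame compressed multicast
        stats[name.strip()] = {
            "rx_errs": int(fields[2]),
            "rx_drop": int(fields[3]),
        }
    return stats


def read_net_dev(path="/proc/net/dev"):
    """Read and parse the live counters (Linux only)."""
    with open(path) as f:
        return parse_net_dev(f.read())


# Made-up sample in the /proc/net/dev layout, so the parser can be tried anywhere.
SAMPLE = """Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
    lo:    1000      10    0    0    0     0          0         0     1000      10    0    0    0     0       0          0
  eth0: 5000000    4000    2   17    0     0          0         0  3000000    3500    0    0    0     0       0          0
"""

print(parse_net_dev(SAMPLE)["eth0"])
```

Sampling these counters periodically and comparing them against the times of the bad transactions is one way to see whether the drops and the miscorrelations line up.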

tudor · February 4, 2016, 11:55am

You could try it like that (parsing the response with Logstash), but it probably won't be very easy.

Depending on what you are trying to accomplish, the slow query log and Marvel might also be helpful.

jonatanzafar59 · February 5, 2016, 10:32pm

Hi Tudor, Thank you for the answer.

This occurs at random times (I checked the transactions with a negative response time, which I guess come from the same root cause).
The config:

interfaces:
  device: any

protocols:
  http:
    ports: [9200]
    send_request: true
    send_response: true
    include_body_for: ["text/html", "application/json"]

  mysql:
    ports: [3306]

  mongodb:
    ports: [27017, 27019]
    send_request: true
    send_response: true

output:
  redis:
    enabled: true
    host: "redishostname"
    port: 6379

I don't know the request rate for sure, because this is still in a test environment and I deleted yesterday's indices.
I do see drops on some servers, BUT when I filter on responsetime < 0 and visualize the results in Kibana split across the servers, I see that most of the transactions came from servers without any drops.

These drops worry me. Does Packetbeat have anything to do with them? Those servers have the same hardware as production, and with the exact same traffic I get drops on test and not on prod.

jonatanzafar59 · February 10, 2016, 12:18am

Tudor, what should we do about it?

I used Logstash to parse the response, and use the took data as a new field, but I have a lot of transactions where the request doesn't match the response.
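
A rough sketch of the same idea in Python is shown here: pull "took" out of the captured response and flag events where the measured time is negative or far above it. It assumes each event is a dict with a `responsetime` field in milliseconds and a `response` field holding the raw HTTP response text; the helper names and the `slack_ms` threshold are made up for the example.

```python
import json


def extract_took(raw_response):
    """Return the Elasticsearch "took" value (ms) from a raw HTTP response, or None."""
    # HTTP headers and body are separated by an empty line.
    _, sep, body = raw_response.partition("\r\n\r\n")
    if not sep:
        return None
    try:
        doc = json.loads(body)
    except ValueError:
        # Not JSON (or a chunked/compressed body this sketch does not handle).
        return None
    took = doc.get("took") if isinstance(doc, dict) else None
    return took if isinstance(took, int) else None


def looks_miscorrelated(event, slack_ms=1000):
    """Flag events whose measured time is negative or far above "took"."""
    rt = event["responsetime"]
    if rt < 0:
        return True
    took = extract_took(event.get("response", ""))
    # "took" excludes network and serialization time, hence the slack.
    return took is not None and rt > took + slack_ms


raw = 'HTTP/1.1 200 OK\r\nContent-Type: application/json\r\n\r\n{"took": 98, "timed_out": false}'
print(extract_took(raw))
print(looks_miscorrelated({"responsetime": 180100, "response": raw}))
```

This only detects suspicious pairs. If the response was matched to the wrong request, the "took" value itself belongs to a different query, so it cannot be used to repair the event.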

steffens · February 10, 2016, 3:44pm

Is there a chance to get a trace with raw network packets? I'd like to have a look and see if we can improve correlation.

jonatanzafar59 · February 11, 2016, 11:29am

I can't get the production traffic out.
Ask me anything, I really need the traffic to correlate correctly.

jonatanzafar59 · February 15, 2016, 5:43pm

The issue seems (!) to be resolved after upgrading (Packetbeat's) Elasticsearch to version 2.2.0 (not sure why this is related).
Edit: this did not solve the problem, just mitigated it.

Rachit_Puri · April 22, 2016, 12:42pm

Any updates on this problem? I see a response time of 0 or 1 for most of my traffic.

harshafrnd4u · June 28, 2016, 11:22am

I also see HTTP response times of either 0 or 1. Why are there no decimals? Is it rounding off?

Akashi_Seih · January 4, 2017, 1:12pm

I have the same problem. The response time for queries is mostly 0. Is this a bug?

harshafrnd4u · January 4, 2017, 1:42pm

It's been a long time since I tested this, and I couldn't solve it at that time. I didn't dig deep into it, so I'm not sure if it is a bug.

harshafrnd4u · March 17, 2017, 11:28am

Hi, were you able to figure out a solution to this problem?

harshafrnd4u · March 17, 2017, 11:33am

@Akashi_Seih @jonatanzafar59 @Rachit_Puri Hi, can you please help me if any of you has figured out the solution to this problem?
