'took' time fast, curl sometimes slow

ryans · February 16, 2024, 12:56am

I am using Elastic Cloud for my Elasticsearch instance. My issue is that sometimes (rarely) I'm seeing strange behavior where the curl time (round trip from my web server to Elastic Cloud) is taking 8+ seconds but the 'took' time in the response is approx. 45ms. It's not the same query every time and doesn't happen often (maybe once or twice a day out of millions of queries). Every time I run the query manually, it's super fast and the Profiler says it looks good.

This is the typical behavior, but about every 5 days, I get a spraying of these where it happens to every query I'm sending to Elastic at the same time. So I end up with about 50-60 queries that are slow in curl time but fast in 'took' time. I do an auto-retry and they work just fine.

My network guy has looked and said we are seeing 0 return packets, not even completing the 3-way handshake from the Elastic Cloud during these spraying events. We have a huge pipe and have no other network events at these times.

Does anyone have any ideas?

This sounds like the web server handling the API requests for Elasticsearch is getting overloaded, but I don't know all the pieces in the stack.

How can 'took' be fast and curl be slow?

Thanks in advance

stephenb · February 16, 2024, 1:13am

Hi @ryans

'took' is the actual query time within Elasticsearch: the time it takes to execute the query, from when Elasticsearch receives it until it is ready to return the results.

It is not the round-trip HTTP request/response time.

So with the above explanation, there are many reasons why curl can take longer. Unfortunately, in my experience intermittent network delays can be difficult to catch and diagnose.
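To make the gap visible per request, here is a minimal Python sketch (not an official Elastic tool) that times the whole HTTP round trip on the client and sets it next to the 'took' value from the response body. The function names `split_latency` and `timed_search` are made up for illustration, and the URL and headers you pass in are your own deployment's details, which I am not assuming anything about.

```python
import json
import time
import urllib.request


def split_latency(elapsed_ms: float, took_ms: float) -> dict:
    """Split a client-side round trip into query time and everything else."""
    return {
        "elapsed_ms": elapsed_ms,
        "took_ms": took_ms,
        # DNS, TCP/TLS setup, proxies, queuing, transfer: all outside 'took'.
        "outside_es_ms": max(elapsed_ms - took_ms, 0.0),
    }


def timed_search(url: str, body: dict, headers: dict, timeout: float = 10.0) -> dict:
    """POST a search request and report wall-clock time next to 'took'."""
    data = json.dumps(body).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=data,
        headers={"Content-Type": "application/json", **headers},
        method="POST",
    )
    start = time.perf_counter()
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.load(response)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    # 'took' is reported by Elasticsearch in milliseconds.
    return split_latency(elapsed_ms, payload["took"])
```

With the numbers from the original post, `split_latency(8045.0, 45.0)` attributes 8000 ms to everything outside query execution, which is the part 'took' can never show.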

And that is not to say it is all on your side. In Elastic Cloud there are some components in front of your Elasticsearch cluster.

There is an Edge Proxy...

But that said, if we were having repeated issues, I suspect we would be getting a number of alerts / calls...

ryans · February 16, 2024, 3:27pm

Thanks for the response. That was my understanding of what 'took' is, which is why I posted my predicament here.
My network guy says "not us", so I'm just trying to figure out the pieces in between the time that 'took' is calculated and the internet. You mentioned the Edge Proxy. Is there anything else?

stephenb · February 16, 2024, 3:33pm

Do a traceroute and look at all the hops in between... it could be any of them...
Intermittent network debugging is super hard...

If you can reproduce it, or you have captured it, Support can look on our side up to our edge, but I suspect there is a lot in between.
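One way to capture it is to time the connection phases separately, so a slow request can be pinned to DNS, the TCP handshake, or TLS. The sketch below uses only the Python standard library. `time_connection_phases` is a hypothetical helper name, and the host you pass in would be your own deployment endpoint. curl's `-w` option reports similar phases (`time_namelookup`, `time_connect`, `time_appconnect`, `time_starttransfer`, `time_total`) if you would rather stay with curl.

```python
import socket
import ssl
import time


def time_connection_phases(host, port=443, timeout=10.0, use_tls=True):
    """Time DNS lookup, TCP connect, and (optionally) the TLS handshake."""
    phases = {}

    start = time.perf_counter()
    family, socktype, proto, _, sockaddr = socket.getaddrinfo(
        host, port, type=socket.SOCK_STREAM
    )[0]
    phases["dns_ms"] = (time.perf_counter() - start) * 1000.0
    phases["address"] = sockaddr[0]

    sock = socket.socket(family, socktype, proto)
    sock.settimeout(timeout)
    try:
        # A timeout raised here means the TCP handshake never completed.
        start = time.perf_counter()
        sock.connect(sockaddr)
        phases["tcp_connect_ms"] = (time.perf_counter() - start) * 1000.0

        if use_tls:
            context = ssl.create_default_context()
            start = time.perf_counter()
            # wrap_socket performs the TLS handshake on a connected socket.
            sock = context.wrap_socket(sock, server_hostname=host)
            phases["tls_handshake_ms"] = (time.perf_counter() - start) * 1000.0
    finally:
        sock.close()
    return phases
```

Logging these values with a timestamp and the resolved address whenever a request is slow gives you something concrete to hand to your network team or to Support.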

ryans · February 16, 2024, 9:08pm

Thanks. Do you know of any pieces within the Elastic Cloud stack that could be investigated after the 'took' time is calculated?

stephenb · February 16, 2024, 9:56pm

Like I said, there is only one component that I have ever really looked at: the proxy that sits in front. But that is very highly monitored, as all our customer traffic passes through those proxies (there are many, distributed). If there are any latency issues, the team is usually right on it.

ryans · February 19, 2024, 1:59pm

Thanks. I thought you knew of some others based on your previous reply.

system · March 18, 2024, 1:59pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
