Hello,
We are trying to identify the source of high latency in our searches to our Elasticsearch cluster. Elasticsearch returns one field called "took" which is the time a request takes within ES (i.e. discounting time spent in transit over the network) in milliseconds. We've been using the Profile API to get detailed timing information on the request, and we're seeing ES requests report a "took" time of 4000ms - 5000ms for ~20% of our queries. Looking at the information provided in the "profile" section, we see that actual query execution took 1.7ms.
As I understand it, took time = query time + fetch time.
To reduce the risk of high fetch times, we turned off _source
retrieval and added a routingID to our searches to ensure only 1 shard is ever involved. We are able to reproduce this issue from adhoc search requests with no other search traffic, so I don't believe it is caused by node thread contention or other scaling issues. We forced caching to be off at the request level but the same request still sometimes exhibits this behavior and at other times does not, executing in less than 10ms, so the issue is not deterministically tied to the search query.
I have two questions:
- Has anyone seen anything similar or have any ideas where else latency could be introduced?
- We've pretty much ruled out query time, and from our experiments fetch time seems good, but is there an API that explicitly gives us fetch time?
Sample search request output:
{
"took": 5458,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 111,
"relation": "eq"
},
"max_score": null,
"hits": []
},
"profile": {
"shards": [
{
"id": "[truncated][truncated][11]",
"searches": [
{
"query": [
{
"type": "BooleanQuery",
"time": "1.7ms",
"time_in_nanos": 1728701,
"breakdown": {
...truncated...
},
"children": [
...truncated...
]
}
],
"rewrite_time": 8160,
"collector": [
{
"name": "CancellableCollector",
"reason": "search_cancelled",
"time": "143.1micros",
"time_in_nanos": 143186,
"children": [
{
"name": "EarlyTerminatingCollector",
"reason": "search_count",
"time": "46.5micros",
"time_in_nanos": 46599
}
]
}
]
}
],
"aggregations": []
}
]
}
}
Elasticsearch version: 7.1