Hello,
I am trying to debug some intermittent slowness in my cluster.
I send this simple term query, to a specific shard, which is local to the node. I also enabled the profile option, so that the profile output is available.
curl -s -X POST "10.2.3.42:9200/my_index/_search?preference=_shards:1|_only_local" -H 'Content-Type: application/json' --data-raw '{"profile": true, "query":{"bool":{"must":[{"term":{"MY_FIELD":"NON_EXISTENT_VALUE"}}],"must_not":[],"should":[]}},"from":0,"size":1,"sort":[],"aggs":{}}
I see in the output that it took is 2.4seconds to complete the search. However, the combined components of the profile output is under 1 millisecond. Where else could it have spent the remaining (~2.4seconds) time?
Thank you!
p.s. each shard is about 8 to 10 gb in size. Each node has 30GB RAM, and 16GB Xmx, and about 120 such shards per node. This query returns quickly most of the times, but some times (about once in 10 times), it takes 2 to 4 seconds. The disk is local to the node. There are 12 threads in the search pool, and no searches are happening when I run this search. (Indexing operations are happening in parallel, and is fine). There is no heavy GC going on.
{
"took": 2439,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 0,
"relation": "eq"
},
"max_score": null,
"hits": []
},
"profile": {
"shards": [
{
"id": "[Hou2tOyGS-iS8jSXAQFagw][my_index][1]",
"searches": [
{
"query": [
{
"type": "TermQuery",
"description": "MY_FIELD:NON_EXISTENT_VALUE",
"time_in_nanos": 248063,
"breakdown": {
"set_min_competitive_score_count": 0,
"match_count": 0,
"shallow_advance_count": 0,
"set_min_competitive_score": 0,
"next_doc": 0,
"match": 0,
"next_doc_count": 0,
"score_count": 0,
"compute_max_score_count": 0,
"compute_max_score": 0,
"advance": 0,
"advance_count": 0,
"score": 0,
"build_scorer_count": 30,
"create_weight": 245188,
"shallow_advance": 0,
"create_weight_count": 1,
"build_scorer": 2844
}
}
],
"rewrite_time": 3713,
"collector": [
{
"name": "CancellableCollector",
"reason": "search_cancelled",
"time_in_nanos": 10508,
"children": [
{
"name": "SimpleTopScoreDocCollector",
"reason": "search_top_hits",
"time_in_nanos": 3423
}
]
}
]
}
],
"aggregations": []
}
]
}
}