Hello,
I have recently indexed 25M docs into 2 nodes (+1 master) of 5 shards x 3 Availability Zones. I'm running in AWS ES on large boxes with plenty of EBS storage; resource usage is low and average doc size is < 1KiB. I'm performing the following term query with profiling (it is also very slow without profiling); subscriber_ids is mapped as a keyword.
curl -X GET "host/index/_search?pretty&human=true" -H 'Content-Type: application/json' -d'
{
"profile": true,
"size": 0,
"query": {
"term": {
"subscriber_ids": {
"value": 17224548,
"boost": 1.0
}
}
}
}'
Total took is ~20 seconds. However, my profiling comes in like the below example, with only hundreds of microseconds of time, which seems to demonstrate that the actual query operations on ES are very fast but there's some other bottleneck. I'm guessing it could be the network? What's the best way to better understand and dive deeper into where the performance bottleneck is?
{
"id" : "[qgiqODOIQqOlWaprcK3KyQ][flink_groups][2]",
"searches" : [
{
"query" : [
{
"type" : "PointRangeQuery",
"description" : "subscriber_ids:[17224548 TO 17224548]",
"time" : "484.1micros",
"time_in_nanos" : 484190,
"breakdown" : {
"set_min_competitive_score_count" : 0,
"match_count" : 0,
"shallow_advance_count" : 0,
"set_min_competitive_score" : 0,
"next_doc" : 0,
"match" : 0,
"next_doc_count" : 0,
"score_count" : 0,
"compute_max_score_count" : 0,
"compute_max_score" : 0,
"advance" : 9917,
"advance_count" : 9,
"score" : 0,
"build_scorer_count" : 19,
"create_weight" : 908,
"shallow_advance" : 0,
"create_weight_count" : 1,
"build_scorer" : 473336
}
}
],
"rewrite_time" : 14461,
"collector" : [
{
"name" : "EarlyTerminatingCollector",
"reason" : "search_count",
"time" : "13.9micros",
"time_in_nanos" : 13992
}
]
}
],
"aggregations" : [ ]
}
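To put a number on the gap between took and the profiled time, one option is to add up the query, rewrite and collector times for each shard and compare the slowest shard against took. Below is a minimal Python sketch, assuming the full search response has already been parsed into a dict and that it has the usual profile.shards layout (each entry shaped like the shard output above). The function name profiled_vs_took is hypothetical, not part of any client library.

```python
def profiled_vs_took(response):
    """Return (took_ms, slowest_shard_ms) for a profiled search response.

    Assumes `response` is the parsed JSON of a search run with
    "profile": true, i.e. it has "took" (milliseconds) and
    "profile" -> "shards" -> "searches".
    """
    shard_ms = []
    for shard in response["profile"]["shards"]:
        nanos = 0
        for search in shard["searches"]:
            # Top-level query entries already include their children's time.
            nanos += sum(q["time_in_nanos"] for q in search["query"])
            nanos += search["rewrite_time"]
            nanos += sum(c["time_in_nanos"] for c in search["collector"])
        shard_ms.append(nanos / 1e6)  # nanoseconds -> milliseconds
    return response["took"], max(shard_ms, default=0.0)
```

If the slowest shard accounts for well under a millisecond while took is in the tens of seconds, the time is being spent somewhere the profiler does not measure, which is only a rough indication but narrows down where to look.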
Thanks!