Hi guys
Cluster configuration:
There are 3 data nodes hosted on Elastic Cloud.
The state below is given per node:
- ES version: 6.5.1
- Max heap size configured per node: 7.9 GB
- Heap used: 57-61%
- Max RAM size configured per node: 240 GB
- RAM used: 99%
- CPU (average): 5-10%
- Disk available: 386 GB
Our index uses custom routing for multi-tenancy, so all of the queried data is located on a single primary shard plus 1 replica.
The shard being queried holds 27,649,038 docs, and only one tenant is indexed on it.
The aggregation query below is based on the field collapse example from the docs: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-metrics-top-hits-aggregation.html#_field_collapse_example
Aggregation-search-query:
{
  "size": 0,
  "aggs": {
    "top_sites": {
      "terms": {
        "field": "taskId",
        "size": 10,
        "order": [{
            "top_hit": "desc"
          }
        ]
      },
      "aggs": {
        "top_hit": {
          "max": {
            "script": {
              "source": "_score"
            }
          }
        },
        "top_tags_hits": {
          "top_hits": {
            "size": 1
          }
        }
      }
    }
  },
  "query": {
    "bool": {
      "should": [{
          "match": {
            "all_copy_to_field": {
              "boost": 15.0,
              "query": "some sentence with several words",
              "fuzziness": "AUTO",
              "prefix_length": 3,
              "max_expansions": 10
            }
          }
        }, {
          "bool": {
            "should": [{
                "match": {
                  "field_1": {
                    "query": "some sentence with several words",
                    "fuzziness": "AUTO",
                    "prefix_length": 3,
                    "max_expansions": 10
                  }
                }
              }, {
                "match": {
                  "field_2": {
                    "query": "some sentence with several words",
                    "fuzziness": "AUTO",
                    "prefix_length": 3,
                    "max_expansions": 10
                  }
                }
              }, {
                "match": {
                  "field_3": {
                    "query": "some sentence with several words",
                    "fuzziness": "AUTO",
                    "prefix_length": 3,
                    "max_expansions": 10
                  }
                }
              }, {
                "match": {
                  "field_4": {
                    "query": "some sentence with several words",
                    "fuzziness": "AUTO",
                    "prefix_length": 3,
                    "max_expansions": 10
                  }
                }
              }, {
                "match": {
                  "field_5.shingle": {
                    "query": "some sentence with several words"
                  }
                }
              }, {
                "match": {
                  "field_5": {
                    "query": "some sentence with several words",
                    "fuzziness": "AUTO",
                    "prefix_length": 3,
                    "max_expansions": 10
                  }
                }
              }
            ]
          }
        }
      ],
      "filter": [{
          "term": {
            "communityId": {
              "value": 1
            }
          }
        }
      ],
      "minimum_should_match": 1
    }
  }
}
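In case it is easier to reproduce, here is a minimal Python sketch that builds the same request body, since the fuzzy match clause is repeated per field. It uses only the standard library; the helper names fuzzy_match and build_query are hypothetical and only for illustration, and this is not necessarily how our application assembles the query.

```python
import json

QUERY_TEXT = "some sentence with several words"


def fuzzy_match(field, text, boost=None):
    """Build one fuzzy match clause with the settings used in the query above.

    Hypothetical helper, for illustration only.
    """
    params = {}
    if boost is not None:
        params["boost"] = boost
    params.update({
        "query": text,
        "fuzziness": "AUTO",
        "prefix_length": 3,
        "max_expansions": 10,
    })
    return {"match": {field: params}}


def build_query(text, community_id):
    """Assemble the full request body: terms agg ordered by max _score, plus top_hits."""
    # Per-field clauses, in the same order as in the query above.
    per_field = [fuzzy_match(f, text) for f in ("field_1", "field_2", "field_3", "field_4")]
    per_field.append({"match": {"field_5.shingle": {"query": text}}})  # no fuzziness here
    per_field.append(fuzzy_match("field_5", text))

    return {
        "size": 0,
        "aggs": {
            "top_sites": {
                "terms": {
                    "field": "taskId",
                    "size": 10,
                    "order": [{"top_hit": "desc"}],
                },
                "aggs": {
                    "top_hit": {"max": {"script": {"source": "_score"}}},
                    "top_tags_hits": {"top_hits": {"size": 1}},
                },
            }
        },
        "query": {
            "bool": {
                "should": [
                    fuzzy_match("all_copy_to_field", text, boost=15.0),
                    {"bool": {"should": per_field}},
                ],
                "filter": [{"term": {"communityId": {"value": community_id}}}],
                "minimum_should_match": 1,
            }
        },
    }


body = build_query(QUERY_TEXT, 1)
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of a _search request against the index, with the routing value for the tenant.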
Search performance stats:
Average ES server time for the aggregation search query ("took" response field): 110 ms
The same search query without the aggregation takes 50 ms on average.
So the questions are:
- Why is the aggregation query slow (110 ms)?
- Can it be improved?