Aggregation Search Query is slow

Hi guys

Cluster Configuration:

There are 3 data Nodes hosted on Elastic Cloud.

The figures below are per node:

  • ES version: 6.5.1

  • Max Heap size configured per Node: 7.9 GB
    Heap used: 57-61 %

  • Max RAM size configured per Node: 240 GB
    RAM used: 99 %

  • CPU (average): 5-10%

  • Disk available: 386 GB

Our index uses custom routing for multi-tenancy, so all of the queried data sits on a single primary shard plus 1 replica.

The shard being queried holds 27 649 038 docs, and only one tenant is indexed on it.

The aggregation query follows the field collapse example from the top hits aggregation docs: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-metrics-top-hits-aggregation.html#_field_collapse_example

Aggregation-search-query:

{
	"size": 0,
	"aggs": {
		"top_sites": {
			"terms": {
				"field": "taskId",
				"size": 10,
				"order": [{
						"top_hit": "desc"
					}
				]
			},
			"aggs": {
				"top_hit": {
					"max": {
						"script": {
							"source": "_score"
						}
					}
				},
				"top_tags_hits": {
					"top_hits": {
						"size": 1
					}
				}
			}
		}
	},
	"query": {
		"bool": {
			"should": [{
					"match": {
						"all_copy_to_field": {
							"boost": 15.0,
							"query": "some sentence with several words",
							"fuzziness": "AUTO",
							"prefix_length": 3,
							"max_expansions": 10
						}
					}
				}, {
					"bool": {
						"should": [{
								"match": {
									"field_1": {
										"query": "some sentence with several words",
										"fuzziness": "AUTO",
										"prefix_length": 3,
										"max_expansions": 10
									}
								}
							}, {
								"match": {
									"field_2": {
										"query": "some sentence with several words",
										"fuzziness": "AUTO",
										"prefix_length": 3,
										"max_expansions": 10
									}
								}
							}, {
								"match": {
									"field_3": {
										"query": "some sentence with several words",
										"fuzziness": "AUTO",
										"prefix_length": 3,
										"max_expansions": 10
									}
								}
							}, {
								"match": {
									"field_4": {
										"query": "some sentence with several words",
										"fuzziness": "AUTO",
										"prefix_length": 3,
										"max_expansions": 10
									}
								}
							}, {
								"match": {
									"field_5.shingle": {
										"query": "some sentence with several words"
									}
								}
							}, {
								"match": {
									"field_5": {
										"query": "some sentence with several words",
										"fuzziness": "AUTO",
										"prefix_length": 3,
										"max_expansions": 10
									}
								}
							}
						]
					}
				}
			],
			"filter": [{
					"term": {
						"communityId": {
							"value": 1
						}
					}
				}
			],
			"minimum_should_match": 1
		}
	}
}

Search performance stats:

Average ES server time for the aggregation search query ("took" response field): 110 ms

The same search query without the aggregation takes 50 ms on average.

So the questions are:

  1. Why is the aggregation query slow (110 ms)?

  2. Can it be improved?

Try removing the script.
Your result from top_tags_hits should already have the max score anyway, shouldn't it?

Yep, "max_score" is identical to the score of the top-scored document in a bucket. Could you provide the proper query?
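
One possible shape for such a query, offered as a sketch rather than a confirmed fix. A terms aggregation cannot order its buckets by a top_hits sub-aggregation, so simply deleting the scripted max would also mean dropping the "order" clause, and the buckets would then fall back to doc count ordering. An alternative that avoids both the script and the aggregation is the search-level "collapse" parameter (field collapsing), which returns the best-scoring hit per taskId, sorted by score. This assumes taskId is a single-valued keyword or numeric field with doc values. It does not return per-bucket doc counts, and the "should" clauses of the original query are abbreviated here to the first match clause only.

```json
{
  "size": 10,
  "collapse": {
    "field": "taskId"
  },
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "all_copy_to_field": {
              "boost": 15.0,
              "query": "some sentence with several words",
              "fuzziness": "AUTO",
              "prefix_length": 3,
              "max_expansions": 10
            }
          }
        }
      ],
      "filter": [
        {
          "term": {
            "communityId": {
              "value": 1
            }
          }
        }
      ],
      "minimum_should_match": 1
    }
  }
}
```

Each returned hit is then the top-scored document for its taskId, and its "_score" plays the role of the bucket's max score. Whether this actually beats 110 ms on this data would need to be measured.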
