I am running a terms aggregation on a high-cardinality field, and it takes about 4 seconds, which is too long. If I restructure the query as a composite aggregation, it takes roughly a tenth of that time (~400 ms). Why might this be? Is the speed improvement inherent to how composite aggregations work, or is it due to some other feature of my composite aggregation that I could also apply to the original terms aggregation?
For context, my index contains 8 million documents, and there are nearly that many unique addresses; i.e., the cardinality of the `home_address1` field is in the millions.
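(For reference, the cardinality figure comes from a cardinality aggregation like the one below; the aggregation name `address_cardinality` is arbitrary, and note that this aggregation returns an approximate count.)

```
GET /<indexname>/_search
{
  "size": 0,
  "aggregations": {
    "address_cardinality": {
      "cardinality": {
        "field": "home_address1.keyword"
      }
    }
  }
}
```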
Here is the query that takes 4 seconds:
```
GET /<indexname>/_search
{
  "size": 600,
  "timeout": "60s",
  "query": {
    "bool": {
      "must": [
        { "match_all": { "boost": 1 } }
      ],
      "adjust_pure_negative": true,
      "boost": 1
    }
  },
  "track_total_hits": false,
  "aggregations": {
    "home_address1": {
      "terms": {
        "field": "home_address1.keyword",
        "size": 10,
        "order": { "_key": "asc" }
      }
    }
  }
}
```
And here is the query that takes ~400 ms:
```
GET /<indexname>/_search
{
  "size": 600,
  "timeout": "60s",
  "query": {
    "bool": {
      "must": [
        { "match_all": { "boost": 1 } }
      ],
      "adjust_pure_negative": true,
      "boost": 1
    }
  },
  "track_total_hits": false,
  "aggregations": {
    "home_address1": {
      "composite": {
        "size": 10,
        "sources": [
          {
            "home_address1": {
              "terms": {
                "field": "home_address1.keyword",
                "order": "asc"
              }
            }
          }
        ]
      }
    }
  }
}
```
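If it helps with diagnosis, either query can be profiled by adding `"profile": true` to the request body, e.g. (a sketch; the profile response is verbose and reports per-shard timing for the aggregation):

```
GET /<indexname>/_search
{
  "size": 0,
  "profile": true,
  "aggregations": {
    "home_address1": {
      "terms": {
        "field": "home_address1.keyword",
        "size": 10,
        "order": { "_key": "asc" }
      }
    }
  }
}
```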