I am running a terms aggregation on a high-cardinality field, and it takes about 4 seconds, which is too long. If I restructure the query as a composite aggregation, it takes roughly a tenth of that time (~400 ms). Why might this be? Is the speed improvement inherent to how composite aggregations work, or is it due to some other feature of my composite aggregation that I could also apply to the original terms aggregation?
For context, my index contains 8 million documents, and there are nearly that many unique addresses; i.e., the cardinality of the `home_address1` field is in the millions.
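(For reference, the cardinality figure comes from a cardinality aggregation like the one below; the aggregation name `address_cardinality` is arbitrary, and note that this aggregation returns an approximate count.)

```
GET /<indexname>/_search
{
  "size": 0,
  "aggregations": {
    "address_cardinality": {
      "cardinality": {
        "field": "home_address1.keyword"
      }
    }
  }
}
```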
Here is the query that takes 4 seconds:
```
GET /<indexname>/_search
{
  "size": 600,
  "timeout": "60s",
  "query": {
    "bool": {
      "must": [
        { "match_all": { "boost": 1 } }
      ],
      "adjust_pure_negative": true,
      "boost": 1
    }
  },
  "track_total_hits": false,
  "aggregations": {
    "home_address1": {
      "terms": {
        "field": "home_address1.keyword",
        "size": 10,
        "order": { "_key": "asc" }
      }
    }
  }
}
```
And here is the query that takes ~400 ms:
```
GET /<indexname>/_search
{
  "size": 600,
  "timeout": "60s",
  "query": {
    "bool": {
      "must": [
        { "match_all": { "boost": 1 } }
      ],
      "adjust_pure_negative": true,
      "boost": 1
    }
  },
  "track_total_hits": false,
  "aggregations": {
    "home_address1": {
      "composite": {
        "size": 10,
        "sources": [
          {
            "home_address1": {
              "terms": {
                "field": "home_address1.keyword",
                "order": "asc"
              }
            }
          }
        ]
      }
    }
  }
}
```
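If it helps with diagnosis, either query can be profiled by adding `"profile": true` to the request body, e.g. (a sketch; the profile response is verbose and reports per-shard timing for the aggregation):

```
GET /<indexname>/_search
{
  "size": 0,
  "profile": true,
  "aggregations": {
    "home_address1": {
      "terms": {
        "field": "home_address1.keyword",
        "size": 10,
        "order": { "_key": "asc" }
      }
    }
  }
}
```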