Hi team,
I have a use case where I need to run aggregations on a keyword field with high cardinality. According to the blog post https://www.elastic.co/blog/improving-the-performance-of-high-cardinality-terms-aggregations-in-elasticsearch, enabling eager global ordinals should make these queries faster, but my query still takes a long time. Here is the query:
{
  "aggregations": {
    "traceIDs": {
      "aggregations": {
        "startTime": {
          "max": {
            "field": "startTime"
          }
        }
      },
      "terms": {
        "field": "traceID",
        "order": [
          {
            "startTime": "desc"
          }
        ],
        "size": 20
      }
    }
  },
  "query": {
    "bool": {
      "must": [
        {
          "range": {
            "startTime": {
              "from": 1601702100000000,
              "include_lower": true,
              "include_upper": true,
              "to": 1601705700000000
            }
          }
        },
        {
          "match": {
            "process.serviceName": {
              "query": "tracerClient"
            }
          }
        }
      ]
    }
  },
  "size": 0
}
I see fielddata increasing during indexing (which is expected), but the query still takes 21 s to return a result. The shard is 8 GB in size and holds 40 million docs. I see no memory or CPU bottlenecks.
Is this an expected response time given the high cardinality of the traceID field, or is there something that can be tuned here?
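
For context, this is a minimal sketch of the kind of mapping change the blog post describes, not a copy of the actual mapping. It assumes an index named `traces` (a placeholder) and would be sent as the body of `PUT /traces/_mapping`:

```json
{
  "properties": {
    "traceID": {
      "type": "keyword",
      "eager_global_ordinals": true
    }
  }
}
```

With this setting, global ordinals for traceID are rebuilt at refresh time rather than on the first aggregation after a refresh, which would be consistent with the memory growth seen during indexing.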