We are running Elasticsearch 7.8.0.
We have monthly indices per client, i.e. client_data_{clientId}_2020-09.
Each index has 1 primary shard and 1 replica.
One of our queries is taking too long, and I'm trying to understand why and how to optimize it.
Here is the query:
GET client_data_XXX_2020-09/_search?ignore_unavailable=true
{
  "track_total_hits": true,
  "size": 0,
  "query": {
    "bool": {
      "filter": [
        {
          "range": {
            "date_time": {
              "gte": "2020-09-01T00:00:00",
              "lt": "2020-09-30T00:00:00"
            }
          }
        },
        {
          "term": {
            "domain": "www.example.com"
          }
        }
      ]
    }
  },
  "aggs": {
    "date": {
      "date_histogram": {
        "field": "date_time",
        "fixed_interval": "1d",
        "min_doc_count": 1
      },
      "aggs": {
        "plt": {
          "avg": {
            "field": "plt"
          }
        }
      }
    }
  }
}
This is the basic version of the query and it can take up to 10s; the more complex version of the same query can take up to 40s, on an index with around 170 million documents.
I thought about pre-aggregating the data daily, but we have more complex versions of this query, for example adding two more sub-aggregations after the date_histogram, so we would end up with many stored aggregations and more memory usage. Not the perfect idea.
Any other ideas?
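For what it's worth, one way to do the daily pre-aggregation without writing our own pipeline would be a continuous transform (GA since 7.5, so available on 7.8). This is only a sketch under assumptions: the transform id and destination index name are hypothetical, and we'd need one extra group_by (or stored metric) per dimension the complex variants slice on:

PUT _transform/client_data_daily
{
  "source": { "index": "client_data_*" },
  "dest": { "index": "client_data_daily" },
  "pivot": {
    "group_by": {
      "date": {
        "date_histogram": { "field": "date_time", "fixed_interval": "1d" }
      },
      "domain": {
        "terms": { "field": "domain" }
      }
    },
    "aggregations": {
      "plt": { "avg": { "field": "plt" } }
    }
  }
}

Since the original query already buckets by day and filters on a single domain, querying the destination index returns the same daily avg(plt) values while scanning one document per day per domain instead of the raw events. (If we ever needed to re-average across domains, we'd also have to store a count per bucket, because an average of averages is weighted incorrectly otherwise.)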