I am using Elasticsearch 2.4, and upgrading is not possible at the moment because the mappings would have to be changed. So, straight to the question.
I am trying to compute different metrics for around 50 groups.
The query uses the terms aggregation at two levels, along with nested aggregations, and runs against an index of around 2 TB.
My query looks like this:
{
  "size": 0,
  "query": {
    "bool": {
      "must": [
        {
          "nested": {
            "path": "datefield",
            "filter": {
              "bool": {
                "must": [
                  {
                    "range": {
                      "datefield.fromDate": {
                        "from": null,
                        "to": "2018-12-31"
                      }
                    }
                  },
                  {
                    "range": {
                      "datefield.toDate": {
                        "from": "2018-12-31",
                        "to": null
                      }
                    }
                  }
                ]
              }
            }
          }
        }
      ]
    }
  },
  "aggregations": {
    "Group1": {
      "filter": {
        "nested": {
          "path": "field1",
          "query": {
            "bool": {
              "must": [
                {
                  "term": {
                    "field1.id": "Something"
                  }
                },
                {
                  "terms": {
                    "field1.type": [
                      "valueA",
                      "valueB",
                      "valueC"
                    ]
                  }
                },
                {
                  "range": {
                    "field1.fromDate": {
                      "include_lower": true,
                      "include_upper": true,
                      "from": null,
                      "to": "2018-12-31"
                    }
                  }
                },
                {
                  "range": {
                    "field1.toDate": {
                      "include_lower": true,
                      "include_upper": true,
                      "from": "2018-12-31",
                      "to": null
                    }
                  }
                }
              ]
            }
          }
        }
      },
      "aggregations": {
        "people": {
          "terms": {
            "field": "field2",
            "size": 0
          },
          "aggregations": {
            "amount": {
              "nested": {
                "path": "field3"
              },
              "aggregations": {
                "total_paid": {
                  "filter": {
                    "bool": {
                      "must": [
                        {
                          "range": {
                            "field3.month": {
                              "include_lower": true,
                              "include_upper": true,
                              "from": "2015-01-01",
                              "to": "2018-12-31"
                            }
                          }
                        },
                        {
                          "range": {
                            "field3.differentmonth": {
                              "include_lower": true,
                              "include_upper": true,
                              "from": "2015-01-01",
                              "to": "2018-12-31"
                            }
                          }
                        },
                        {
                          "term": {
                            "field3.field1[flatField]": "value1"
                          }
                        }
                      ]
                    }
                  },
                  "aggregations": {
                    "sum_amt": {
                      "sum": {
                        "field": "field3.value2"
                      }
                    }
                  }
                }
              }
            },
            "Score": {
              "nested": {
                "path": "field4"
              },
              "aggs": {
                "DateFilter": {
                  "filter": {
                    "bool": {
                      "must": [
                        {
                          "term": {
                            "field4.date": "2019-04-30"
                          }
                        }
                      ]
                    }
                  },
                  "aggs": {
                    "ScoreValue": {
                      "terms": {
                        "field": "field4.value3",
                        "size": 0
                      }
                    }
                  }
                }
              }
            }
          }
        },
        "Age": {
          "avg": {
            "field": "age"
          }
        },
        "Gender": {
          "terms": {
            "field": "gender",
            "size": 0
          }
        }
      }
    }
  }
}
What I am trying to do here is filter the documents down to a group, then bucket the people who belong to that group. For each person I calculate the sum of their amounts (a person may have multiple amounts; I need the sum of all amounts that fall within the date period) and their score. The average age and the gender breakdown are calculated once per group.
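To make the per-person sum concrete, here is a rough plain-Python equivalent of what the `amount` / `total_paid` / `sum_amt` part is meant to compute. It is only a sketch: the documents are made up, the function name `total_paid_per_person` is hypothetical, and it assumes the dates are ISO strings so that string comparison matches date order.

```python
from collections import defaultdict

# Made-up documents, only to illustrate the logic; keys mirror the field names in the query.
docs = [
    {"field2": "person-1", "field3": [
        {"month": "2016-03-01", "differentmonth": "2016-04-01",
         "field1[flatField]": "value1", "value2": 100.0},
        {"month": "2014-03-01", "differentmonth": "2016-04-01",   # month is outside the period
         "field1[flatField]": "value1", "value2": 50.0},
    ]},
    {"field2": "person-2", "field3": [
        {"month": "2017-01-01", "differentmonth": "2017-02-01",
         "field1[flatField]": "value1", "value2": 20.0},
        {"month": "2018-06-01", "differentmonth": "2018-07-01",   # wrong flat field value
         "field1[flatField]": "other", "value2": 5.0},
    ]},
]

def total_paid_per_person(docs, start, end):
    """Sum value2 per person over nested field3 entries that pass all three filters."""
    totals = defaultdict(float)
    for doc in docs:
        person = doc["field2"]
        totals[person] += 0.0  # every person gets a bucket, even with no matching entries
        for entry in doc["field3"]:
            if (start <= entry["month"] <= end
                    and start <= entry["differentmonth"] <= end
                    and entry["field1[flatField]"] == "value1"):
                totals[person] += entry["value2"]
    return dict(totals)

result = total_paid_per_person(docs, "2015-01-01", "2018-12-31")
print(result)
```

With this toy data only one entry per person passes all three filters, which is the behaviour I expect from the `total_paid` filter in the real query.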
This sample query shows how Group1 is calculated. All of the roughly 50 groups are calculated in the same way.
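To show how the repetition works, this is roughly how a request body with all groups could be assembled. Treat it as a sketch under assumptions: the helper names (`people_sub_aggregations`, `group_aggregation`, `build_request`) and the group ids `id-1` ... `id-50` are placeholders I made up for this post, and I am assuming here that every group goes into one request and differs only in the `field1.id` and `field1.type` values.

```python
import json

def people_sub_aggregations(pay_from="2015-01-01", pay_to="2018-12-31",
                            score_date="2019-04-30"):
    """Per-person sub-aggregations: summed amount and score buckets."""
    def closed_range(field):
        return {"range": {field: {"include_lower": True, "include_upper": True,
                                  "from": pay_from, "to": pay_to}}}
    return {
        "amount": {
            "nested": {"path": "field3"},
            "aggregations": {"total_paid": {
                "filter": {"bool": {"must": [
                    closed_range("field3.month"),
                    closed_range("field3.differentmonth"),
                    {"term": {"field3.field1[flatField]": "value1"}},
                ]}},
                "aggregations": {"sum_amt": {"sum": {"field": "field3.value2"}}},
            }},
        },
        "Score": {
            "nested": {"path": "field4"},
            "aggs": {"DateFilter": {
                "filter": {"bool": {"must": [{"term": {"field4.date": score_date}}]}},
                "aggs": {"ScoreValue": {"terms": {"field": "field4.value3", "size": 0}}},
            }},
        },
    }

def group_aggregation(group_id, types, as_of):
    """Filter aggregation for one group, same shape as Group1 in the query above."""
    def bounded(field, lower, upper):
        return {"range": {field: {"include_lower": True, "include_upper": True,
                                  "from": lower, "to": upper}}}
    return {
        "filter": {"nested": {"path": "field1", "query": {"bool": {"must": [
            {"term": {"field1.id": group_id}},
            {"terms": {"field1.type": types}},
            bounded("field1.fromDate", None, as_of),
            bounded("field1.toDate", as_of, None),
        ]}}}},
        "aggregations": {
            "people": {"terms": {"field": "field2", "size": 0},
                       "aggregations": people_sub_aggregations()},
            "Age": {"avg": {"field": "age"}},
            "Gender": {"terms": {"field": "gender", "size": 0}},
        },
    }

def build_request(groups, as_of="2018-12-31"):
    """Full search body: the shared top-level query plus one aggregation per group."""
    return {
        "size": 0,
        "query": {"bool": {"must": [{"nested": {
            "path": "datefield",
            "filter": {"bool": {"must": [
                {"range": {"datefield.fromDate": {"from": None, "to": as_of}}},
                {"range": {"datefield.toDate": {"from": as_of, "to": None}}},
            ]}},
        }}]}},
        "aggregations": {name: group_aggregation(g["id"], g["types"], as_of)
                         for name, g in groups.items()},
    }

# Placeholder group definitions; the real ids and types differ per group.
groups = {f"Group{i}": {"id": f"id-{i}", "types": ["valueA", "valueB", "valueC"]}
          for i in range(1, 51)}
body = build_request(groups)
print(len(body["aggregations"]), "groups,", len(json.dumps(body)), "characters of request JSON")
```

Every group repeats the same `people` terms aggregation with `"size": 0` (return all buckets) and the two nested sub-aggregations under it.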
The query takes a long time to execute in Elasticsearch. Is there a different approach that would reduce the execution time? The response is also very large, around 20-30 MB.
Any suggestions would be much appreciated.
Thanks.