For the last two days, my team has been working on an issue with querying data from Elasticsearch (ES). Our goal is to aggregate the data by one field and get two accumulated values per group.
Translated to SQL, we need something like this:
SELECT MAX(FIELD1) AS F1, MAX(FIELD2) AS F2 FROM ES GROUP BY FIELD3 HAVING F1 = 'SOME_TEXT'
Please note that F1 is a text field.
The only solution we have found so far (with flowId as FIELD3, scnName as FIELD1, and inFlowTimeNsec as FIELD2) is:
{
	"size": 0 ,
	"aggs": {
		"flowId": {
			"terms": {
				"field": "flowId.keyword"
			},
			"aggs" :{
				"scenario" : { "terms" : { "field" : "scnName.keyword" } },
				"max_time" : { "max" : { "field" : "inFlowTimeNsec" } },
				"sales_bucket_filter": {
					"bucket_selector": {
						"buckets_path": {
							"totalSales": "scenario"
						},
						"script": "params.totalSales != null && params.totalSales > 0"
					}
				}
			}
		}
	}
}
The error that we get is:
{
    "error": {
        "root_cause": [],
        "type": "search_phase_execution_exception",
        "reason": "",
        "phase": "fetch",
        "grouped": true,
        "failed_shards": [],
        "caused_by": {
            "type": "aggregation_execution_exception",
            "reason": "buckets_path must reference either a number value or a single value numeric metric aggregation, got: org.elasticsearch.search.aggregations.bucket.terms.StringTerms"
        }
    },
    "status": 503
}
As far as I understand, this issue has already been raised: https://github.com/elastic/elasticsearch/issues/23874
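As far as we can tell from the error message, bucket_selector is fine as long as buckets_path points at a numeric value or a single-value numeric metric. For example, a selector over the max_time metric like the following sketch (the names time_bucket_filter and maxTime are ours) is accepted, but of course it cannot express the HAVING condition on the text field:
{
	"size": 0,
	"aggs": {
		"flowId": {
			"terms": { "field": "flowId.keyword" },
			"aggs": {
				"max_time": { "max": { "field": "inFlowTimeNsec" } },
				"time_bucket_filter": {
					"bucket_selector": {
						"buckets_path": { "maxTime": "max_time" },
						"script": "params.maxTime > 0"
					}
				}
			}
		}
	}
}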
The output of the above query, without the bucket_selector part, looks as follows:
{
    "took": 52,
    "timed_out": false,
    "_shards": {
        "total": 480,
        "successful": 480,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": 15657901,
        "max_score": 0,
        "hits": []
    },
    "aggregations": {
        "flowId": {
            "doc_count_error_upper_bound": 4104,
            "sum_other_doc_count": 9829317,
            "buckets": [
                {
                    "key": "0_66718_31120bfd_39ae_4258_81e8_08abd89a81bf",
                    "doc_count": 107816,
                    "scenario": {
                        "doc_count_error_upper_bound": 0,
                        "sum_other_doc_count": 0,
                        "buckets": [
                            {
                                "key": "GetPop",
                                "doc_count": 12
                            }
                        ]
                    },
                    "max_time": {
                        "value": 121244876800
                    }
                },
                {
                    "key": "0_67116_31120bfd_39ae_4258_81e8_08abd89a81bf",
                    "doc_count": 107752,
                    "scenario": {
                        "doc_count_error_upper_bound": 0,
                        "sum_other_doc_count": 0,
                        "buckets": [
                            {
                                "key": "GetPop",
                                "doc_count": 12
                            }
                        ]
                    },
                    "max_time": {
                        "value": 120955101184
                    }
                },
…
}
The question is: is there any other way to achieve what we need?
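One direction we considered (just a sketch, assuming the HAVING value, e.g. 'GetPop', is known up front) is to replace the scenario terms aggregation with a filter aggregation and select buckets on its numeric doc count, via the special _count path, instead of on the string terms. The names scenario_matched and having_filter are ours:
{
	"size": 0,
	"aggs": {
		"flowId": {
			"terms": { "field": "flowId.keyword" },
			"aggs": {
				"scenario_matched": {
					"filter": { "term": { "scnName.keyword": "GetPop" } }
				},
				"max_time": { "max": { "field": "inFlowTimeNsec" } },
				"having_filter": {
					"bucket_selector": {
						"buckets_path": { "matched": "scenario_matched>_count" },
						"script": "params.matched > 0"
					}
				}
			}
		}
	}
}
Unlike moving the term filter into the query itself, this should keep max_time computed over all documents of each flow, which is what the SQL above intends. We are not sure this is correct or idiomatic, though, so any better option is welcome.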
Thanks a lot,
EG