How to get aggregated data by a field from ES with two values accumulated

For the last two days my team has been trying to solve an issue with querying data from Elasticsearch (ES). Our goal is to get data aggregated by one field, with two values accumulated per group.
Translated into a SQL query, we need something like this:

SELECT MAX(FIELD1) AS F1, MAX(FIELD2) AS F2 FROM ES GROUP BY FIELD3 HAVING F1 = 'SOME_TEXT'

Please note that FIELD1 (and therefore F1) is a text field.
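
For context, the documents we aggregate over look roughly like this (the field names are the real ones from our index; the values are copied from the sample output further down, and any other fields are omitted):

{
    "flowId": "0_66718_31120bfd_39ae_4258_81e8_08abd89a81bf",
    "scnName": "GetPop",
    "inFlowTimeNsec": 121244876800
}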

The only solution we have found so far is:

{
    "size": 0,
    "aggs": {
        "flowId": {
            "terms": {
                "field": "flowId.keyword"
            },
            "aggs": {
                "scenario": { "terms": { "field": "scnName.keyword" } },
                "max_time": { "max": { "field": "inFlowTimeNsec" } },
                "sales_bucket_filter": {
                    "bucket_selector": {
                        "buckets_path": {
                            "totalSales": "scenario"
                        },
                        "script": "params.totalSales != null && params.totalSales > 0"
                    }
                }
            }
        }
    }
}

The error we get is:

{
    "error": {
        "root_cause": [],
        "type": "search_phase_execution_exception",
        "reason": "",
        "phase": "fetch",
        "grouped": true,
        "failed_shards": [],
        "caused_by": {
            "type": "aggregation_execution_exception",
            "reason": "buckets_path must reference either a number value or a single value numeric metric aggregation, got: org.elasticsearch.search.aggregations.bucket.terms.StringTerms"
        }
    },
    "status": 503
}

As far as I understand, this issue has already been raised: https://github.com/elastic/elasticsearch/issues/23874. The root cause is that buckets_path must reference a numeric value or a single-value numeric metric aggregation, while our scenario aggregation is a string terms aggregation.
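
One workaround we are considering is to express the HAVING condition as a filter sub-aggregation and reference its document count, which is numeric, via the special _count path. This is only a sketch: it assumes the condition is a plain equality on scnName, and it checks that the group contains the value at all, which is close to, but not exactly, MAX(F1) = 'SOME_TEXT'. The names scenario_match and matches are ones we made up:

{
    "size": 0,
    "aggs": {
        "flowId": {
            "terms": {
                "field": "flowId.keyword"
            },
            "aggs": {
                "scenario_match": {
                    "filter": {
                        "term": { "scnName.keyword": "SOME_TEXT" }
                    }
                },
                "max_time": { "max": { "field": "inFlowTimeNsec" } },
                "scenario_bucket_filter": {
                    "bucket_selector": {
                        "buckets_path": {
                            "matches": "scenario_match>_count"
                        },
                        "script": "params.matches > 0"
                    }
                }
            }
        }
    }
}

If this works as we expect, max_time would come back only for the flowId groups that contain at least one document matching the filter.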

For reference, the output of the query above without the bucket_selector part looks like this:

{
    "took": 52,
    "timed_out": false,
    "_shards": {
        "total": 480,
        "successful": 480,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": 15657901,
        "max_score": 0,
        "hits": []
    },
    "aggregations": {
        "flowId": {
            "doc_count_error_upper_bound": 4104,
            "sum_other_doc_count": 9829317,
            "buckets": [
                {
                    "key": "0_66718_31120bfd_39ae_4258_81e8_08abd89a81bf",
                    "doc_count": 107816,
                    "scenario": {
                        "doc_count_error_upper_bound": 0,
                        "sum_other_doc_count": 0,
                        "buckets": [
                            {
                                "key": "GetPop",
                                "doc_count": 12
                            }
                        ]
                    },
                    "max_time": {
                        "value": 121244876800
                    }
                },
                {
                    "key": "0_67116_31120bfd_39ae_4258_81e8_08abd89a81bf",
                    "doc_count": 107752,
                    "scenario": {
                        "doc_count_error_upper_bound": 0,
                        "sum_other_doc_count": 0,
                        "buckets": [
                            {
                                "key": "GetPop",
                                "doc_count": 12
                            }
                        ]
                    },
                    "max_time": {
                        "value": 120955101184
                    }
                },
…
}

The question is: is there any other way to achieve what we need?

Thanks a lot,
EG
