Is it possible to use a pipeline aggregation to access information contained in the result of a top_hits aggregation? If so, what's the correct buckets_path to use?
Here's an illustrative example.
POST testcalc1/test1/_bulk
{"index" : {} }
{"counter" : 1, "number" : 1}
{"index" : {} }
{"counter" : 2, "number" : 2}
{"index" : {} }
{"counter" : 3, "number" : 3}
{"index" : {} }
{"counter" : 4, "number" : 4}
{"index" : {} }
{"counter" : 5}
{"index" : {} }
{"counter" : 6}
{"index" : {} }
{"counter" : 7, "number" : 7}
{"index" : {} }
{"counter" : 8, "number" : 8}
{"index" : {} }
{"counter" : 9}
{"index" : {} }
{"counter" : 10, "number" : 10}
POST testcalc1/test1/_search?filter_path=aggregations
{
  "size": 0,
  "aggs": {
    "buckets": {
      "histogram": {
        "field": "counter",
        "interval": 20
      },
      "aggs": {
        "top_hits_agg": {
          "top_hits": {
            "size": 1
          }
        }
      }
    }
  }
}
This data and the simple aggregation above (a single trivial bucket keeps the result small, but the same idea applies to proper histogram buckets) give this result:
{
  "aggregations": {
    "buckets": {
      "buckets": [
        {
          "key": 0,
          "doc_count": 10,
          "top_hits_agg": {
            "hits": {
              "total": 10,
              "max_score": 1,
              "hits": [
                {
                  "_index": "testcalc1",
                  "_type": "test1",
                  "_id": "AVxHgAkzM60NS_VZXIZ-",
                  "_score": 1,
                  "_source": {
                    "counter": 2,
                    "number": 2
                  }
                }
              ]
            }
          }
        }
      ]
    }
  }
}
However, if I want to access number in the top_hits using bucket_script, the following two aggregations return errors.
POST testcalc1/test1/_search?filter_path=aggregations
{
  "size": 0,
  "aggs": {
    "buckets": {
      "histogram": {
        "field": "counter",
        "interval": 20
      },
      "aggs": {
        "top_hits_agg": {
          "top_hits": {
            "size": 1
          }
        },
        "want_info_from_top_hits": {
          "bucket_script": {
            "buckets_path": {
              "top_hits_var": "top_hits_agg"
            },
            "script": "params.top_hits_var[1].number"
          }
        }
      }
    }
  }
}
which gives
{
  "error": {
    "root_cause": [],
    "type": "search_phase_execution_exception",
    "reason": "",
    "phase": "fetch",
    "grouped": true,
    "failed_shards": [],
    "caused_by": {
      "type": "aggregation_execution_exception",
      "reason": "buckets_path must reference either a number value or a single value numeric metric aggregation, got: org.elasticsearch.search.aggregations.metrics.tophits.InternalTopHits"
    }
  },
  "status": 503
}
While this
POST testcalc1/test1/_search?filter_path=aggregations
{
  "size": 0,
  "aggs": {
    "buckets": {
      "histogram": {
        "field": "counter",
        "interval": 20
      },
      "aggs": {
        "top_hits_agg": {
          "top_hits": {
            "size": 1
          }
        },
        "want_info_from_top_hits": {
          "bucket_script": {
            "buckets_path": {
              "top_hits_var": "top_hits_agg.hits"
            },
            "script": "params.top_hits_var[1].number"
          }
        }
      }
    }
  }
}
returns this
{
  "error": {
    "root_cause": [],
    "type": "search_phase_execution_exception",
    "reason": "",
    "phase": "fetch",
    "grouped": true,
    "failed_shards": [],
    "caused_by": {
      "type": "illegal_argument_exception",
      "reason": "path not supported for [top_hits_agg]: [hits]"
    }
  },
  "status": 503
}
Other paths I've tried, such as top_hits_agg>hits and top_hits_agg.hits.hits[1], all fail with "reason": "No aggregation found for path ...".
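For comparison, the first error message suggests that buckets_path only accepts a number or a single-value numeric metric. The sketch below shows that accepted shape on the same index, assuming a max aggregation on number is a tolerable stand-in for the top hit (in general it is not equivalent to top_hits). The names max_number, number_from_metric and number_var are made up for illustration.

```
POST testcalc1/test1/_search?filter_path=aggregations
{
  "size": 0,
  "aggs": {
    "buckets": {
      "histogram": {
        "field": "counter",
        "interval": 20
      },
      "aggs": {
        "max_number": {
          "max": {
            "field": "number"
          }
        },
        "number_from_metric": {
          "bucket_script": {
            "buckets_path": {
              "number_var": "max_number"
            },
            "script": "params.number_var"
          }
        }
      }
    }
  }
}
```

This only sidesteps the problem: the value comes from a numeric metric on the bucket, not from the document that top_hits returns, so the question about reading fields out of top_hits itself still stands.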
Any takers?