Hi all, I'm getting an error when I try to use a pipeline aggregation whose buckets_path goes through more than one level of bucket aggregations (i.e. "buckets_path" : "aggOne>aggTwo>aggThree"). For example, I have a query with 'terms' aggregations on two fields and a metric aggregation inside those groups, and I want to run a pipeline aggregation over the results of that metric.
Here are my simple mappings (I use Elasticsearch 5.1.1):
Author (name=text, gender=text, age=integer)
Book (title=text, pages=integer, _parent=book)
The query below, with a 'terms' aggregation on only one field and a metric aggregation inside it, works fine:
GET author/author/_search
{
  "size": 0,
  "aggs": {
    "gender": {
      "terms": {
        "field": "gender.keyword"
      },
      "aggs": {
        "avg": {
          "avg": {
            "field": "age"
          }
        }
      }
    },
    "max": {
      "max_bucket": {
        "buckets_path": "gender>avg"
      }
    }
  }
}
But the query with 'terms' aggregations on two fields and the same metric aggregation returns an error:
GET author/author/_search
{
  "size": 0,
  "aggs": {
    "gender": {
      "terms": {
        "field": "gender.keyword"
      },
      "aggs": {
        "age": {
          "terms": {
            "field": "age"
          },
          "aggs": {
            "avg": {
              "avg": {
                "field": "age"
              }
            }
          }
        }
      }
    },
    "max": {
      "max_bucket": {
        "buckets_path": "gender>age>avg"
      }
    }
  }
}
Error message:
{
  "error": {
    "root_cause": [],
    "type": "reduce_search_phase_exception",
    "reason": "[reduce] ",
    "phase": "fetch",
    "grouped": true,
    "failed_shards": [],
    "caused_by": {
      "type": "aggregation_execution_exception",
      "reason": "buckets_path must reference either a number value or a single value numeric metric aggregation, got: java.lang.Object[]"
    }
  },
  "status": 503
}
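The "got: java.lang.Object[]" part of the message hints at a likely cause, which can be sketched in plain Python (this is an illustration, not Elasticsearch code): max_bucket wants one number per 'gender' bucket, but following "age>avg" inside a gender bucket passes through a second multi-bucket aggregation and so produces one value per age bucket. The helpers `resolve_path` and `max_bucket` and the sample data below are hypothetical and only model the behaviour; they are not how Elasticsearch resolves paths internally.

```python
# Hypothetical, simplified model of a response: each terms bucket holds
# its sub-aggregations under their names.
gender_buckets = [
    {"key": "male", "age": {"buckets": [
        {"key": 80, "avg": {"value": 80.0}},
        {"key": 22, "avg": {"value": 22.0}},
    ]}},
    {"key": "female", "age": {"buckets": [
        {"key": 40, "avg": {"value": 40.0}},
    ]}},
]


def resolve_path(bucket, path):
    """Follow a path such as ["age", "avg"] inside one bucket.

    Returns a number when the path ends at a single-value metric, and a
    list when it passes through another multi-bucket aggregation.
    """
    agg = bucket[path[0]]
    if "buckets" in agg:
        return [resolve_path(b, path[1:]) for b in agg["buckets"]]
    return agg["value"]


def max_bucket(buckets, path):
    """Illustrative sibling max: needs exactly one number per bucket."""
    best_value, best_keys = None, []
    for bucket in buckets:
        value = resolve_path(bucket, path)
        if isinstance(value, list):
            # Mirrors the situation in the error message: an array arrived
            # where a single number was expected.
            raise TypeError(
                "buckets_path must reference a single numeric value, "
                "got a list for bucket %r" % bucket["key"]
            )
        if best_value is None or value > best_value:
            best_value, best_keys = value, [str(bucket["key"])]
        elif value == best_value:
            best_keys.append(str(bucket["key"]))
    return {"value": best_value, "keys": best_keys}


# One value per age bucket, not one per gender bucket:
print(resolve_path(gender_buckets[0], ["age", "avg"]))

try:
    max_bucket(gender_buckets, ["age", "avg"])
except TypeError as exc:
    print("fails:", exc)
```

In this model the call only succeeds when each bucket at the level being reduced resolves to a single number, which is the case for "gender>avg" in the first query but not for "gender>age>avg".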
Do you have any ideas how to solve this issue? Or do pipeline aggregations not support this case? Thanks a lot for any suggestions.
P.S.: Actually, I've found a workaround, but it doesn't look very nice.
Here it is:
{
  "size": 0,
  "aggs": {
    "gender": {
      "terms": {
        "field": "gender.keyword"
      },
      "aggs": {
        "age": {
          "terms": {
            "field": "age"
          },
          "aggs": {
            "avg": {
              "avg": {
                "field": "age"
              }
            }
          }
        },
        "max_avg": {
          "max_bucket": {
            "buckets_path": "age>avg"
          }
        }
      }
    },
    "max": {
      "max_bucket": {
        "buckets_path": "gender>max_avg"
      }
    }
  }
}
I've added one more pipeline aggregation ("max_avg") inside each 'gender' bucket to take the maximum of the averages across its 'age' buckets, and then my main pipeline aggregation takes the maximum of those per-gender values.
The response looks like this:
"aggregations": {
  "gender": {
    "buckets": [
      {
        "key": "male",
        "doc_count": 4,
        "age": {
          "buckets": [
            {
              "key": 80,
              "doc_count": 2,
              "avg": {
                "value": 80
              }
            },
            {
              "key": 22,
              "doc_count": 1,
              "avg": {
                "value": 22
              }
            },
            {
              "key": 40,
              "doc_count": 1,
              "avg": {
                "value": 40
              }
            }
          ]
        },
        "max_avg": {
          "value": 80,
          "keys": [
            "80"
          ]
        }
      },
      {
        "key": "female",
        "doc_count": 2,
        "age": {
          "buckets": [
            {
              "key": 40,
              "doc_count": 2,
              "avg": {
                "value": 40
              }
            }
          ]
        },
        "max_avg": {
          "value": 40,
          "keys": [
            "40"
          ]
        }
      }
    ]
  },
  "max": {
    "value": 80,
    "keys": [
      "male"
    ]
  }
}
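As a sanity check on these numbers, the two-stage reduction can be reproduced by hand in plain Python. This is only a sketch: the dict `avg_by_gender_and_age` copies the avg values from the response above, and the helper name `max_with_keys` is made up for illustration.

```python
# avg value per (gender, age) bucket, copied from the response above
avg_by_gender_and_age = {
    "male": {80: 80.0, 22: 22.0, 40: 40.0},
    "female": {40: 40.0},
}


def max_with_keys(values_by_key):
    """Return the maximum value and the key(s) of the bucket(s) holding it."""
    best = max(values_by_key.values())
    keys = [str(k) for k, v in values_by_key.items() if v == best]
    return {"value": best, "keys": keys}


# Stage 1: inside each gender bucket, max of avg over the age buckets
# (corresponds to buckets_path "age>avg").
max_avg = {
    gender: max_with_keys(ages)
    for gender, ages in avg_by_gender_and_age.items()
}

# Stage 2: max of the stage-1 values over the gender buckets
# (corresponds to buckets_path "gender>max_avg").
overall = max_with_keys({g: r["value"] for g, r in max_avg.items()})

print(max_avg)
print(overall)
```

Each stage only ever reduces one level of buckets to a single number, which is why the chained version is accepted where the single "gender>age>avg" path is not.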