Bucket Script Aggregation on Range Aggregation

Sandeep_Krishnankutt · January 6, 2016, 2:33pm

I am trying to do a bucket script aggregation on a range aggregation (Date Histogram being the parent) and is ending up with an exception. The query that I am trying is

{
"aggs": {
    "perhour": {
      "date_histogram": {
        "field": "@timestamp",
        "interval": "1h",
        "min_doc_count": 0
        },
      "aggs": {
        "apdsatrange": {
          "range": {
            "field": "timetaken",
            "ranges": [
              {
                "from": 0,
                "to": 500
              }              
            ],
            "keyed": true
          },
		  "aggs" : {
			"sat_count" : { "value_count" : { "field" : "timetaken" } }
			}
        },
		"apdtolrange": {
          "range": {
            "field": "timetaken",
            "ranges": [
              {
                "from": 500,
                "to": 2000
              }              
            ],
            "keyed": true
          },
		  "aggs" : {
			"tol_count" : { "value_count" : { "field" : "timetaken" } }
			}
        },
		"apdcalc": {
			"bucket_script": {
                "buckets_path": {
					"sat": "apdsatrange>sat_count",
					"tol": "apdtolrange>tol_count"
					},
				"script": "sat + tol/2"
			}
		}
      }
    }
  }
}

I have used two different range aggregation so as to refer the count for each in the bucket script. I end up with the following exception on execution

{
  "error" : {
    "root_cause" : [ ],
    "type" : "reduce_search_phase_exception",
    "reason" : "[reduce] ",
    "phase" : "fetch",
    "grouped" : true,
    "failed_shards" : [ ],
    "caused_by" : {
      "type" : "aggregation_execution_exception",
      "reason" : "buckets_path must reference either a number value or a single value numeric metric aggregation, got: java.lang.Object[]"
    }
  },
  "status" : 503
}

Without the bucket aggregation, I am getting the individual range values as expected. eg:

  "aggregations" : {
    "perhour" : {
      "buckets" : [ {
        "key_as_string" : "2016-01-04T00:00:00.000Z",
        "key" : 1451865600000,
        "doc_count" : 4413,
        "apdtolrange" : {
          "buckets" : {
            "500.0-2000.0" : {
              "from" : 500.0,
              "from_as_string" : "500.0",
              "to" : 2000.0,
              "to_as_string" : "2000.0",
              "doc_count" : 48,
              "tol_count" : {
                "value" : 48
              }
            }
          }
        },
        "apdsatrange" : {
          "buckets" : {
            "0.0-500.0" : {
              "from" : 0.0,
              "from_as_string" : "0.0",
              "to" : 500.0,
              "to_as_string" : "500.0",
              "doc_count" : 4357,
              "sat_count" : {
                "value" : 4357
              }
            }
          }
        }
      },....

Can anyone please help me in identifying what is wrong with the query or with a better way of doing this.?
ES version is 2.1

colings86 · January 7, 2016, 8:56am

The problem here is you are trying to get values in the bucket_script aggregation from within a multi-bucket aggregation (the range aggregation). Currently in pipeline aggregations you can only access values that are within a single-bucket aggregation, a metric aggregation, or another pipeline aggregation at the same level.

In you case, since you only have a single range in you range aggregation anyway, you could replace the range aggregations with filter aggregations instead which filter the same range. That way you can reference the values from within your bucket_script aggregation.

Sandeep_Krishnankutt · January 7, 2016, 4:20pm

Thanks for the reply Colin. Is there any chance there would be multi bucket aggregation support in the future.

I had split up my range aggregation to have single range so as to access them via bucket script. Regarding using filter aggregation, wouldn't that still involve having a range query.? I tried the following but still end up with the same error.

"apdsatrange": {
          "filter": {
            "range": {
            "timetaken": {
                "from": 0,
                "to": 500
              }              
          }
	}
      }

colings86 · January 8, 2016, 9:33am

It is something we would like to do but we need to come up with a clean way of doing this.

The problem is, regardless of how many buckets the range aggregation output has, it is still a multi-bucket aggregation, whereas the filter aggregation is a single bucket aggregation (only ever outputs one bucket)

Yes you would need to use a range query in the filter aggregation but this is different from a range aggregation.

Could you post the full aggregations request you used with the filter aggregation (all aggregation in your request) so I can see how you are trying to use the filter aggregation?

Sandeep_Krishnankutt · January 8, 2016, 11:49am

Thanks Colin. Here is what I have tried

{
"aggs": {
    "perhour": {
      "date_histogram": {
        "field": "@timestamp",
        "interval": "1h",
        "min_doc_count": 0
        },
      "aggs": {
        "apdsatrange": {
          "filter": {
            "range": {
            "timetaken": {
                "from": 0,
                "to": 500
              }              
          }
		 }
      },
	  "apdtolrange": {
          "filter": {
            "range": {
            "timetaken": {
                "from": 500,
                "to": 2000
              }              
            }
          }
	  },
	  "apdcalc": {
			"bucket_script": {
                "buckets_path": {
					"sat": "apdsatrange",
					"tol": "apdtolrange"
					},
				"script": "sat + tol/2"
			}
		}
    }
  }
}
}

colings86 · January 8, 2016, 3:03pm

That looks mostly correct but I think you just need to change your buckets_paths match the following snippet:

"apdcalc": {
    "bucket_script": {
        "buckets_path": {
            "sat": "apdsatrange>_count",
            "tol": "apdtolrange>_count"
        },
        "script": "sat + tol/2"
    }
}

This indicates you want the doc_count attribute from the filter aggregation.

Let me know if that works.

Sandeep_Krishnankutt · January 12, 2016, 3:44pm

That seems to have worked. Thanks a lot for your help.

Cheers
S

Topic		Replies	Views
Error when trying to aggregate a bucket script with another bucket script Elasticsearch	1	617	September 5, 2019
Bucket script aggregation error Elasticsearch	5	6209	February 2, 2018
Bucket Script Aggregation - parsing_exception Kibana	4	455	October 5, 2020
NPE when using both date_range and bucket script aggregations Elasticsearch	1	598	July 5, 2017
"bucket_script" aggregation initialization exception Elasticsearch	2	862	November 24, 2019

Bucket Script Aggregation on Range Aggregation

Related topics