Issue with specifying the correct buckets_path


(Michael Craig) #1

Hello,
I am trying to find the hourly maximum of a date_histogram aggregation with an interval of 1 hour.

I have looked at the documentation for pipeline aggregations, which is useful and interesting but I cannot correctly apply it for my case.

https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-pipeline.html#buckets-path-syntax

GET test/_search
{
  "size": 0,
  "query": {
    "bool": {
      "must": [
        { "match": { "Collection": "XSF" } }
      ]
    }
  },
  "aggs": {
    "group_by_service": {
      "terms": {
        "field": "Name.keyword",
        "size": 1
      },
      "aggs": {
        "hourly": {
          "date_histogram": {
            "field": "@timestamp",
            "interval": "hour",
            "time_zone": "+01:00"
          },
          "aggs": {
            "sum_the_count": {
              "sum": {
                "field": "Count"
              }
            }
          }
        }
      }
    },
    "max_hourly_sv": {
      "max_bucket": {
        "buckets_path": "group_by_service.sum_the_count"
      }
    }
  }
}

The output that pertains to my question is as follows:

"max_hourly_sv": {
  "value": null,
  "keys": []
}

Using

"buckets_path": "group_by_service>hourly.sum_the_count"

instead returns the error:

"caused_by": {
      "type": "aggregation_execution_exception",
      "reason": "buckets_path must reference either a number value or a single value numeric metric aggregation, got: java.lang.Object[]"

I feel as though I am misunderstanding or misusing the buckets_path syntax.

Thanks.

Also (apologies!), how do I format the relevant sections of my question so they look the way they do in Dev Tools?


(Paul McMahon) #2

I have some aggs that do something vaguely similar, so I have attempted to adapt your query to the pattern I use. Hopefully the syntax is good (I am not in front of an ES cluster at the moment to test it!) :slight_smile:

Basically, I am not sure whether the buckets_path syntax can reach the extra level your original query would need (i.e. sibling>child>grandchild); I've never tried! What I have done below is add a 'max_hourly_inner' bucket agg as an intermediary step inside 'group_by_service', so the outer bucket agg only has a sibling>child path to target. Hope this helps!

{
	"size": 0,
	"query": {
		"bool": {
			"must": [{
				"match": {"Collection": "XSF"}
			}]
		}
	},
	"aggs": {
		"group_by_service": {
			"terms": {"field": "Name.keyword", "size": 1},
			"aggs": {
				"hourly": {
					"date_histogram": {
						"field": "@timestamp",
						"interval": "hour",
						"time_zone": "+01:00"
					},
					"aggs": {
						"sum_the_count": {
							"sum": {
								"field": "Count"
							}
						}
					}
				},
				"max_hourly_inner": {
					"max_bucket": {
						"buckets_path": "hourly>sum_the_count"
					}
				}
			}
		},
		"max_hourly_sv": {
			"max_bucket": {
				"buckets_path": "group_by_service>max_hourly_inner"
			}
		}
	}
}
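For intuition, here is the two-stage reduction that query performs, sketched in plain Python over a hypothetical, made-up response body shaped like the aggregation output (the service names and numbers are invented for illustration):

```python
# Hypothetical response fragment mimicking the nested
# terms > date_histogram > sum structure of the query above.
response = {
    "group_by_service": {
        "buckets": [
            {
                "key": "service-a",
                "hourly": {"buckets": [
                    {"sum_the_count": {"value": 10.0}},
                    {"sum_the_count": {"value": 25.0}},
                ]},
            },
            {
                "key": "service-b",
                "hourly": {"buckets": [
                    {"sum_the_count": {"value": 40.0}},
                    {"sum_the_count": {"value": 5.0}},
                ]},
            },
        ]
    }
}

# Stage 1 ("max_hourly_inner"): within each service bucket, take the max
# of the hourly sums, producing a single number per service bucket.
inner = [
    max(h["sum_the_count"]["value"] for h in svc["hourly"]["buckets"])
    for svc in response["group_by_service"]["buckets"]
]

# Stage 2 ("max_hourly_sv"): take the max of those per-service values.
overall = max(inner)

print(inner)    # [25.0, 40.0]
print(overall)  # 40.0
```

The key point is that each max_bucket step only ever reduces one level of buckets to one number, which is why the intermediary agg keeps the outer buckets_path happy.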

(Paul McMahon) #3

Oh, and by opening and closing a block of text with three backtick (`) characters, it will be formatted like code:

```
hello world()
exit()
```

becomes

hello world()
exit()

(Michael Craig) #4

That has worked perfectly thanks very much! :slight_smile:


(system) #5

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.