Aggregate bucket values with SUM

Hi,

I'm new to Elastic. Based on a request constructed by a Kibana visualization, along with a few tips, I built a query that aggregates on a 'policy' key, then on the 'device' (or devices) it's on, and groups the values by the MAX 'bps' metric.

The issue is that I'm getting the max bps under a 'policy', but when that policy is seen within the query time range on two (or more) different devices, the value is the highest bps value among them, instead of the SUM of the bps values from all the devices it's on in the response.

I understand there's a 'sum_bucket' option that runs after the buckets are computed, so that each unique policy value is SUMmed across all the devices it's seen on.

So the need is to SUM the 'bps' values for each unique policy, taken from the devices that policy is seen on.

I guess that -

"sum_bucket": {....}

needs to go at the bottom, but I can't get the syntax right and I keep getting errors like:

  "error" : {
    "root_cause" : [
      {
        "type" : "parsing_exception",
        "reason" : "Unknown key for a START_OBJECT in [sum_bucket].",

Current query is:

{
  "aggs": {
    "2": {
      "terms": {
        "field": "policy",
        "order": {
          "1": "desc"
        },
        "size": 30
      },
      "aggs": {
        "1": {
          "max": {
            "field": "bps"
          }
        },
        "3": {
          "terms": {
            "field": "device.name",
            "order": {
              "1": "desc"
            },
            "size": 180
          },
          "aggs": {
            "1": {
              "max": {
                "field": "bps"
              }
            }
          }
        }
      }
    }
  },
  "size": 0,
  "fields": [
    {
      "field": "@timestamp",
      "format": "date_time"
    },
    {
      "field": "event.duration",
      "format": "strict_date_time"
    }
  ],
  "script_fields": {},
  "stored_fields": [
    "*"
  ],
  "_source": {
    "excludes": []
  },
  "query": {
    "bool": {
      "must": [],
      "filter": [
        {
          "bool": {
            "should": [
              {
                "range": {
                  "bps": {
                    "gte": "10000"
                  }
                }
              }
            ],
            "minimum_should_match": 1
          }
        },
        {
          "range": {
            "@timestamp": {
              "format": "strict_date_optional_time",
              "gte": "now-10m",
              "lte": "now"
            }
          }
        }
      ],
      "should": [],
      "must_not": [
        {
          "match_phrase": {
            "metadata.afi": "IPv6"
          }
        }
      ]
    }
  }
}

Response is -

  "aggregations" : {
    "2" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "high_policy",
          "doc_count" : 1408,
          "1" : {
            **"value" : 6751000.0**
          },
          "3" : {
            "doc_count_error_upper_bound" : 0,
            "sum_other_doc_count" : 0,
            "buckets" : [
              {
                "key" : "device_a",
                "doc_count" : 260,
                "1" : {
                  **"value" : 6751000.0**
                }
              },
              {
                "key" : "device_b",
                "doc_count" : 179,
                "1" : {
                  **"value" : 1514000.0**
                }
              },
...
..
.

As you can see, it takes the higher value of the two devices it's seen on, while the need is to SUM both values and return the total under each unique policy (in this case there's only one policy in the result).

Appreciate the help,
Tamir

Does anyone have an idea?
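
One approach that might work, sketched under a few assumptions: `sum_bucket` is a sibling pipeline aggregation, so it would sit inside the `aggs` of "2", next to the device terms aggregation "3", and take a `buckets_path` string pointing at the per-device max ("3>1"). The name `sum_of_device_max` is made up for illustration and is not part of the original query. The parsing error quoted above is often a sign that `sum_bucket` was given a nested object rather than plain parameters such as `buckets_path`, though that is only a guess, since the failing request isn't shown. The `query` part is left out here and would stay as it is.

```json
{
  "size": 0,
  "aggs": {
    "2": {
      "terms": {
        "field": "policy",
        "order": { "1": "desc" },
        "size": 30
      },
      "aggs": {
        "1": {
          "max": { "field": "bps" }
        },
        "3": {
          "terms": {
            "field": "device.name",
            "order": { "1": "desc" },
            "size": 180
          },
          "aggs": {
            "1": {
              "max": { "field": "bps" }
            }
          }
        },
        "sum_of_device_max": {
          "sum_bucket": {
            "buckets_path": "3>1"
          }
        }
      }
    }
  }
}
```

If this behaves as intended, each policy bucket should carry a `sum_of_device_max` value equal to the sum of its per-device maxima. For the two device buckets visible in the response above, that would be 6751000 + 1514000 = 8265000, plus whatever the elided buckets contribute. Note that this sums each device's max bps, not every bps sample.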
