Is it possible to include other fields in the top-level terms aggregation?

I'm doing a terms aggregation on devices and calculating a metric for each one. Is there a way to include other fields for the device, such as id, city, and state, in that top-level bucket? Each document in the vn-aggr index has those fields in it.

{
  "size": 0,
  "query": {
    "bool": {
      "must": [
        {
          "range": {
            "dttm": {
              "gte": "now-24h",
              "lte": "now"
            }
          }
        }
      ]
    }
  },
  "aggs": {
    "resources": {
      "terms": {
        "field": "displayname.keyword",
        "size": 10
      },
      "aggs": {
        "90_inpeak": {
          "percentiles": {
            "field": "inpeak_util",
            "percents": [
              90
            ]
          }
        }
      }
    }
  }
}

The result is:

{
  "took" : 21687,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 17984980,
    "max_score" : 0.0,
    "hits" : [ ]
  },
  "aggregations" : {
    "resources" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 17939432,
      "buckets" : [
        {
          "key" : "c4a24244_utp ethernet (10/100)",
          "doc_count" : 378,
          "90_inpeak" : {
            "values" : {
              "90.0" : 0.01721082963049412
            }
          }
        },
        {
          "key" : "c0a24244_utp ethernet (10/100)",
          "doc_count" : 360,
          "90_inpeak" : {
            "values" : {
              "90.0" : 0.004095323150977492
            }
          }
        },.....

Is there a way to include those identifiers in the term bucket like:

        {
          "key" : "c4a24244_utp ethernet (10/100)",
          "resourceid" : "1234",
          "city" : "Houston",
          "state" : "TX",
          "doc_count" : 378,
          "90_inpeak" : {
            "values" : {
              "90.0" : 0.01721082963049412
            }
          }
        },

You can use the top_hits aggregation as a sub-aggregation of the terms bucket to do this.
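As a rough sketch of what that could look like for the query above: a top_hits sub-aggregation with "size": 1 returns one sample document per bucket, and "_source" filtering limits it to the fields you want. The field names resourceid, city, and state are assumptions taken from the desired output in the question, and device_info is just an illustrative aggregation name. The range query is left out for brevity.

```json
{
  "size": 0,
  "aggs": {
    "resources": {
      "terms": {
        "field": "displayname.keyword",
        "size": 10
      },
      "aggs": {
        "device_info": {
          "top_hits": {
            "size": 1,
            "_source": {
              "includes": ["resourceid", "city", "state"]
            }
          }
        },
        "90_inpeak": {
          "percentiles": {
            "field": "inpeak_util",
            "percents": [90]
          }
        }
      }
    }
  }
}
```

Note that the extra fields do not appear next to "key" and "doc_count". They come back nested inside each bucket, under device_info.hits.hits[0]._source, so the client has to pull them out from there.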

Perfect, thanks Joe. However, when I include the _id value, I'm getting the warning below. I'm not really sure what it's telling me. I'm on v6.8, but we're migrating to 7.x soon.

GET vn-aggr/_search
{
  "size": 0,
  "query": {
    "bool": {
      "must": [
        {
          "range": {
            "dttm": {
              "gte": "now-4h",
              "lte": "now"
            }
          }
        }
      ]
    }
  },
  "aggs": {
    "resources": {
      "terms": {
        "field": "displayname.keyword",
        "size": 10
      },
      "aggs": {
        "top_flds": {
          "top_hits": {
            "docvalue_fields": [
              "_id"
            ],
            "_source": {
              "includes": [
                "syslocation",
                "tz-name",
                "a_device_hostname",
                "a_device_ip",
                "ifspeed",
                "resourceidelk"
              ]
            },
            "size": 1
          }
        },
        "avg_inpeak": {
          "avg": {
            "field": "inpeak_util"
          }
        },

#! Deprecation: There are doc-value fields which are not using a format. The output will change in 7.0 when doc value fields get formatted based on mappings by default. It is recommended to pass [format=use_field_mapping] with a doc value field in order to opt in for the future behaviour and ease the migration to 7.0: [_id]

Unfortunately I'm not an expert on Elasticsearch. I know _id behaves weirdly in a bunch of places, but I'm not sure about this one in particular. I think it's worth opening a separate post in the Elasticsearch forum (Elasticsearch - Discuss the Elastic Stack).
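For what it's worth, the warning text itself says that 6.x returns doc-value fields unformatted, that 7.0 will format them based on the mapping, and that passing a format opts in to the new behaviour early. A minimal sketch of that opt-in, assuming the object form of docvalue_fields is accepted inside top_hits on 6.8 (not verified here against a 6.8 cluster, and not verified for _id specifically), with the _source list shortened for brevity:

```json
{
  "size": 0,
  "aggs": {
    "resources": {
      "terms": {
        "field": "displayname.keyword",
        "size": 10
      },
      "aggs": {
        "top_flds": {
          "top_hits": {
            "size": 1,
            "docvalue_fields": [
              { "field": "_id", "format": "use_field_mapping" }
            ],
            "_source": {
              "includes": ["syslocation", "resourceidelk"]
            }
          }
        }
      }
    }
  }
}
```

It may also be simpler to drop the docvalue_fields entry altogether: each hit returned by top_hits already carries its _id in the hit metadata, alongside _index and _source, so the value is available without requesting it as a doc-value field.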

OK. Thanks Joe
