Query with aggregation and return single field in each bucket

seanwang30 · February 12, 2018, 5:56pm

Hi all:

Sorry for the weird title. I'm trying to come up with a query that returns both the aggregation results and one particular field from every doc in each bucket, but I couldn't find an optimal way to do it.

Currently we store host data with tags associated with each host. We want our app to be able to aggregate by one or more tags and also return the host ids within each group of hosts. Right now the solution is fairly complicated: we store tags as nested objects in each host doc, something like {"tag_key": "zone", "tag_val": "east-1a"}. When people group hosts by zone, the query is:

{
  "aggs": {
    "tags": {
      "aggs": {
        "tag_key": {
          "filter": {
            "term": {
              "tags_nested.tag_key": "zone"
            }
          },
          "aggs": {
            "tag_val": {
              "terms": {
                "field": "tags_nested.tag_val",
                "size": 300000
              },
              "aggs": {
                "hosts": {
                  "aggs": {
                    "node_id": {
                      "terms": {
                        "field": "id",
                        "size": 300000
                      }
                    }
                  },
                  "reverse_nested": {}
                }
              }
            }
          }
        }
      },
      "nested": {
        "path": "tags_nested"
      }
    }
  },
  "size": 0
}

step 1 (the "tag_key" filter): filter down to hosts that have the particular tag key we are grouping by
step 2 (the "tag_val" terms aggregation): bucket by all possible values of that tag key
step 3 (the "hosts" reverse_nested aggregation): pop out of the nested document back into the main host document
step 4 (the "node_id" terms aggregation): bucket by host id (each id bucket should hold exactly one host)
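
For readers who want to try this from client code, the steps above can be sketched in Python roughly as follows. This is only an illustration built from the query in this post: the function names, the `max_buckets` parameter, and the hand-written `sample_response` (including the host ids in it) are assumptions and not real cluster output. The response layout follows the usual nesting of aggregation names in a search response.

```python
from typing import Any, Dict, List


def build_group_by_tag_query(tag_key: str, max_buckets: int = 300000) -> Dict[str, Any]:
    """Build the request body from the post for an arbitrary tag key (hypothetical helper)."""
    return {
        "size": 0,  # only aggregations are needed, no hits
        "aggs": {
            "tags": {
                "nested": {"path": "tags_nested"},
                "aggs": {
                    "tag_key": {
                        # step 1: keep only nested tags with the requested key
                        "filter": {"term": {"tags_nested.tag_key": tag_key}},
                        "aggs": {
                            "tag_val": {
                                # step 2: one bucket per tag value
                                "terms": {"field": "tags_nested.tag_val", "size": max_buckets},
                                "aggs": {
                                    "hosts": {
                                        # step 3: go back to the parent host doc
                                        "reverse_nested": {},
                                        "aggs": {
                                            # step 4: one bucket per host id
                                            "node_id": {"terms": {"field": "id", "size": max_buckets}}
                                        },
                                    }
                                },
                            }
                        },
                    }
                },
            }
        },
    }


def hosts_by_tag_value(response: Dict[str, Any]) -> Dict[str, List[str]]:
    """Flatten the aggregation response into {tag value: [host ids]}."""
    groups: Dict[str, List[str]] = {}
    tag_val_buckets = response["aggregations"]["tags"]["tag_key"]["tag_val"]["buckets"]
    for bucket in tag_val_buckets:
        host_buckets = bucket["hosts"]["node_id"]["buckets"]
        groups[bucket["key"]] = [hb["key"] for hb in host_buckets]
    return groups


# Hand-written example of the response shape (illustrative values only).
sample_response = {
    "aggregations": {
        "tags": {
            "doc_count": 3,
            "tag_key": {
                "doc_count": 3,
                "tag_val": {
                    "buckets": [
                        {
                            "key": "east-1a",
                            "doc_count": 2,
                            "hosts": {
                                "doc_count": 2,
                                "node_id": {
                                    "buckets": [
                                        {"key": "host-1", "doc_count": 1},
                                        {"key": "host-2", "doc_count": 1},
                                    ]
                                },
                            },
                        },
                        {
                            "key": "east-1b",
                            "doc_count": 1,
                            "hosts": {
                                "doc_count": 1,
                                "node_id": {"buckets": [{"key": "host-3", "doc_count": 1}]},
                            },
                        },
                    ]
                },
            },
        }
    }
}

if __name__ == "__main__":
    print(hosts_by_tag_value(sample_response))
```

The body returned by `build_group_by_tag_query("zone")` has the same content as the query above; the flattening step just walks the aggregation names in the same order they are nested in the request.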

This query gives us what we want, but the dataset is going to grow much larger, and the query won't hold up performance-wise for long. What would be a better way of doing this?

system · March 12, 2018, 5:56pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
