Remove buckets from being returned in Aggregation

Is there a way to prevent the buckets array from being returned as a part of the aggregation result? I am using a sum_bucket to total all of the buckets and I don't need all of the buckets to be returned. I'm just using key and displayCount. I worry that with my dataset, there could be hundreds of buckets returned for hundreds of keys.

For example, suppose my default result looks like this:

{
  "aggregations" : {
    "Sports" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "Football",
          "doc_count" : 5,
          "States" : {
            "doc_count_error_upper_bound" : 0,
            "sum_other_doc_count" : 0,
            "buckets" : [
              {
                "key" : "Maryland",
                "doc_count" : 1,
                "countStates" : {
                  "value" : 1
                }
              },
              {
                "key" : "New York",
                "doc_count" : 1,
                "countStates" : {
                  "value" : 1
                }
              },
              {
                "key" : "Ohio",
                "doc_count" : 1,
                "countStates" : {
                  "value" : 1
                }
              },
              {
                "key" : "Pennsylvania",
                "doc_count" : 1,
                "countStates" : {
                  "value" : 1
                }
              },
              {
                "key" : "Virginia",
                "doc_count" : 1,
                "countStates" : {
                  "value" : 1
                }
              }
            ]
          },
          "displayCount" : {
            "value" : 5.0
          }
        },
        {
          "key" : "Baseball",
          "doc_count" : 3,
          "States" : {
            "doc_count_error_upper_bound" : 0,
            "sum_other_doc_count" : 0,
            "buckets" : [
              {
                "key" : "Maryland",
                "doc_count" : 1,
                "countStates" : {
                  "value" : 1
                }
              },
              {
                "key" : "Ohio",
                "doc_count" : 1,
                "countStates" : {
                  "value" : 1
                }
              },
              {
                "key" : "Pennsylvania",
                "doc_count" : 1,
                "countStates" : {
                  "value" : 1
                }
              }
            ]
          },
          "displayCount" : {
            "value" : 3.0
          }
        }
      ]
    }
  }
}

Can I remove the nested buckets arrays under States so that only the following is returned?

{
  "aggregations" : {
    "Sports" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "Football",
          "doc_count" : 5,
          "States" : {
            "doc_count_error_upper_bound" : 0,
            "sum_other_doc_count" : 0
          },
          "displayCount" : {
            "value" : 5.0
          }
        },
        {
          "key" : "Baseball",
          "doc_count" : 3,
          "States" : {
            "doc_count_error_upper_bound" : 0,
            "sum_other_doc_count" : 0
          },
          "displayCount" : {
            "value" : 3.0
          }
        }
      ]
    }
  }
}

I'll add that I am aware I could use a filter_path list to get what I want, but that feels like overkill for removing just one array from the entire result set. Also, I am using the Java API, where filter_path is not available.

Hi Eric,
I'm not sure what produces the displayCount in your response?

I suspect you may be interested in the cardinality aggregation.

I'm not familiar enough with cardinality to know if it would work for my use case, but I am more than happy to try it. Essentially, I am creating a faceted search, and I want each category to display a count of the results that would be returned if that facet were selected. I can accomplish this with a nested aggs clause in my aggregation, as shown below.

Can cardinality do something similar? I am working with around 2 million documents, so I worry that performance would suffer if the solution requires running a script.

{
  "query": {
    "match_all": {}
  },
  "post_filter": {
    "bool": {
      "must": [
        {
          "terms": {
            "sport.name.keyword": [
              "Football"
            ]
          }
        }
      ]
    }
  },
  "aggregations": {
    "States": {
      "terms": {
        "field": "state.name.keyword"
      },
      "aggs": {
        "Sports": {
          "terms": {
            "field": "sport.name.keyword",
            "include": [
              "Football"
            ],
            "size": 10000
          },
          "aggs": {
            "countSports": {
              "value_count": {
                "field": "_id"
              }
            }
          }
        },
        "displayCount": {
          "sum_bucket": {
            "buckets_path": "Sports>countSports"
          }
        }
      }
    }
  }
}

When you swap a terms agg for a cardinality agg you’re effectively asking for a count of how many terms buckets would match without seeing the string values of any of them.
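
As a rough sketch of that swap, assuming the field names from the query above and a made-up aggregation name (distinctStates), the inner terms agg on states could be replaced by a cardinality agg. Each Sports bucket then carries a single number instead of a nested buckets array. Two caveats: cardinality counts distinct values rather than documents, so it matches displayCount only when each state contributes one document per sport, and the result is approximate for high-cardinality fields. Setting "size": 0 here is also an assumption, used only to leave the search hits out of the response.

```json
{
  "size": 0,
  "query": {
    "match_all": {}
  },
  "aggregations": {
    "Sports": {
      "terms": {
        "field": "sport.name.keyword"
      },
      "aggs": {
        "distinctStates": {
          "cardinality": {
            "field": "state.name.keyword"
          }
        }
      }
    }
  }
}
```

With this shape, each Sports bucket should come back with key, doc_count, and a distinctStates value, and no inner States buckets.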
