Composite aggregation max bucket size


(bluren) #1

I have a query that I need to run every 30s to check if there are any docs matching the params.

{
  "version": true,
  "size": 0,
  "sort": [
    {
      "@timestamp": {
        "order": "desc",
        "unmapped_type": "boolean"
      }
    }
  ],
  "_source": {
    "excludes": []
  },
  "aggs": {
    "my_buckets": {
      "composite": {
        "size": 2147483647,   <------ causes circuit breaker exception
        "sources": [
          {
            "blk_srvPort": {
              "terms": {
                "field": "flow.service_port"
              }
            }
          },
          {
            "src_addr": {
              "terms": {
                "field": "flow.src_addr"
              }
            }
          },
          {
            "dst_addr": {
              "terms": {
                "field": "flow.dst_addr"
              }
            }
          },
          {
            "es_id": {
              "terms": {
                "field": "_id"
              }
            }
          },
          {
            "es_timestamp": {
              "terms": {
                "field": "@timestamp"
              }
            }
          }
        ]
      }
    }
  },
  "stored_fields": [
    "*"
  ],
  "script_fields": {},
  "docvalue_fields": [
    "@timestamp",
    "netflow.first_switched",
    "netflow.last_switched"
  ],
  "query": {
    "bool": {
      "must": [
        {
          "query_string": {
            "query": """flow.dst_addr: "10.5.6.25" """,
            "analyze_wildcard": true,
            "default_field": "*"
          }
        },
        {
          "range": {
            "@timestamp": {
              "gte": "now-10s",
              "lte": "now"
            }
          }
        }
      ],
      "filter": [],
      "should": [],
      "must_not": []
    }
  }
}

I'm using a composite aggregation because I need several fields in each bucket, and that part works well. However, how do I get all the buckets and make sure none of them are dropped due to predefined limits? I know there is a "size" parameter that can be used inside composite, but what is the maximum value I can set for it? Is there a way to force it to return ALL buckets?


(Zachary Tong) #2

The size parameter of the composite agg is just how many buckets you want per page. The composite agg "paginates" over all the buckets, so it is an exhaustive aggregation that will return all the results once you've fully paginated through it.

See the docs here about specifying an after parameter to keep paging: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-composite-aggregation.html#_after

So basically, you set the size to something that is a reasonable balance between speed (higher size) and memory requirements (lower size). Then you page through the results fully with multiple requests.
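
As a rough illustration of that paging loop, here is a minimal Python sketch. It assumes a hypothetical `search` callable that takes the request body and returns the parsed response as a dict, and it relies on the `after_key` that the composite agg returns with each page. The function name, the `page_size` default of 1000, and the way the request is copied are illustrative choices, not part of any Elasticsearch client.

```python
import copy
from typing import Any, Callable, Dict, Iterator


def iter_composite_buckets(
    search: Callable[[Dict[str, Any]], Dict[str, Any]],  # hypothetical: body -> response dict
    body: Dict[str, Any],
    agg_name: str,
    page_size: int = 1000,  # assumed default; tune for speed vs. memory
) -> Iterator[Dict[str, Any]]:
    """Yield every bucket of a composite aggregation, one page at a time."""
    body = copy.deepcopy(body)  # do not mutate the caller's request
    composite = body["aggs"][agg_name]["composite"]
    composite["size"] = page_size
    composite.pop("after", None)  # always start from the first page

    while True:
        response = search(body)
        agg = response["aggregations"][agg_name]
        buckets = agg.get("buckets", [])
        if not buckets:
            return  # an empty page means everything has been read
        yield from buckets

        after_key = agg.get("after_key")
        if after_key is None:
            return  # no key to continue from
        composite["after"] = after_key  # ask for the page after the last bucket
```

With the official Python client, `search` could be something like `lambda b: es.search(index="my-index", body=b)`, where the index name is a placeholder and the exact keyword arguments depend on the client version. Each request only holds `page_size` buckets in memory, which is what avoids the circuit breaker exception.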

Hope that helps!

