Performance of outside Query vs filter inside aggregation while aggregating

Is there any performance difference between the two queries below, assuming "type" is a keyword field?

    {
      "size": 0,
      "aggs": {
        "t_shirts": {
          "filter": {
            "term": {
              "type": "t-shirt"
            }
          },
          "aggs": {
            "avg_price": {
              "avg": {
                "field": "price"
              }
            }
          }
        }
      }
    }

    {
      "size": 0,
      "query": {
        "bool": {
          "must": [
            {
              "term": {
                "type": {
                  "value": "t-shirt"
                }
              }
            }
          ]
        }
      },
      "aggs": {
        "avg_price": {
          "avg": {
            "field": "price"
          }
        }
      }
    }

There are dozens of discussions about this topic.
A couple of examples

Thanks, @Aclerk. I am more concerned about the performance of using the same query outside versus inside an aggregation.

I’m going to go with the query.
It reads the set of Lucene doc IDs for the given term from disk and feeds only those into the tree of aggregation collectors for consideration.

The query-less version iterates over all (non-deleted) doc IDs in the range 0 to number-of-docs-in-index and feeds those into the agg tree. The first agg in your tree then filters them by checking whether each one is in the set of doc IDs read from disk for the given term. That means more iterations, unless there’s some special optimisation for an agg tree with a single filter at the root that I’m unaware of.
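
To make the difference in iteration counts concrete, here is a rough toy model in Python. It is only a sketch: plain lists stand in for Lucene postings and doc IDs, deleted docs are ignored, and the function names (`avg_with_query`, `avg_with_filter_agg`) are made up for illustration. It does not reflect how Elasticsearch is actually implemented or any optimisations it may apply.

```python
# Toy model only: plain Python lists stand in for Lucene postings and doc IDs.
# All names are hypothetical; this is not Elasticsearch or Lucene code.

def avg_with_query(postings, prices):
    """Top-level query: only doc IDs from the term's postings reach the agg."""
    visited = 0
    total = 0.0
    for doc_id in postings:            # iterate matching docs only
        visited += 1
        total += prices[doc_id]
    return (total / visited if visited else None), visited


def avg_with_filter_agg(postings, prices, num_docs):
    """No query: every doc ID reaches the filter agg, which checks membership."""
    matching = set(postings)
    visited = 0
    matched = 0
    total = 0.0
    for doc_id in range(num_docs):     # iterate every doc in the index
        visited += 1
        if doc_id in matching:         # the filter agg's per-doc check
            matched += 1
            total += prices[doc_id]
    return (total / matched if matched else None), visited


num_docs = 1000
prices = [float(i % 50) for i in range(num_docs)]
postings = list(range(0, num_docs, 10))  # pretend every 10th doc is a t-shirt

avg_q, visited_q = avg_with_query(postings, prices)
avg_f, visited_f = avg_with_filter_agg(postings, prices, num_docs)
print(avg_q == avg_f, visited_q, visited_f)
```

Both paths produce the same average, but in this model the filter-agg path visits every doc in the index while the query path visits only the matching ones. How much that matters in practice depends on how selective the term is and on whatever shortcuts the real engine takes.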

