[Theoretical] How does query filtering upon nested aggregation work?

I've just used a really neat option: filtering on nested aggregation values. I love it and it works the way it's supposed to, but I'd like to understand how it works and in which order the filters and aggregations are executed.

Here is a sample query to get the point across:

{
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "car_producer": "Mercedes"
          }
        },
        {
          "terms": {
            "nested_aggregation": [4, 8]
          }
        }
      ]
    }
  },
  "aggs": {
    "agg_results": {
      "terms": {
        "field": "carModels"
      },
      "aggs": {
        "nested_aggregation": {
          "terms": {
            "field": "colours"
          }
        }
      }
    }
  }
}

As you can see, we first filter for all Mercedes cars.
In the first aggregation, we bucket the car models that Mercedes produced.
In the nested aggregation, we count the cars of each colour within each car model.
Going back to the second clause (the terms clause) in the query part, we want to display results only for car models that have exactly 4 or 8 cars of one colour.

This example is completely made up, but I hope that it gets the point across.
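
To make the intent concrete, here is a rough plain-Python sketch of the result I have in mind. It is not Elasticsearch code and says nothing about how Elasticsearch executes the request internally: the sample documents and the helper names (`colour_counts_per_model`, `keep_models_with_counts`) are invented for illustration, and the only assumption is that the aggregations are computed over the documents matched by the query.

```python
from collections import Counter, defaultdict

# Made-up documents; the field names follow the sample query.
docs = (
    [{"car_producer": "Mercedes", "carModels": "A-Class", "colours": "black"}] * 4
    + [{"car_producer": "Mercedes", "carModels": "A-Class", "colours": "white"}]
    + [{"car_producer": "Mercedes", "carModels": "C-Class", "colours": "red"}] * 2
    + [{"car_producer": "BMW", "carModels": "X5", "colours": "blue"}] * 4
)


def colour_counts_per_model(documents, producer):
    """Filter by producer, then bucket by model and count cars per colour."""
    matched = [d for d in documents if d["car_producer"] == producer]
    buckets = defaultdict(Counter)
    for d in matched:
        buckets[d["carModels"]][d["colours"]] += 1
    return matched, buckets


def keep_models_with_counts(buckets, wanted):
    """Keep only models where at least one colour has a wanted count."""
    return {
        model: dict(colours)
        for model, colours in buckets.items()
        if any(count in wanted for count in colours.values())
    }


matched, buckets = colour_counts_per_model(docs, "Mercedes")
kept = keep_models_with_counts(buckets, {4, 8})
print(len(matched))  # number of documents matched by the producer filter
print(kept)          # models that have a colour with exactly 4 or 8 cars
```

In this sketch the "4 or 8" condition is applied to the bucket counts after aggregating, which is the outcome I expect; whether Elasticsearch applies that clause to documents or to buckets is exactly what I'm asking below.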

  • How does this actually work?

  • In what order are the queries and aggregations executed?

  • Does this part of the query:

           "terms": {
             "nested_aggregation": [4, 8]
           }
    

    filter values from the documents themselves, or from the aggregation buckets?

  • From what I've noticed, the "hits" contain all the results, even for aggregated buckets whose doc count is larger than 1 (e.g. in the case of 4 aggregated buckets, all of them show up in the hits). Is that expected?
