Generating Correct Nested Filter Aggregation


#1

Hello,

I have a technical question about using the nested data type in the Query DSL.

Unfortunately, I only discovered today that I'm not building my nested queries the way I should.
I was generating the following incorrect query:

GET my_index/blogpost/_search
{
	"query" : {
		"bool" : {
			"must" : [{
					"query" : {
						"nested" : {
							"path" : "person",
							"query" : {
								"term" : {
									"person.name" : "george"
								}
							}
						}
					}
				}, {
					"query" : {
						"nested" : {
							"path" : "person",
							"query" : {
								"range" : {
									"person.itemsBought" : {
										"gt" : 5
									}
								}
							}
						}
					}
				}
			]
		}
	}
}

instead of producing the following correct query:

GET my_index/blogpost/_search
{
	"query" : {
		"nested" : {
			"path" : "person",
			"query" : {
				"bool" : {
					"must" : [{
							"term" : {
								"person.name" : "george"
							}
						}, {
							"range" : {
								"person.itemsBought" : {
									"gt" : 5
								}
							}
						}
					]
				}
			}
		}
	}
}

The second form is the right one, correct? It guarantees that both conditions are matched against the SAME nested document, rather than possibly being satisfied by two different nested documents.
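To make the difference concrete, here is a toy model in plain Python. It does not talk to Elasticsearch; it only imitates how I understand the matching to work, and the helper names and the sample document are made up for illustration.

```python
# Toy model: each blog post holds a list of nested "person" documents.
# This imitates the matching semantics only; it is not Elasticsearch code.

def is_george(person):
    return person["name"] == "george"


def bought_more_than_5(person):
    return person["itemsBought"] > 5


def matches_two_nested_queries(post):
    # bool/must of two separate nested queries: each condition may be
    # satisfied by a different nested document.
    people = post["person"]
    return any(is_george(p) for p in people) and any(
        bought_more_than_5(p) for p in people
    )


def matches_single_nested_query(post):
    # One nested query wrapping bool/must: both conditions must hold
    # for the same nested document.
    return any(is_george(p) and bought_more_than_5(p) for p in post["person"])


# Hypothetical blog post: george bought 2 items, someone else bought 9.
post = {
    "person": [
        {"name": "george", "itemsBought": 2},
        {"name": "anna", "itemsBought": 9},
    ]
}

print(matches_two_nested_queries(post))   # True
print(matches_single_nested_query(post))  # False
```

The first form matches this post even though no single person is both named george and has more than 5 items bought, which is exactly the false positive I want to avoid.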

My real question is about building the correct equivalent nested aggregation query.
Is there a difference between the two aggregations below?

Aggregation 1 (the outer aggregation name "persons" is just a placeholder):

GET my_index/blogpost/_search
{
	"aggs" : {
		"persons" : {
			"nested" : {
				"path" : "person"
			},
			"aggs" : {
				"filterByName" : {
					"filter" : {
						"term" : {
							"person.name" : "george"
						}
					},
					"aggs" : {
						"filterByItemsCount" : {
							"filter" : {
								"range" : {
									"person.itemsBought" : {
										"gt" : 5
									}
								}
							}
						}
					}
				}
			}
		}
	}
}

Aggregation 2 (the outer aggregation name "persons" is just a placeholder):

GET my_index/blogpost/_search
{
	"aggs" : {
		"persons" : {
			"nested" : {
				"path" : "person"
			},
			"aggs" : {
				"filterPerson" : {
					"filter" : {
						"bool" : {
							"must" : [{
									"term" : {
										"person.name" : "george"
									}
								}, {
									"range" : {
										"person.itemsBought" : {
											"gt" : 5
										}
									}
								}
							]
						}
					}
				}
			}
		}
	}
}

I believe Aggregation 1 might be less optimized, but the results of Aggregation 1 and Aggregation 2 should be the same, right? After all, I believe there is an AND relationship between the two filter aggregations in Aggregation 1, since filterByItemsCount is a sub-aggregation of filterByName.
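Here is how I picture it, again as a toy Python model rather than real Elasticsearch code. It assumes that a sub-aggregation only sees the documents that fell into its parent bucket, and that under a nested aggregation those documents are the individual nested person documents. The function name and sample data are invented for the sketch.

```python
# Toy model of a nested aggregation bucket: a flat list of nested
# "person" documents. Assumption: a filter aggregation keeps only the
# documents of its parent bucket that match the filter.

def filter_agg(bucket, predicate):
    return [doc for doc in bucket if predicate(doc)]


# Hypothetical nested documents collected by the nested aggregation.
nested_bucket = [
    {"name": "george", "itemsBought": 2},
    {"name": "george", "itemsBought": 8},
    {"name": "anna", "itemsBought": 9},
]

# Shape of Aggregation 1: filterByName, then filterByItemsCount inside it.
by_name = filter_agg(nested_bucket, lambda p: p["name"] == "george")
by_items = filter_agg(by_name, lambda p: p["itemsBought"] > 5)

# Shape of Aggregation 2: a single filter with bool/must of both conditions.
combined = filter_agg(
    nested_bucket,
    lambda p: p["name"] == "george" and p["itemsBought"] > 5,
)

print(len(by_name), len(by_items), len(combined))  # 2 1 1
```

If that assumption holds, the innermost doc_count of Aggregation 1 equals the doc_count of Aggregation 2, and the only visible difference is that Aggregation 1 also reports the intermediate count for filterByName.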

By the way, the full aggregation query also involves reverse_nested and terms aggregations, but again, I don't believe adding them to Aggregation 1 or Aggregation 2 changes anything, since they are all sub-aggregations of one another (and therefore AND-related).


#2

Anyone?

