Applying post_filter to limit inner_hits in nested document

Hi, I'm struggling with how to apply a post_filter to some nested documents.

What I'd like to do is run an aggregation over all my nested documents, but have only certain nested documents returned (the general idea of a post_filter). I've succeeded in applying the post_filter to filter out the root (parent) documents, but not in filtering the inner hits within those documents.

As an example, I've created two documents that each contain two people (with a name and age), and what I'd like to have returned is a) a histogram of their ages and b) the people who are under 25.

This gist contains the example: https://gist.github.com/EmilBode/550d3ad44220d1ce57a20dda428b15b5

The code as it is now successfully filters down to the right document, but the inner_hits contain not only Dave (age 20) but also Carlos (age 50), I think because he is in the same document as Dave.

What I would like to see is the same inner hits I would get with the range query included in the nested query, but with the aggregation still running over all the documents.

Is this possible, or should I just work around it by using two separate queries?

Hello,

This was a bit of a head-scratcher. I started to wonder whether this was possible at all, but this issue, with a related pull request, convinced me that it must be.

Then I stopped fixating on the path, and the various parameters available in the inner_hits object of the query, and it became clear.

Just put an inner_hits inside your post_filter.

  "post_filter": {
    "nested": {
      "path": "people",
      "inner_hits": {},
      "query": {
        "range": {
          "people.age": {
            "lte": 25
          }
        }
      }
    }
  }

and in your response you'll have

        "inner_hits" : {
          "people" : {
            "hits" : {
              "total" : {
                "value" : 1,
                "relation" : "eq"
              },
              "max_score" : 1.0,
              "hits" : [
                {
                  "_index" : "temp",
                  "_type" : "_doc",
                  "_id" : "2",
                  "_nested" : {
                    "field" : "people",
                    "offset" : 1
                  },
                  "_score" : 1.0,
                  "_source" : {
                    "name" : "Dave",
                    "age" : 20
                  }
                }
              ]
            }
          }
        }
      }
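
For completeness, here is a rough sketch of how the whole request body could be put together, written in Python only so that it is easy to build and inspect. The `match_all` query, the aggregation names `all_people` and `ages`, the histogram interval of 10 and the helper name `build_search_body` are my own assumptions and are not taken from your gist; only the `people` path, the `people.age` field and the post_filter itself come from this thread.

```python
import json


def build_search_body(max_age=25, interval=10):
    """Hypothetical helper: aggregate over all people, return only matching inner hits."""
    return {
        # Assumption: match every root document so the aggregation sees all people.
        "query": {"match_all": {}},
        "aggs": {
            # "all_people" and "ages" are made-up aggregation names.
            "all_people": {
                "nested": {"path": "people"},
                "aggs": {
                    "ages": {
                        "histogram": {
                            "field": "people.age",
                            "interval": interval,
                        }
                    }
                },
            }
        },
        # The post_filter narrows the returned hits without affecting the
        # aggregation. The inner_hits placed here follow the range query,
        # so only the nested people that match it are listed.
        "post_filter": {
            "nested": {
                "path": "people",
                "inner_hits": {},
                "query": {"range": {"people.age": {"lte": max_age}}},
            }
        },
    }


if __name__ == "__main__":
    # Print the body as JSON so it can be pasted into a _search request.
    print(json.dumps(build_search_body(), indent=2))
```

The printed JSON can be used as the body of a `_search` request against your index (`temp` in the response above). Keeping the inner_hits on the post_filter's nested query, rather than on a nested query in the main query, is what ties the listed people to the age range.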

Thanks, this was exactly what I was looking for!
Now that you've written it down, it seems perfectly logical to do it this way, yet I was scratching my head as well.
