Nested aggregation with a further nested filter constraint


#1

Hello, I'm using the Java API and am having a hard time making a query that aggregates as intended.

I've simplified the problem data:

  • Our root documents, "authors", have "books" (nested).
  • Books have a format ("ebook" or "hardcover"), and "chapters" (nested).
  • Chapters have a "chapter_type" ("prologue", "regular", "epilogue").

Here are the mappings for that:

curl -XPOST localhost:9200/test_index -d '{
"mappings": {
  "author": {
    "properties": {
      "books": {
        "type": "nested",
        "properties": {
          "format": {
            "type": "string",
            "index": "not_analyzed"
          },
          "chapters": {
            "type": "nested",
            "properties": {
              "chapter_type": {
                "type": "string",
                "index": "not_analyzed"
}}}}}}}}}'

Here is test data:

curl -XPUT 'localhost:9200/test_index/author/Mr_a' -d '{
"books": [ 
  { "format": "ebook",
    "chapters": [
      { "chapter_type" : "prologue" },
      { "chapter_type" : "regular" },
      { "chapter_type" : "regular" },
      { "chapter_type" : "epilogue" }
  ]},
  { "format": "hardcover",
    "chapters": [
      { "chapter_type" : "prologue"  },
      { "chapter_type" : "regular"  }
  ]}
]}'

I'm trying to get the count per format (a terms aggregation on books.format), restricted to books that have an epilogue.
(In practice I have all sorts of other constraints and query parts on both "books" and "chapters".)
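
To make the expected result concrete, here is a minimal plain-Java sketch (no Elasticsearch involved) of the count I'm after on the test data. The class and method names (ExpectedFormatCounts, Book, countFormats, mrA) are made up for illustration; this only models the intended semantics, it is not a working query.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class ExpectedFormatCounts {

    /** Hypothetical model of one nested "books" entry: its format plus the chapter_type of each chapter. */
    static final class Book {
        final String format;
        final List<String> chapterTypes;

        Book(String format, List<String> chapterTypes) {
            this.format = format;
            this.chapterTypes = chapterTypes;
        }
    }

    /** Counts books per format, keeping only books with at least one chapter of the required type. */
    static Map<String, Long> countFormats(List<Book> books, String requiredChapterType) {
        Map<String, Long> counts = new TreeMap<>();
        for (Book book : books) {
            if (book.chapterTypes.contains(requiredChapterType)) {
                counts.merge(book.format, 1L, Long::sum);
            }
        }
        return counts;
    }

    /** The two books of author "Mr_a" from the test data. */
    static List<Book> mrA() {
        return List.of(
            new Book("ebook", List.of("prologue", "regular", "regular", "epilogue")),
            new Book("hardcover", List.of("prologue", "regular")));
    }

    public static void main(String[] args) {
        // Only the ebook has an epilogue, so the hardcover must not be counted.
        System.out.println(countFormats(mrA(), "epilogue")); // prints {ebook=1}
    }
}
```

The important part is that the chapter filter decides which books get counted, not just which authors match.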
I tried all sorts of combinations of aggregation filters and nested filters without success. I either get counts for all books of an author, or nothing at all.
Here's one of the queries I tried. It's not applying the filter constraints as intended: I'm getting back counts of ebook:1 and hardcover:1, when what I'm expecting is ebook:1 only.

curl -XGET 'localhost:9200/test_index/author/_search' -d '{
  "query" : {
    "bool" : {
      "should" : {
        "nested" : {
          "path" : "books",
          "query" : {
            "filtered" : {
              "query" : {
                "bool" : { }
              },
              "filter" : {
                "and" : {
                  "filters" : [ {
                    "nested" : {
                      "path" : "books.chapters",
                      "filter" : {
                        "and" : {
                          "filters" : [ {
                            "terms" : {
                              "books.chapters.chapter_type" : [ "epilogue" ]
                            }
                          } ]
                 }}} } ]
          }}}},
          "inner_hits" : {
            "name" : "books",
            "_source" : false
    }}}}
  },
  "_source" : false,
  "aggregations" : {
    "filtered_aggregation" : {
      "filter" : {
        "nested" : {
          "filter" : {
            "nested" : {
              "filter" : {
                "term" : {
                  "books.chapters.chapter_type" : "epilogue"
                }
              },
              "path" : "books.chapters"
            }
          },
          "path" : "books"
        }
      },
      "aggregations" : {
        "sub_aggregation_books" : {
          "nested" : {
            "path" : "books"
          },
          "aggregations" : {
            "count_formats" : {
              "terms" : {
                "field" : "books.format"
}}}}}}}}
'

I have made a pastebin with the full data, given the length limitations here:
http://pastebin.com/CcjVVezE
It also contains the response to the query above and Java snippets from my attempts.

Obviously there's something about nested aggregations and filter aggregations that I'm not getting right. Can somebody help me?

Thanks!

