Inconsistent behaviour when using include inside filter aggregation

mpritsch · November 29, 2016, 12:25pm

We noticed an inconsistency when using a filter aggregation in combination with include/exclude on a terms sub-aggregation.
An included term whose doc_count is 0 is only returned as a bucket if the doc_count of the enclosing filter aggregation is > 0.

Below are the mapping and two documents of type 'book' with the fields title, author and narrator. Both documents have the same author.

PUT filter
{
  "settings": {
    "number_of_shards": 1
  },
  "mappings": {
    "book": {
      "properties": {
        "title": {
          "type": "keyword"
        },
        "author": {
          "type": "keyword"
        },
        "narrator": {
          "type": "keyword"
        }
      }
    }
  }
}

Documents:

PUT filter/book/1
{
    "title" : "Mango",
    "author" : "Winton",
    "narrator" : "Moritz"
}

PUT filter/book/2
{
    "title" : "Banana",
    "author" : "Winton",
    "narrator" : "Max"
}

The following is a terms aggregation on the field "title" that includes only "Mango" and sets "min_doc_count": 0 so that buckets with no matching documents are returned.
The filter matches book 2 only.
The sub-aggregation on the field title is therefore performed on book 2 only.

POST filter/book/_search
{
  "size": 0,
  "aggregations": {
    "titleIncluded": {
      "filter": {
        "bool": {
          "must": [
            {
              "terms": {
                "title": [
                  "Banana"
                ]
              }
            },
            {
              "bool": {
                "must_not": [
                  {
                    "terms": {
                      "narrator": [
                        "Moritz"
                      ]
                    }
                  }
                ]
              }
            }
          ]
        }
      },
      "aggregations": {
        "titleSubAggregation": {
          "terms": {
            "field": "title",
            "min_doc_count": 0,
            "include": [
              "Mango"
            ]
          }
        }
      }
    }
  }
}

The subaggregation results in a bucket for "Mango" with doc_count being 0.

Result:

{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "titleIncluded": {
      "doc_count": 1,
      "titleSubAggregation": {
        "doc_count_error_upper_bound": 0,
        "sum_other_doc_count": 0,
        "buckets": [
          {
            "key": "Mango",
            "doc_count": 0
          }
        ]
      }
    }
  }
}

The only thing we changed below is the filter, which now matches neither of the two documents.

POST filter/book/_search
{
  "size": 0,
  "aggregations": {
    "titleIncluded": {
      "filter": {
        "bool": {
          "must": [
            {
              "terms": {
                "narrator": [
                  "Max"
                ]
              }
            },
            {
              "bool": {
                "must_not": [
                  {
                    "terms": {
                      "title": [
                        "Banana"
                      ]
                    }
                  }
                ]
              }
            }
          ]
        }
      },
      "aggregations": {
        "titleSubAggregation": {
          "terms": {
            "field": "title",
            "min_doc_count": 0,
            "include": [
              "Mango"
            ]
          }
        }
      }
    }
  }
}

As the result shows, no bucket is returned at all.

Result:

{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "titleIncluded": {
      "doc_count": 0,
      "titleSubAggregation": {
        "doc_count_error_upper_bound": 0,
        "sum_other_doc_count": 0,
        "buckets": []
      }
    }
  }
}
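
To double-check which documents each of the two filters selects, both bool filters can be evaluated by hand against the two documents. The following is only a minimal Python sketch of that check, not Elasticsearch code; the helper `matches` is a hypothetical name and only emulates `terms` clauses under `must` and `must_not`.

```python
# The two documents from the example above.
books = {
    1: {"title": "Mango", "author": "Winton", "narrator": "Moritz"},
    2: {"title": "Banana", "author": "Winton", "narrator": "Max"},
}


def matches(doc, must, must_not):
    """Hypothetical helper: emulate a bool filter made of `terms` clauses.

    `must` and `must_not` map a field name to its list of accepted values.
    """
    ok = all(doc[field] in values for field, values in must.items())
    excluded = any(doc[field] in values for field, values in must_not.items())
    return ok and not excluded


# First request: title is "Banana" and narrator is not "Moritz".
first = [i for i, d in books.items()
         if matches(d, {"title": ["Banana"]}, {"narrator": ["Moritz"]})]

# Second request: narrator is "Max" and title is not "Banana".
second = [i for i, d in books.items()
          if matches(d, {"narrator": ["Max"]}, {"title": ["Banana"]})]

print(first)   # [2]
print(second)  # []
```

This agrees with the doc_count of 1 and 0 that the filter aggregation reports in the two results.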

It seems that in the second case the terms aggregation is not performed at all because the filter matches 0 documents.
Is this intended, or is it a bug? To me it looks like inconsistent behaviour when using include inside a filter aggregation.
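
Until that is answered, one possible client-side stopgap is to add the missing zero-count buckets after the response arrives. This is only a sketch under assumptions: `fill_missing_buckets` is a hypothetical helper, it only makes sense when include is an explicit list of terms (not a regular expression), and it does not check whether a term actually exists in the index.

```python
def fill_missing_buckets(terms_agg, include):
    """Hypothetical helper: return the buckets of a terms aggregation result,
    plus a zero-count bucket for every term in `include` that is missing."""
    buckets = list(terms_agg.get("buckets", []))
    present = {b["key"] for b in buckets}
    for term in include:
        if term not in present:
            buckets.append({"key": term, "doc_count": 0})
    return buckets


# The "titleSubAggregation" part of the second response above.
empty_result = {
    "doc_count_error_upper_bound": 0,
    "sum_other_doc_count": 0,
    "buckets": [],
}

filled = fill_missing_buckets(empty_result, ["Mango"])
print(filled)  # [{'key': 'Mango', 'doc_count': 0}]
```

Applied to the first response, the helper leaves the existing "Mango" bucket unchanged, so both cases end up with the same bucket list.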
