Terms aggregation shows irrelevant data

Hi,

I'm seeing weird behaviour when using a terms aggregation on an integer field, with ES 6 (migrated from 5.x) on Debian 9.

Here is the first request I run, to check that I do not have any data >= 60:

{
  "query": {
    "bool": {
      "filter": [
        {
          "term": {
            "media_id": "aaa"
          }
        },
        {
          "range": {
            "eng.visu": {
              "gte": 60
            }
          }
        }
      ]
    }
  },
  "size": 9999
} 

Result is as expected:

{
  "took": 483,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 0,
    "max_score": null,
    "hits": []
  }
}

But then I run a terms aggregation on the same data:

{
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "media_id": "aaa"
          }
        }
      ]
    }
  },
  "aggs": {
    "__all__": {
      "terms": {
        "field": "eng.visu",
        "size": 9999
      }
    }
  },
  "size": 0
}

And the result:

{
  "took": 24,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 18670,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "__all__": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": 1,
          "doc_count": 690
        },
        {
          "key": 0,
          "doc_count": 674
        },
        {
          "key": 2,
          "doc_count": 655
        },
        ...
        {
          "key": 80,
          "doc_count": 298
        },
        {
          "key": 82,
          "doc_count": 298
        },
        ...
        {
          "key": 5276,
          "doc_count": 1
        }
      ]
    }
  }
}

As you can see, I get keys that are well above 60.
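
To list exactly which buckets contradict the range query, the aggregation response can be scanned with a few lines of Python. This is only a sketch: it assumes the response has already been parsed into a dict, the helper name `buckets_at_or_above` is made up for illustration, and the sample data is an abbreviated copy of the response above (the elided buckets are left out).

```python
from typing import Any, Dict, List, Tuple


def buckets_at_or_above(
    response: Dict[str, Any], agg_name: str, threshold: int
) -> List[Tuple[int, int]]:
    """Return (key, doc_count) pairs for terms buckets whose key is >= threshold."""
    buckets = response["aggregations"][agg_name]["buckets"]
    return sorted(
        (b["key"], b["doc_count"]) for b in buckets if b["key"] >= threshold
    )


# Abbreviated copy of the aggregation response from this thread;
# the buckets hidden behind "..." in the post are not reproduced here.
sample_response = {
    "aggregations": {
        "__all__": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
                {"key": 1, "doc_count": 690},
                {"key": 0, "doc_count": 674},
                {"key": 2, "doc_count": 655},
                {"key": 80, "doc_count": 298},
                {"key": 82, "doc_count": 298},
                {"key": 5276, "doc_count": 1},
            ],
        }
    }
}

unexpected = buckets_at_or_above(sample_response, "__all__", 60)
print(unexpected)  # -> [(80, 298), (82, 298), (5276, 1)]
```

Any pair printed here is a bucket that the range query with "gte": 60 says should not exist for this media_id.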

Has anyone seen this behaviour before, and does anyone have a clue how to fix it?

Thanks for your help.

What does the entire response look like? (including total hits)

Hi, thank you for your answer. I've updated the original post to include the full content of both responses.

This looks wrong indeed. Is it 100% reproducible? Can you find the document that has 5276 as a value for eng.visu somehow?
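
One possible way to look for that document is a search that filters on both the media_id and the exact value. The Python below only builds the request body and prints it as JSON; the helper name `build_lookup_body` and the default size of 10 are assumptions for illustration, and the body would still have to be sent to the index's _search endpoint by whatever client is in use.

```python
import json
from typing import Any, Dict


def build_lookup_body(media_id: str, field: str, value: int, size: int = 10) -> Dict[str, Any]:
    """Build a search body matching documents with this media_id AND this exact field value."""
    return {
        "query": {
            "bool": {
                "filter": [
                    {"term": {"media_id": media_id}},
                    {"term": {field: value}},
                ]
            }
        },
        "size": size,
    }


# Values taken from the thread: media_id "aaa" and the suspicious key 5276.
lookup_body = build_lookup_body("aaa", "eng.visu", 5276)
print(json.dumps(lookup_body, indent=2))
```

If this search returns no hits while the terms aggregation still reports a bucket for 5276, the aggregation is counting a document that the query itself does not match.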

Yes, there are documents with the value 5276 (~14K), but none with media_id "aaa".

I've updated to 6.1.1 and still see the wrong behaviour.
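
A further cross-check that might help narrow this down is to run the range condition and the terms aggregation in a single request, using a filter aggregation with a nested terms aggregation, so that both see exactly the same query. The sketch below only builds the request body in Python; the helper name `build_cross_check_body` and the aggregation names "at_or_above" and "values" are made up for illustration.

```python
import json
from typing import Any, Dict


def build_cross_check_body(
    media_id: str, field: str, threshold: int, size: int = 9999
) -> Dict[str, Any]:
    """Build a body whose terms aggregation only sees documents with field >= threshold."""
    return {
        "query": {"bool": {"filter": [{"term": {"media_id": media_id}}]}},
        "aggs": {
            # Filter aggregation: narrows the matched documents to field >= threshold.
            "at_or_above": {
                "filter": {"range": {field: {"gte": threshold}}},
                # Nested terms aggregation: buckets only the documents kept by the filter.
                "aggs": {"values": {"terms": {"field": field, "size": size}}},
            }
        },
        "size": 0,
    }


cross_check_body = build_cross_check_body("aaa", "eng.visu", 60)
print(json.dumps(cross_check_body, indent=2))
```

The doc_count of the "at_or_above" bucket can then be compared directly with the 0 hits returned by the range query from the original post.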
