Terms aggregation shows irrelevant data


#1

Hi,

I'm seeing weird behaviour when using a terms aggregation on an integer field, with ES 6 (migrated from 5.x) on Debian 9.

Here is a first request, which I run to check that I have no documents with eng.visu >= 60 for this media_id:

{
  "query": {
    "bool": {
      "filter": [
        {
          "term": {
            "media_id": "aaa"
          }
        },
        {
          "range": {
            "eng.visu": {
              "gte": 60
            }
          }
        }
      ]
    }
  },
  "size": 9999
} 

The result is as expected:

{
  "took": 483,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 0,
    "max_score": null,
    "hits": []
  }
}

But then I run a terms aggregation on the same data:

{
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "media_id": "aaa"
          }
        }
      ]
    }
  },
  "aggs": {
    "__all__": {
      "terms": {
        "field": "eng.visu",
        "size": 9999
      }
    }
  },
  "size": 0
}

And the result:

{
  "took": 24,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 18670,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "__all__": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": 1,
          "doc_count": 690
        },
        {
          "key": 0,
          "doc_count": 674
        },
        {
          "key": 2,
          "doc_count": 655
        },
        ...
        {
          "key": 80,
          "doc_count": 298
        },
        {
          "key": 82,
          "doc_count": 298
        },
        ...
        {
          "key": 5276,
          "doc_count": 1
        }
      ]
    }
  }
}

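As you can see, the aggregation returns keys well above 60 (80, 82, up to 5276), even though the first request found no document with eng.visu >= 60 for this media_id.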
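
Both checks can also be combined into one request, so the contradiction shows up in a single response. This is only a sketch: the aggregation names visu_gte_60 and visu_terms are made up for illustration, and it places a filter aggregation next to the terms aggregation, both scoped by the same media_id filter.

```json
{
  "query": {
    "bool": {
      "filter": [
        {
          "term": {
            "media_id": "aaa"
          }
        }
      ]
    }
  },
  "aggs": {
    "visu_gte_60": {
      "filter": {
        "range": {
          "eng.visu": {
            "gte": 60
          }
        }
      }
    },
    "visu_terms": {
      "terms": {
        "field": "eng.visu",
        "size": 9999
      }
    }
  },
  "size": 0
}
```

If the two code paths agree, visu_gte_60 should report a doc_count of 0 and visu_terms should contain no key of 60 or more.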

Has anyone already seen this behaviour, or have a clue how to fix it?

Thanks for your help.


(Adrien Grand) #2

What does the entire response look like? (including total hits)


#3

Hi, thank you for your answer. I've updated the original post to include the full content of both responses.


(Adrien Grand) #4

This looks wrong indeed. Is it 100% reproducible? Can you find the document that has 5276 as a value for eng.visu somehow?
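
One way to look for such a document is a direct lookup that combines both conditions. A minimal sketch, assuming eng.visu is mapped as an integer so that a term query on the number matches it:

```json
{
  "query": {
    "bool": {
      "filter": [
        {
          "term": {
            "media_id": "aaa"
          }
        },
        {
          "term": {
            "eng.visu": 5276
          }
        }
      ]
    }
  },
  "size": 10
}
```

If this returns zero hits while the aggregation still reports a bucket for 5276, then the search and the aggregation disagree about the same set of documents.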


#5

Yes, there are documents with the value 5276 (~14K), but none of them has the media_id "aaa".


#6

I updated to 6.1.1 and still see the wrong behaviour.

