When aggregated by terms the value is incorrect


(Jasmine) #1

I am trying to create a visualization and found that the aggregated value changes when I split by terms.
The query looks like this:

{
  "size": 0,
  "query": {
    "query_string": {
      "query": "*"
    }
  },
  "aggs": {
    "2": {
      "terms": {
        "field": "store_nbr",
        "size": 30,
        "order": {
          "1": "desc"
        }
      },
      "aggs": {
        "1": {
          "sum": {
            "field": "amt"
          }
        }
      }
    }
  }
}

One of the returned buckets, for store_nbr 5408, is:

{
  "1": {
    "value": 16953.470004558563
  },
  "key": 5408,
  "doc_count": 14
}

But when I query for store_nbr 5408 alone, it gives me:

{
  "1": {
    "value": 21818.200009822845
  },
  "key": 5408,
  "doc_count": 51
}

The sum is off, and the doc_count is different too (14 vs. 51). I also noticed that "doc_count_error_upper_bound" in the response to the terms query is -1.

How does this happen? And is there any way to fix it?

Thanks!!


(Brandon Kobel) #2

Hey @jasmine.liang, the doc counts for terms aggregations are approximate (and a sub-aggregation such as your sum only covers the documents that were counted), per https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html#search-aggregations-bucket-terms-aggregation-approximate-counts
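
To make the mechanism a bit more concrete, here is a rough Python sketch of how I understand it: each shard returns only its own top buckets, and the coordinating node merges whatever it receives. This is a simplified toy model, not Elasticsearch code. The function names and the two-shard data set are made up for illustration. As far as I recall from the same documentation page, ordering by a sub-aggregation (as the query above does) is also the case where the error cannot be bounded, which is why "doc_count_error_upper_bound" comes back as -1.

```python
from collections import defaultdict


def shard_top_terms(docs, shard_size):
    """Toy model of one shard: bucket by store, keep the top `shard_size` by sum(amt)."""
    buckets = defaultdict(lambda: {"doc_count": 0, "sum": 0.0})
    for store, amt in docs:
        buckets[store]["doc_count"] += 1
        buckets[store]["sum"] += amt
    ranked = sorted(buckets.items(), key=lambda kv: kv[1]["sum"], reverse=True)
    return dict(ranked[:shard_size])


def merged_terms(shards, shard_size):
    """Toy model of the coordinating node: add up only what each shard returned."""
    merged = defaultdict(lambda: {"doc_count": 0, "sum": 0.0})
    for docs in shards:
        for store, bucket in shard_top_terms(docs, shard_size).items():
            merged[store]["doc_count"] += bucket["doc_count"]
            merged[store]["sum"] += bucket["sum"]
    return dict(merged)


# Hypothetical data: (store_nbr, amt) pairs spread over two shards.
shards = [
    [(5408, 100.0), (5408, 50.0), (1, 10.0)],
    [(5408, 5.0), (2, 500.0), (3, 400.0)],
]

# With a small shard_size, store 5408 does not make the cut on the second shard,
# so its document there is missing from both doc_count and the sum.
approx = merged_terms(shards, shard_size=2)

# With a shard_size large enough to return every bucket, nothing is dropped.
exact = merged_terms(shards, shard_size=10)

print(approx[5408])
print(exact[5408])
```

Filtering the query down to a single store avoids the problem for that store because its bucket is then the only one on every shard, which would explain the higher doc_count in the second response.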


(Jasmine) #3

Thanks Brandon!

This document really explains what happened!

Is there any way to fix this, for example by temporarily increasing the memory or the shard size? Is there anything I can put into the advanced JSON input to get more accurate results?


(Brandon Kobel) #4

Hey @jasmine.liang, you can use the advanced JSON input, similar to the following, to adjust the "shard_size" discussed in the terms aggregation documentation.

{
  "shard_size": 10
}
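
For anyone running the query directly rather than through the visualization, here is a small Python sketch of where "shard_size" would sit in the request body from the first post. The helper name and the value 300 are arbitrary choices for illustration, not recommendations. One thing to keep in mind: per the terms aggregation documentation, a "shard_size" smaller than "size" is reset to "size", so with "size": 30 the value would need to be above 30 to change anything.

```python
import json


def terms_with_shard_size(size, shard_size):
    """Build the request body from the first post with shard_size added (hypothetical helper)."""
    return {
        "size": 0,
        "aggs": {
            "2": {
                "terms": {
                    "field": "store_nbr",
                    "size": size,
                    # Ask each shard for more candidate buckets than the final `size`.
                    "shard_size": shard_size,
                    "order": {"1": "desc"},
                },
                "aggs": {"1": {"sum": {"field": "amt"}}},
            }
        },
    }


# 300 is an arbitrary illustrative value; larger values cost more memory and network per shard.
body = terms_with_shard_size(size=30, shard_size=300)
print(json.dumps(body, indent=2))
```

A larger "shard_size" reduces the chance of a bucket being dropped on some shard, but it does not guarantee exact results unless every shard returns all of its terms.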

(Jasmine) #5

Thanks Brandon! This really helps!

