Does the terms aggregation also return approximate values for sum/avg sub-aggregations?

Kramer_Li · February 21, 2019, 3:22am

I understand that the terms aggregation returns approximate results for the doc count.
But what about metric aggregations nested under a terms aggregation, for example sum or avg?
Are those approximate too?

Mark_Harwood · February 21, 2019, 10:06am

When the doc count is approximate (because of high-cardinality fields on distributed data), any child aggregations (sum, avg, etc.) are computed over an incomplete set of docs, so their values will be inaccurate too.
To overcome this, you can run an initial query to find the top terms, then run a second query with a terms agg whose include parameter lists those values in an array, with child aggs computing sum, avg, etc. The values will be correct for those terms.
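
The two-query approach can be sketched roughly as follows. This is a minimal Python sketch that only builds the two request bodies as dicts; it does not call Elasticsearch. The function names, the aggregation names (top_terms, selected_terms, total, average) and the metric field are made up for illustration, and it assumes the terms field is a keyword field so that include can take an array of exact values.

```python
def top_terms_query(field, size=10, shard_size=100):
    """First pass: a plain terms agg used only to discover the top terms."""
    return {
        "size": 0,
        "aggs": {
            "top_terms": {
                "terms": {"field": field, "size": size, "shard_size": shard_size}
            }
        },
    }


def bucket_keys(response, agg_name="top_terms"):
    """Pull the term values out of a terms-agg response."""
    return [b["key"] for b in response["aggregations"][agg_name]["buckets"]]


def metrics_for_terms_query(field, terms, metric_field):
    """Second pass: restrict the terms agg to known terms and add sum/avg."""
    terms = list(terms)
    if not terms:
        raise ValueError("need at least one term from the first pass")
    return {
        "size": 0,
        "aggs": {
            "selected_terms": {
                # include limits the buckets to exactly these values,
                # so every shard reports on every listed term.
                "terms": {"field": field, "include": terms, "size": len(terms)},
                "aggs": {
                    "total": {"sum": {"field": metric_field}},
                    "average": {"avg": {"field": metric_field}},
                },
            }
        },
    }
```

Send the body from top_terms_query first, pass its response through bucket_keys, then send the body from metrics_for_terms_query with those keys.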

Kramer_Li · February 21, 2019, 12:21pm

The docs also say that using "shard_size" will improve the accuracy, but they do not give any example for this parameter.
Do you know how to use shard_size? Could you give an example, please? Thanks

Mark_Harwood · February 21, 2019, 3:11pm

Here's an example:

GET myindex/_search
{
  "size": 0,
  "aggs": {
    "my_agg": {
      "terms": {
        "field": "my_field",
        "size": 10,
        "shard_size": 100
      }
    }
  }
}

Pay attention to the doc_count_error_upper_bound in the results - when that's zero you know the doc counts are accurate.
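
A small sketch of that check in Python, assuming the search response has already been parsed into a dict. The function name is hypothetical, and the two response fragments are hand-written for illustration, not real output.

```python
def doc_counts_are_exact(response, agg_name):
    """Return True when the terms agg reports a zero doc-count error bound."""
    agg = response["aggregations"][agg_name]
    return agg["doc_count_error_upper_bound"] == 0


# Hand-written, illustrative response fragments (not real output).
inexact_response = {
    "aggregations": {
        "my_agg": {
            "doc_count_error_upper_bound": 46,
            "sum_other_doc_count": 1200,
            "buckets": [],
        }
    }
}
exact_response = {
    "aggregations": {
        "my_agg": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 1200,
            "buckets": [],
        }
    }
}

for name, resp in [("inexact", inexact_response), ("exact", exact_response)]:
    print(name, doc_counts_are_exact(resp, "my_agg"))
```

If the check fails, one option is to rerun the query with a larger shard_size and check again.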
