I am using the value count aggregation over a fixed time interval (the data in the shards doesn't change within this interval). However, I get different results each time I run the query. Is it possible that I am getting approximate counts instead of exact counts?
p.s. My data is distributed across multiple nodes and shards.
I expect that this is an approximate count. That's called out for a terms aggregation, see: Terms Agg doc count error. I'm surprised it's not called out for value_count in the docs too. I'll go digging and see if this is a gap in our docs.
@Atefeh, after talking with the team that maintains the value_count agg, it seems my initial assumption was wrong: it should be an exact count, NOT an approximate one. It may be that you've found a bug, or that your cluster is having issues.
Can you share some more info?
What version of Elasticsearch are you using?
Can you reproduce this on another index? If so, can you provide the reproduction steps?
Can you share the full output of the responses that differ, and show that there are no partial shard failures?
Can you check the Elasticsearch logs during the query and share any errors or warnings?
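As a rough illustration of the partial shard failure check asked about above: a search response carries a `_shards` section with `total`, `successful`, and `failed` counts, plus a `failures` list when any shard failed. The sketch below inspects an already-parsed response; the function name `summarize_shard_failures` and the sample response are made up for illustration and are not taken from this thread.

```python
def summarize_shard_failures(response):
    """Summarize the `_shards` section of a parsed search response (a dict)."""
    shards = response.get("_shards", {})
    failures = shards.get("failures", [])
    failed = shards.get("failed", 0)
    return {
        "total": shards.get("total", 0),
        "successful": shards.get("successful", 0),
        "failed": failed,
        # Any failed shard means the aggregation only covers part of the data.
        "is_partial": failed > 0,
        # Distinct exception types reported by the failing shards.
        "reason_types": sorted(
            {f.get("reason", {}).get("type", "unknown") for f in failures}
        ),
    }


# Hypothetical response, trimmed to the fields the check needs.
sample_response = {
    "_shards": {
        "total": 5,
        "successful": 4,
        "skipped": 0,
        "failed": 1,
        "failures": [
            {
                "shard": 2,
                "index": "my-index",  # hypothetical index name
                "reason": {"type": "circuit_breaking_exception"},
            }
        ],
    }
}

summary = summarize_shard_failures(sample_response)
print(summary)
```

If `is_partial` is true for some runs and false for others, the differing counts are most likely explained by which shards answered, not by the aggregation itself being approximate.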
Thank you for your attention. I am using Elasticsearch v7.17.9.
You are right. There are some shard failures in the query response.
{
  "type": "circuit_breaking_exception",
  "reason": "[parent] Data too large, data for [indices:data/read/search[phase/query]] would be [35868302350/33.4gb], which is larger than the limit of [35701915648/33.2gb], real usage: [35868301824/33.4gb], new bytes reserved: [526/526b], usages [request=0/0b, fielddata=20312178761/18.9gb, in_flight_requests=1052/1kb, model_inference=0/0b, eql_sequence=0/0b, accounting=469451628/447.7mb]",
  "bytes_wanted": 35868302350,
  "bytes_limit": 35701915648,
  "durability": "PERMANENT"
}
I also observed a GC message on the node where the shard failed: (GC did not bring memory usage down, before [35712048256], after [35712106128])
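To make the numbers in the exception easier to read, here is a minimal sketch that pulls the per-breaker byte counts out of the `reason` string quoted above and compares them with `bytes_limit`. The helper name `breaker_usages` is hypothetical, and the parsing assumes the message keeps the `name=bytes/human` layout shown in this thread.

```python
import re

# The "reason" string and "bytes_limit" value, copied from the exception above.
REASON = (
    "[parent] Data too large, data for [indices:data/read/search[phase/query]] "
    "would be [35868302350/33.4gb], which is larger than the limit of "
    "[35701915648/33.2gb], real usage: [35868301824/33.4gb], new bytes reserved: "
    "[526/526b], usages [request=0/0b, fielddata=20312178761/18.9gb, "
    "in_flight_requests=1052/1kb, model_inference=0/0b, eql_sequence=0/0b, "
    "accounting=469451628/447.7mb]"
)
BYTES_LIMIT = 35701915648


def breaker_usages(reason):
    """Return {breaker_name: bytes} parsed from the 'usages [...]' part."""
    # Only look at the text after "usages [" so other brackets are ignored.
    _, _, tail = reason.partition("usages [")
    return {name: int(value) for name, value in re.findall(r"(\w+)=(\d+)/", tail)}


usages = breaker_usages(REASON)
largest = max(usages, key=usages.get)
share = usages[largest] / BYTES_LIMIT

print(f"largest tracked usage: {largest} ({share:.1%} of the parent limit)")
```

On this message, fielddata is by far the largest tracked consumer, at a bit over half of the parent limit. The tracked usages do not add up to the reported real usage, so this only shows which tracked breaker dominates, not everything that occupies the heap.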