Inaccurate sum aggregation results

Hi all,
I am indexing a large number of documents in batch mode, about 50M docs in total.
Since indexing was too slow (200K docs/hour) on a cluster with 5 data nodes, I enlarged the cluster with 2 more data nodes. I watched the data spread onto the new nodes, and all shards were reallocated properly.

When validating the data, I found that although all of the docs are in the cluster, the sum results are wrong: the numbers are lower than expected.

  • I am running a sum aggregation, filtered by specific terms.
  • Different aggregations return different values: when I aggregate by terms and sum the value, I get a doc_count of 3 for a given term. When I filter on that term only, I get a doc_count of 4. So for some reason one document is left out of the terms aggregation.
  • When I get the doc_count of 3, I also get doc_count_error_upper_bound=-1.
  • In _shards I always get a high number of "skipped". I think it just means those shards do not hold any relevant docs, but I am not 100% sure about that.
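
For illustration only, the two request shapes compared in the list above might look roughly like the following sketch. The field names `term_field` and `value` and the term `some-term` are hypothetical placeholders, not the actual mapping or data, and the real queries may differ in size, ordering, or filters.

```python
import json

# Hypothetical names (assumptions): replace with the real mapping and term.
TERM_FIELD = "term_field"
VALUE_FIELD = "value"
TERM = "some-term"


def terms_then_sum_body(size=10):
    """Request body: bucket docs by term, then sum the value in each bucket."""
    return {
        "size": 0,  # no hits needed, only aggregation results
        "aggs": {
            "by_term": {
                "terms": {"field": TERM_FIELD, "size": size},
                "aggs": {"total": {"sum": {"field": VALUE_FIELD}}},
            }
        },
    }


def filter_then_sum_body(term=TERM):
    """Request body: keep only docs with one term, then sum the value."""
    return {
        "size": 0,
        "query": {"bool": {"filter": [{"term": {TERM_FIELD: term}}]}},
        "aggs": {"total": {"sum": {"field": VALUE_FIELD}}},
    }


if __name__ == "__main__":
    # Print both bodies as JSON so they can be pasted into a _search request.
    print(json.dumps(terms_then_sum_body(), indent=2))
    print(json.dumps(filter_then_sum_body(), indent=2))
```

Each body would be sent to the `_search` endpoint of the index in question. Comparing the bucket's doc_count from the first response with the number of matching docs in the second is the discrepancy described above.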

I have been working with Elasticsearch for a long time and have never encountered this problem.
I am on 6.2.

Please advise,
Shushu

Can you supply examples of the 2 JSON queries where you see a discrepancy between them?
