Missing documents when using partitions on a term aggregation

Hi all,

I have a relatively small index (about 5M docs) and each document has an organization_id field. I need to list all the unique values for this field and I've implemented this using partitions by following this:

https://www.elastic.co/guide/en/elasticsearch/reference/master/search-aggregations-bucket-terms-aggregation.html#_filtering_values_with_partitions

It works in most cases, but I found a few cases where the total number of results returned changes when I change the number of partitions.

I know my query should return about 19000 values. If I do a single query with only 1 partition, or 5 queries with 5 partitions, I get more results than if I do 20 queries with 20 partitions. The missing results (484 of them) are consistent: it is always the same ones.

I've double-checked to make sure I was not skipping or missing a partition, so I am at a loss to explain this behaviour. I am using AWS Elasticsearch 6.2.
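For reference, here is a minimal sketch of the kind of partitioned requests described above, not the exact code in use. The aggregation name `values`, the helper names, and the `size` of 2000 are illustrative assumptions; `include.partition`, `include.num_partitions` and `size` are the terms aggregation parameters from the linked documentation. The functions only build request bodies and merge response buckets, so they can be checked without a cluster.

```python
def partition_request(field, partition, num_partitions, size):
    """Build the search body for one partition of a terms aggregation."""
    return {
        "size": 0,  # no hits needed, only the aggregation
        "aggs": {
            "values": {  # hypothetical aggregation name
                "terms": {
                    "field": field,
                    "include": {
                        "partition": partition,
                        "num_partitions": num_partitions,
                    },
                    # size must be large enough to hold every term
                    # that falls into this partition
                    "size": size,
                }
            }
        },
    }


def all_partition_requests(field, num_partitions, size):
    """One request body per partition, numbered 0 .. num_partitions - 1."""
    return [
        partition_request(field, p, num_partitions, size)
        for p in range(num_partitions)
    ]


def merge_keys(responses):
    """Union of the bucket keys across all partition responses."""
    keys = set()
    for resp in responses:
        for bucket in resp["aggregations"]["values"]["buckets"]:
            keys.add(bucket["key"])
    return keys


requests_20 = all_partition_requests("organization_id", 20, 2000)
```

Comparing `merge_keys` for the 5-partition run against the 20-partition run would show exactly which values go missing.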

Any help or comments are welcome.

Did you check that the doc count error bounds returned in the results were zero? A partition size that is too small can produce inaccuracies, and these are reported in the response.
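A rough sketch of that check, assuming the partition responses have been collected into a list and that the aggregation is named `values` (both are assumptions for illustration). `doc_count_error_upper_bound` and `sum_other_doc_count` are the fields a terms aggregation returns; a non-zero `sum_other_doc_count` means some terms in that partition did not fit in the requested `size`.

```python
def partition_problems(responses, agg_name="values"):
    """Return (partition_index, error_bound, other_doc_count) for every
    partition response whose terms aggregation may be incomplete."""
    problems = []
    for index, resp in enumerate(responses):
        agg = resp["aggregations"][agg_name]
        error_bound = agg.get("doc_count_error_upper_bound", 0)
        other = agg.get("sum_other_doc_count", 0)
        # Either value being non-zero suggests the partition was truncated
        # or its counts are approximate.
        if error_bound or other:
            problems.append((index, error_bound, other))
    return problems
```

An empty list back from `partition_problems` for every run would rule this explanation out.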

If you’re not looking for any particular sort order in the results, the ‘composite’ aggregation may be a simpler way to walk through the list.
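Something along these lines, as a sketch only and assuming the composite aggregation is available on your version. The `search` argument is a hypothetical callable that takes a request body and returns the parsed response (for example a thin wrapper around your client), and the names `by_value` and `org` are made up for the example. It pages by passing the last bucket's key back as `after`, and uses `after_key` instead when the response provides one.

```python
def iter_composite_keys(search, field, page_size=1000, source_name="org"):
    """Yield every distinct value of `field` by paging a composite aggregation."""
    after = None
    while True:
        composite = {
            "size": page_size,
            "sources": [{source_name: {"terms": {"field": field}}}],
        }
        if after is not None:
            composite["after"] = after  # resume after the previous page
        resp = search({"size": 0, "aggs": {"by_value": {"composite": composite}}})
        agg = resp["aggregations"]["by_value"]
        buckets = agg["buckets"]
        if not buckets:
            return  # an empty page means everything has been read
        for bucket in buckets:
            yield bucket["key"][source_name]
        # Prefer after_key when present, else fall back to the last bucket key.
        after = agg.get("after_key", buckets[-1]["key"])
```

Usage would be something like `values = set(iter_composite_keys(my_search, "organization_id"))`, where `my_search` is whatever function sends the body to your index.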

I will check the error bounds to be sure, but I believe they were all zero.

I will have a look at the composite aggregation.

Thanks
