Aggregated sorted search with top hits and paging - items missing

My intent is to have a paged list of all categories, each represented by its cheapest product, sorted by price. I have a terms aggregation on product category, ordered by a min aggregation on price:

"aggs": {
    "count": { "cardinality": { "field": "category_id" } },
    "categories": {
        "terms": {
            "field": "category_id",
            "order": {
                "sorting": "asc"
            },
            "include": {
                "num_partitions": 2,
                "partition": 0
            },
            "size": 10
        },
        "aggs": {
            "fields": {
                "top_hits": {
                    "size": 1,
                    "sort": [
                        {
                            "price": "asc"
                        }
                    ]
                }
            },
            "sorting": {
                "min": {
                    "field": "price"
                }
            }
        }
    }
},

There are 32 categories in total. So far everything works, but when I change num_partitions to 3, some of the categories that were there before are suddenly missing, namely the one that has the cheapest product and was in first place when there were only 2 partitions.

I have no idea what is going on there or why. The code itself is a variation on Terms aggregation | Reference, which seems to be basic stuff.

Can someone please tell me where the problem is and how to make it work as intended?

Thanks a lot.

@irkallacz first off welcome!

I’m not an expert on aggs, but from my rough knowledge and a quick review of the docs (Terms aggregation | Reference), it looks like increasing num_partitions splits your categories across more partitions, while partition selects only the categories that fall into that one partition, here partition 0. If you set num_partitions to 3, you’d need to make further queries for partitions 1 and 2 as well. So it makes sense to me that categories go missing as you increase num_partitions without querying the other partition values. Let me know if that helps at all.
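
To illustrate, here is a rough sketch of what the terms clause of the second of three requests might look like, reusing the field names from your post. Only partition changes between the requests (0, then 1, then 2); the sub-aggregations stay as they are. I have not run this against your data, so treat it as an assumption about your setup rather than a tested query:

```json
{
  "terms": {
    "field": "category_id",
    "order": { "sorting": "asc" },
    "include": {
      "num_partitions": 3,
      "partition": 1
    },
    "size": 10
  }
}
```

Note that each response is sorted only within its own partition, so the three result sets would have to be merged and re-sorted on the client to get one global order. The size also needs to be large enough to hold every category that lands in a partition, since the categories are not guaranteed to be split evenly.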

Hi. Partitions should only be used when there are very large numbers of unique values (maybe millions or more). 32 unique values should work fine without partitions.
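
One possible way to page without partitions, offered as a sketch rather than a tested query: request all 32 buckets in a single terms aggregation and let a bucket_sort pipeline aggregation do the sorting and paging. The field names come from the original post; the aggregation name page is made up for illustration, and the terms size of 32 assumes the category count stays at 32.

```json
{
  "size": 0,
  "aggs": {
    "categories": {
      "terms": {
        "field": "category_id",
        "size": 32
      },
      "aggs": {
        "fields": {
          "top_hits": {
            "size": 1,
            "sort": [{ "price": "asc" }]
          }
        },
        "sorting": {
          "min": { "field": "price" }
        },
        "page": {
          "bucket_sort": {
            "sort": [{ "sorting": { "order": "asc" } }],
            "from": 0,
            "size": 10
          }
        }
      }
    }
  }
}
```

Here from and size inside bucket_sort select the page, so the second page of ten would use "from": 10. Because every category is in the same response before bucket_sort runs, the cheapest category should always end up first.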