Difficulty understanding the "size" parameter in aggregations

I have a service where users can rate items from 1 to 5 and I need to find the top 40 rated items.

I'm searching from a type that contains posts made by users and it filters posts made from a specific country and that have a rating (1 to 5) for the item. I want to get the top 40 best rated items.

The schema looks like this:

post: [
    _all:                   [enabled: false],
    dynamic:                "strict",
    properties: [
        item:               [type: "object",
            properties: [
                id:         [type: "long"],
                name:       [type: "string", index: "no"],
                ...
            ]
       ],
       country_id:         [type: "long"],
       rating:             [type: "float"],
       ...
   ]
]

And the query looks like this

{
    query: {
        filtered: {
            filter: {
                bool: {
                    must: [
                        { term: { country_id: 73 } }
                    ],
                    must_not: [
                        {
                            missing: { field: "rating" }
                        }
                    ]
                }
            }
        }
    },
    size: 0,
    aggs: {
        item: {
            terms: {
                field: "item.id",
                min_doc_count: 5,
                size: 40,
                order: {avg_rating : "desc"}
            },
            aggs: {
                avg_rating: {
                    avg: { field: "rating" }
                }
            }
        }
    }
}

The problem I'm having is related to the "size" parameter in the aggregation. I want to get 40 results but I can't seem to achieve that. A year ago when I first wrote this and we have much fewer posts I got a bunch of results and I naively assumed that there were 40 of them. Later as the amount of posts grew this query started returning fewer and fewer results. We are not talking about very many posts (> 70k), but still this query currently returns 0 results. My initial ugly hack was to set the size to 100 and then pick the 40 first ones in the code, but that doesn't work anymore as I get only 2 results when the size is set to 100. I really don't want to continue hacking this without understanding what is wrong here. I want to say 40 in the query and actually get 40 results back. What am I missing here? How does the size parameter work in aggregations?

I suspect the issue is due to the min_doc_count parameter. Do you get 40 results if you remove it from your aggregation?

Yes I do, but I still don't understand why. I would understand it if I actually had so few records that there simply wouldn't be 40 items that had at least 5 posts about them. But that doesn't seem to be the reason. If I increase the size to 400 or something I start getting more results.

I'm calculating a crude average rating from the posts and I don't want single posts with rating 5 to pop up at the top. I would have done a proper weighted average, but that requires scripting and I'd like to keep dynamic scripts disabled. I'm kind of surprised that elasticsearch doesn't support weighted averages out of the box without scripting. Anyway I wouldn't want to remove the min_doc_count parameter from my query because then all I get is 40 items that have a single rating of 5 because the average calculation is distorted.

The reason is that shards will give back their top terms based on the value of the average rating, then the coordinating node will merge info from individual shards, and only then apply min_doc_count. Except that if most your top terms had a min_doc_count that was too low, this will cut down the number of returned terms significantly.

Yes I suspected it was something like that, but I'm still a bit confused since I just ran into this issue while developing this query in my local machine. It has only one node in the cluster and I'm querying a single index. How can there be more than one shard involved in this case?

Elasticsearch always sets up 5 shards by default, even if there is a single node, so that it can split the work to more machines if you were to add more machines. For instance you can configure your index to have exactly one shard and you shouldn't hit this issue anymore (but it will make it harder to scale).

That's what I've done in my local machine. I have this in my elasticsearch.yml:

index.number_of_shards: 1
index.number_of_replicas: 0

I'm not thinking about scaling in this case since it's just my local development cluster. My production cluster has 3 nodes and the default 5 shards.

Oops sorry I was wrong, you would still see the issue indeed as the shard will not send all its data to the coordinating node.

Do you happen to have many unique product ids? If yes, one way to fix the problem would be to toute documents based on this product id at index time (so that all ratings for a given product are on the same shard) and then to configure shard_min_doc_count=5 (https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html) in your terms aggregation as well.

There are about 15k of them but the number is growing. I have denormalized my data a little bit so that each post includes the id and the name of the product (item) being rated. Anyway I think all my ratings should already be in the same shard since I only have one shard and I'm querying one index. All I need to get back from the query is the ids of the items and I will then query another index that contains the the product data.