Take top n documents per bucket and do further sub-aggregation

(janek sendrowski)


I would like to take the best n documents per user, where the user is
stored as user_id in my index. That on its own isn't a problem. It could be
done like this:
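
A minimal sketch of such a request, written as a Python dict for the search
body. The field name "text", the query string, the aggregation names and the
values of n and max_users are assumptions for illustration only; user_id is
the field from my index.

```python
import json


def top_hits_per_user(query_text, n=3, max_users=10):
    """Build a search body: one bucket per user_id, best n hits in each."""
    return {
        "size": 0,  # only the aggregation result is needed
        # "text" is an assumed field name, not taken from the real mapping
        "query": {"match": {"text": query_text}},
        "aggs": {
            "per_user": {  # hypothetical aggregation name
                "terms": {"field": "user_id", "size": max_users},
                "aggs": {
                    "best_docs": {  # hypothetical aggregation name
                        # top_hits sorts by _score descending by default
                        "top_hits": {"size": n}
                    }
                },
            }
        },
    }


body = top_hits_per_user("some search terms", n=3)
print(json.dumps(body, indent=2))
```

The printed JSON could be sent as the body of a _search request. Each
per_user bucket then carries its n best-scoring documents under best_docs.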


But now I would like to run a sub-aggregation on those hits to calculate
some expensive scoring, and this isn't possible anymore, because top_hits is
a metric aggregation and metric aggregations can't contain sub-aggregations.


My scoring algorithm is very expensive, so I can't apply it to the full
document set per user that the query returns.

I also can't use the rescore feature, which provides a window_size
parameter, because I first have to bucket the documents per user and then
take the best n docs per user.
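
To illustrate why the window doesn't help here, this is a rough sketch of a
rescore request body as a Python dict. The field name "text", the
match_phrase rescore query (standing in for the expensive scoring) and the
weights are assumptions for illustration. The window is taken from the top
hits of the whole query on each shard, not from each user's documents.

```python
import json


def rescored_search(query_text, window_size=50):
    """Build a search body that rescores only the top window_size hits."""
    return {
        # "text" is an assumed field name
        "query": {"match": {"text": query_text}},
        "rescore": {
            # applies to the top hits per shard across ALL users,
            # so there is no way to say "top n per user_id" here
            "window_size": window_size,
            "query": {
                # placeholder for the expensive scoring query
                "rescore_query": {
                    "match_phrase": {"text": {"query": query_text, "slop": 2}}
                },
                "query_weight": 0.7,
                "rescore_query_weight": 1.2,
            },
        },
    }


print(json.dumps(rescored_search("some search terms"), indent=2))
```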

A range query would work, but the scores aren't comparable between queries
because of IDF, so I can't define a fixed range.

So I have two options. Either I make the scores comparable, which would be
simple, except that the constant_score query doesn't work with the match
query I am using. Or I find a way to limit each bucket to a certain number
of documents while ordering by relevance.

I have been trying for days to find a way to do that, but it seems that it's not

