Take top n Documents per buckets and do further sub-aggregation

(janek sendrowski) #1


I'd like to take the best n documents per user, which is stored as user_id in
my index. On its own that wouldn't be a problem; it could be done like this:
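A minimal sketch of what I mean (the index name my_index and the size of 3 are just placeholders): a terms aggregation on user_id with a top_hits sub-aggregation returning the n best-scoring documents per bucket.

```json
POST /my_index/_search
{
  "size": 0,
  "aggs": {
    "per_user": {
      "terms": { "field": "user_id" },
      "aggs": {
        "best_docs": {
          "top_hits": { "size": 3 }
        }
      }
    }
  }
}
```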


But now I'd like to run a sub-aggregation on those top documents to calculate
some expensive scoring, and that isn't possible, because top_hits is a metric
aggregation and can't have sub-aggregations.


My scoring algorithm is very expensive, so I can't apply it to the full
document set per user that is returned by the query.

I also can't use the rescore feature, which provides a window parameter,
because I first have to bucket the documents per user and only then take the
best n docs per user.
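For reference, a rescore request looks roughly like this (the match query and the script are placeholders); the window_size applies to the top hits of the whole query, not per user bucket, which is why it doesn't help here:

```json
{
  "query": { "match": { "text": "search terms" } },
  "rescore": {
    "window_size": 50,
    "query": {
      "rescore_query": {
        "function_score": {
          "script_score": { "script": "expensive_scoring_here" }
        }
      }
    }
  }
}
```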

A range query would work, but the scores aren't comparable because of
the IDF, so I can't define a fixed range.

So I either have to make the scores comparable, which would be simple except
that the constant_score query doesn't work with the match query I'm using, or
I have to find a way to limit the bucket size to a certain number of documents
while ordering by relevance.

I've been trying for days to find a way to do this, but it seems that it's not possible.


(system) #2