Elasticsearch to recommend book authors: how to limit to a maximum of 4 books per author?


I use Elasticsearch to recommend authors (my Elasticsearch documents
represent books, with a title, a summary and a list of author ids).

The user queries my index with some text (e.g. Georgia or Paris), and I need
to aggregate the scores of individual books at the author level (meaning:
recommend an author who writes about Paris).
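
To make the setup concrete, here is a minimal sketch of what one book document and the text query could look like. The field names (title, summary, author_ids), the sample values and the helper build_query are illustrative assumptions, not my actual mapping; multi_match is just one plausible way to search both text fields.

```python
# Hypothetical book document; field names and values are illustrative only.
example_book = {
    "title": "A Walk Through Paris",
    "summary": "A stroll along the Seine and through the Latin Quarter.",
    "author_ids": [17, 42],
}


def build_query(text, size=100):
    """Build a search body that matches the user's text against title and summary."""
    return {
        "size": size,  # number of book hits to fetch for client-side aggregation
        "query": {
            "multi_match": {
                "query": text,
                "fields": ["title", "summary"],
            }
        },
    }


body = build_query("Paris")
```

Each hit returned for such a query is one book with its relevance score, which is what then gets rolled up per author.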

I began with a simple aggregation. However, experimentally (by
cross-validation) it is best to stop aggregating the score of an author
after a maximum of 4 books. Let me explain in pseudocode:

from collections import defaultdict

# the aggregated score of each author
author_scores = defaultdict(float)

# the number of books (hits) that contributed to each author
author_cnt = defaultdict(int)

# iterate over the ES query results
for doc in hits:
    # stop aggregating once 4 books from this author have been counted
    if author_cnt[doc.author_id] < 4:
        author_scores[doc.author_id] += doc.score
        author_cnt[doc.author_id] += 1

# authors sorted by aggregated score, highest first
the_result = sorted(author_scores.items(), key=lambda kv: kv[1], reverse=True)
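
As a worked example of the capped aggregation above, the following sketch runs it on a handful of made-up hits. It assumes the hits arrive best-scoring first and that each book carries a list of author ids (so a co-authored book counts towards every one of its authors). The function name aggregate_author_scores, the max_books parameter and the sample data are hypothetical.

```python
from collections import defaultdict


def aggregate_author_scores(hits, max_books=4):
    """Sum hit scores per author, counting at most max_books hits per author.

    hits: iterable of (author_ids, score) pairs, best-scoring first.
    Returns a list of (author_id, total_score), highest total first.
    """
    author_scores = defaultdict(float)
    author_cnt = defaultdict(int)
    for author_ids, score in hits:
        for author_id in author_ids:
            # Ignore further books once this author has reached the cap.
            if author_cnt[author_id] < max_books:
                author_scores[author_id] += score
                author_cnt[author_id] += 1
    return sorted(author_scores.items(), key=lambda kv: kv[1], reverse=True)


# Made-up hits: author "a" has five matching books, author "b" has two.
hits = [
    (["a"], 5.0),
    (["b"], 4.5),
    (["a"], 4.0),
    (["a", "b"], 3.0),
    (["a"], 2.0),
    (["a"], 1.0),
]
ranking = aggregate_author_scores(hits)
print(ranking)  # [('a', 14.0), ('b', 7.5)]
```

Author "a" ends up with 5.0 + 4.0 + 3.0 + 2.0 = 14.0: the fifth book (score 1.0) is dropped by the cap.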

So far, I have implemented the above aggregation in custom application
code, but I was wondering whether it is possible to rewrite it using
Elasticsearch's query DSL or the
org.elasticsearch.search.aggregations.Aggregator interface.
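
One direction that might get part of the way with the query DSL alone is a terms aggregation on the author field with a top_hits sub-aggregation limited to 4 books. This is only a sketch under assumptions: it needs a version that has top_hits, the field names and the helpers build_author_agg_query and rank_authors are hypothetical, and the per-author sum is still computed client-side from the response.

```python
def build_author_agg_query(text, max_books=4, max_authors=50):
    """Search body: bucket books by author and keep each author's best hits."""
    return {
        "size": 0,  # only the aggregation result is needed, not the raw hits
        "query": {
            "multi_match": {"query": text, "fields": ["title", "summary"]}
        },
        "aggs": {
            "by_author": {
                "terms": {"field": "author_ids", "size": max_authors},
                "aggs": {
                    # top_hits sorts by score by default
                    "top_books": {"top_hits": {"size": max_books}}
                },
            }
        },
    }


def rank_authors(response):
    """Sum the scores of each author's retained hits and sort descending."""
    ranking = []
    for bucket in response["aggregations"]["by_author"]["buckets"]:
        top = bucket["top_books"]["hits"]["hits"]
        ranking.append((bucket["key"], sum(hit["_score"] for hit in top)))
    return sorted(ranking, key=lambda kv: kv[1], reverse=True)


# Hand-built response fragment in the shape Elasticsearch returns (values made up).
sample_response = {
    "aggregations": {
        "by_author": {
            "buckets": [
                {"key": 42, "doc_count": 2,
                 "top_books": {"hits": {"hits": [{"_score": 4.5}, {"_score": 3.0}]}}},
                {"key": 17, "doc_count": 5,
                 "top_books": {"hits": {"hits": [{"_score": 5.0}, {"_score": 4.0},
                                                 {"_score": 3.0}, {"_score": 2.0}]}}},
            ]
        }
    }
}
ranking = rank_authors(sample_response)
```

One caveat with this approach: the terms aggregation selects its buckets by document count, not by summed score, so an author with few but very relevant books can be cut off if max_authors is too small.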

(crossposted from SO: http://stackoverflow.com/q/26360859/125617)
