I use Elasticsearch to recommend authors (my Elasticsearch documents
represent books, with a title, a summary and a list of author ids).
The user queries my index with some text (e.g. Georgia or Paris), and I need
to aggregate the scores of individual books at the author level (that is,
recommend an author who writes about Paris).
I began with a simple aggregation. However, experimentally
(cross-validation), it works best to stop aggregating the score of an author
after at most 4 books. Let me explain in pseudocode:
    # the aggregated score of each author
    Map<Author, Double> author_scores = new Map()
    # the number of books (hits) that contributed to each author
    Map<Author, Integer> author_cnt = new Map()

    # iterate over the ES query results
    for Document doc in hits:
        # stop aggregating once 4 books from this author have been counted
        if (author_cnt.get(doc.author_id) < 4):
            author_scores.increment_by(doc.author_id, doc.score)
            author_cnt.increment_by(doc.author_id, 1)

    the_result = author_scores.sort_map_by_value(reverse=true)
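For concreteness, the pseudocode above can be written as a small runnable Python sketch. It rests on a few assumptions that are not part of my actual code: each hit is a plain dict with an `author_ids` list and a `score`, the hits arrive sorted by descending score (as Elasticsearch returns them by default), and a book with several authors contributes to each of them. The name `rank_authors` is only illustrative.

```python
from collections import defaultdict

MAX_BOOKS_PER_AUTHOR = 4  # the cap found by cross-validation


def rank_authors(hits, max_books=MAX_BOOKS_PER_AUTHOR):
    """Sum book scores per author, counting at most `max_books` hits each.

    Assumes `hits` is sorted by descending score, so the first hits seen
    for an author are also that author's best-scoring books.
    """
    author_scores = defaultdict(float)  # aggregated score per author
    author_cnt = defaultdict(int)       # contributing hits per author

    for hit in hits:
        # Assumption: a book credits every author listed on it.
        for author_id in hit["author_ids"]:
            if author_cnt[author_id] < max_books:
                author_scores[author_id] += hit["score"]
                author_cnt[author_id] += 1

    # Highest aggregated score first.
    return sorted(author_scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical hits: author "b" matches 5 books, but only 4 are counted.
hits = [
    {"author_ids": ["a"], "score": 3.0},
    {"author_ids": ["a", "c"], "score": 2.0},
] + [{"author_ids": ["b"], "score": 1.0}] * 5

ranking = rank_authors(hits)
print(ranking)  # [('a', 5.0), ('b', 4.0), ('c', 2.0)]
```

Without the cap, author "b" would score 5.0 and tie with "a"; with it, the fifth book is ignored.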
So far, I have implemented the above aggregation in custom application
code, but I was wondering whether it is possible to rewrite it using
Elasticsearch's query DSL or the
org.elasticsearch.search.aggregations.Aggregator interface.
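To make the question more concrete, here is a rough sketch of one possible direction with the query DSL: a `terms` aggregation on the author field with a `top_hits` sub-aggregation limited to 4 books, whose scores are then summed on the client. This is untested against a real cluster. The field names `author_ids`, `title` and `summary`, the bucket names `authors` and `top_books`, and both function names are assumptions for illustration. It also does not push the final sum and sort into Elasticsearch, which is the part I am really asking about.

```python
def build_request(text, max_books=4, max_authors=100):
    """Build a search body: one bucket per author, top `max_books` hits each."""
    return {
        "size": 0,  # only the aggregation result is needed, not the raw hits
        "query": {
            "multi_match": {"query": text, "fields": ["title", "summary"]}
        },
        "aggs": {
            "authors": {
                # Assumed field name; buckets are ordered by doc_count by default.
                "terms": {"field": "author_ids", "size": max_authors},
                "aggs": {
                    # top_hits sorts by score by default, so this keeps the
                    # best-scoring `max_books` books of each author.
                    "top_books": {"top_hits": {"size": max_books}}
                },
            }
        },
    }


def rank_from_response(response):
    """Sum the capped per-author scores from the aggregation response."""
    buckets = response["aggregations"]["authors"]["buckets"]
    scores = {
        bucket["key"]: sum(
            hit["_score"] for hit in bucket["top_books"]["hits"]["hits"]
        )
        for bucket in buckets
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hand-written response fragment in the shape the aggregation returns.
fake_response = {
    "aggregations": {
        "authors": {
            "buckets": [
                {"key": "a", "doc_count": 2,
                 "top_books": {"hits": {"hits": [{"_score": 2.0}, {"_score": 1.0}]}}},
                {"key": "b", "doc_count": 2,
                 "top_books": {"hits": {"hits": [{"_score": 2.5}, {"_score": 1.5}]}}},
            ]
        }
    }
}

author_ranking = rank_from_response(fake_response)
print(author_ranking)
```

One caveat with this sketch: because the `terms` buckets are chosen by document count, `max_authors` has to be large enough that an author with few but high-scoring books is not dropped before the client-side sum.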
(crossposted from SO: http://stackoverflow.com/q/26360859/125617)