Most people are familiar with search-as-you-type drop-downs, such as Apple Spotlight or the ones on other big sites. Usually the drop-down shows results grouped by type, with the top N results per type. So for example if you search "Widgets", it might show
Maybe, but I posted this after reading the whole aggregation chapter, and I am already using aggregations to count the hits per type for my current query.
I'm probably not thinking about this right, but I did not see an obvious way within aggregations to sort by score and take the top N several times over, where each chunk is selected by type (or by anything else, for that matter).
For example, forget types: say you have documents with a field "color", a field "note", and some text. Can one query retrieve the top 5 blue, top 5 pink and top 5 yellow documents that match "query string", ordered by score within each color group? What would the query for that be?
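To make the question concrete, here is a rough sketch of one request shape that might do this: a "filters" aggregation with one named bucket per color, and a "top_hits" sub-aggregation (which orders by score by default) inside each bucket. This is only an illustration under assumptions, not a confirmed answer from the thread: the helper name top_n_per_group is made up, "note" is assumed to be the field being searched, and "color" is assumed to hold single exact values that a term filter can match.

```python
import json


def top_n_per_group(group_field, group_values, text_field, query_text, n=5):
    """Build a search body asking for the top-n scoring hits per group value.

    Hypothetical helper: it only builds the JSON body, it does not talk to a cluster.
    """
    # One named bucket per value, e.g. "blue" -> {"term": {"color": "blue"}}.
    # Assumes group_field is indexed so that a term filter matches the raw value.
    named_filters = {value: {"term": {group_field: value}} for value in group_values}
    return {
        "size": 0,  # skip the flat hit list; the hits come back inside the buckets
        "query": {"match": {text_field: query_text}},
        "aggs": {
            "by_group": {
                "filters": {"filters": named_filters},
                # top_hits sorts by _score unless told otherwise
                "aggs": {"top_docs": {"top_hits": {"size": n}}},
            }
        },
    }


per_color_body = top_n_per_group(
    "color", ["blue", "pink", "yellow"], "note", "query string"
)
print(json.dumps(per_color_body, indent=2))
```

The printed JSON would be sent as the body of a normal _search request; each bucket under "by_group" should then carry its own small hit list.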
Coming in 2.0 is a "sampler" aggregation that allows you to perform analytics on a sample of the top-scoring docs.
One of the options is a "diversity" setting that limits the number of results from any one source [1].
By nesting a top_hits aggregation under the sampler you can go some way towards diversified search results. There are some caveats, such as a lack of paging support, so please read the docs.
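A minimal sketch of that nesting, for illustration only. The option names used here ("shard_size", "field", "max_docs_per_value") are my reading of the 2.0 sampler docs and should be treated as assumptions; later releases may expose the diversity options under a different aggregation name, so check the docs for your version. The helper name diversified_top_hits and the field names "color" and "note" are hypothetical.

```python
import json


def diversified_top_hits(text_field, query_text, diversity_field,
                         max_per_value=5, shard_size=200, hits=15):
    """Build a search body with top_hits nested under a sampler aggregation.

    Hypothetical helper: it only builds the JSON body, it does not talk to a cluster.
    """
    return {
        "size": 0,  # results are read from the aggregation, not the main hit list
        "query": {"match": {text_field: query_text}},
        "aggs": {
            "sample": {
                "sampler": {
                    # assumed option names, see the lead-in above
                    "shard_size": shard_size,  # top-scoring docs considered per shard
                    "field": diversity_field,  # value used to diversify the sample
                    "max_docs_per_value": max_per_value,  # cap per distinct value
                },
                # top_hits returns the best-scoring docs from the diversified sample
                "aggs": {"diverse_hits": {"top_hits": {"size": hits}}},
            }
        },
    }


sampler_body = diversified_top_hits("note", "query string", "color")
print(json.dumps(sampler_body, indent=2))
```

Note that this gives one score-ordered list in which no single color can dominate, rather than a separate top 5 per color, and as mentioned there is no paging.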