Sorting aggregated results


(James) #1

I'm a noob using 1.5.2. I'm querying a set of documents that have a receiveddate and a partno. I want my results sorted by receiveddate in descending order, with only 1 document per partno (some documents share the same partno, but they also share the same receiveddate, so I just want one of them). The partno will match exactly, so I'm using constant_score with a filter that wraps a query.

I use a terms aggregation with a top_hits sub-aggregation to get just a single document per partno.

A typical search will match 1000s of documents, but I only want the 50 most recent (sorted by "receiveddate" in descending order). When I scroll through my 50 results (I'm actually using the buckets from my aggregation), they are in order by receiveddate. HOWEVER, they are not the 50 documents with the most recent receiveddate OVERALL.
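To make the result I'm after concrete, here is a minimal Python sketch of the same logic done client-side on an in-memory list. The function name latest_per_partno and the sample documents are made up for illustration, and it assumes receiveddate is an ISO-8601 string so that plain string comparison matches date order. It only describes the output I expect, not how the aggregation computes anything.

```python
from typing import Dict, List


def latest_per_partno(docs: List[dict], limit: int = 50) -> List[dict]:
    """Return one document per partno, newest receiveddate first (hypothetical helper)."""
    newest: Dict[str, dict] = {}
    for doc in docs:
        current = newest.get(doc["partno"])
        # Keep the first document seen for a partno unless a strictly newer one appears.
        if current is None or doc["receiveddate"] > current["receiveddate"]:
            newest[doc["partno"]] = doc
    # Sort the surviving documents by receiveddate, descending, and keep the top `limit`.
    ranked = sorted(newest.values(), key=lambda d: d["receiveddate"], reverse=True)
    return ranked[:limit]


# Made-up sample data: two documents share partno "A" and the same receiveddate.
docs = [
    {"partno": "A", "receiveddate": "2015-03-01"},
    {"partno": "A", "receiveddate": "2015-03-01"},
    {"partno": "B", "receiveddate": "2015-04-10"},
    {"partno": "C", "receiveddate": "2015-01-05"},
]
result = latest_per_partno(docs, limit=2)
```

With this sample data and limit=2, I would expect to get back partno B and then partno A, because those are the two most recent parts overall.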

I assume I need to pull enough documents off each shard to reliably get the most recent 50 after sorting? I've tried sorting aggregations and results (as seen below), but no luck. I've also been looking into how to make receiveddate impact relevancy - since all results will be a perfect match (but no luck yet on that either).

Can someone explain the correct way to bring back the top 50 results sorted by receiveddate? Thanks!

{
  "size" : 50,
  "query" : {
    "constant_score" : {
      "filter" : {
        "query" : {
          ....
        }
      }
    }
  },
  "sort" : [
    { "receiveddate" : "desc" }
  ],
  "aggregations" : {
    "parts" : {
      "terms" : {
        "field" : "partno",
        "size" : 50
      },
      "aggregations" : {
        "top_hits" : {
          "top_hits" : {
            "size" : 1,
            "sort" : [
              { "receiveddate" : "desc" }
            ]
          }
        }
      }
    }
  }
}
