Top N documents from top_hits, rather than top N per bucket


(Matt Preston) #1

I'm using a top_hits aggregation for field collapsing, e.g.:

"aggregations": {
  "top-groups": {
    "terms": {
      "field": "group-field",
      "size": 10,
      "order": {
        "max_score": "desc"
      }
    },
    "aggregations": {
      "top_hit": {
        "top_hits": {
          "size": 5
        }
      },
      "max_score": {
        "max": {
          "script": "_score"
        }
      }
    }
  }
}

The only way to control the number of results returned is through the size attributes of the terms and top_hits aggregations. The above example will return up to 10 * 5 = 50 results, depending on what was matched by the query.

Is there any way to specify an absolute maximum number of results? For example, if I want exactly 50 results in total, the above configuration will not work if any of the groups contains fewer than 5 results. The only way to ensure that I get at least 50 results is to set the size of the terms aggregation to 50, but then in the worst case the response will contain 50 * 5 = 250 documents, which leads to poor performance.
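
To make the arithmetic concrete, here is a rough Python sketch of how many documents come back for a given pair of size settings. The function name `total_hits` and the per-group match counts are made up for illustration; it only models the counting, not Elasticsearch itself.

```python
def total_hits(bucket_doc_counts, terms_size, top_hits_size):
    """Count the documents returned by a terms + top_hits aggregation.

    bucket_doc_counts: matching documents per group, already in bucket order
    (hypothetical input, standing in for the ordering by max_score).
    """
    # The terms aggregation keeps at most `terms_size` buckets.
    buckets = bucket_doc_counts[:terms_size]
    # Each bucket contributes at most `top_hits_size` documents.
    return sum(min(count, top_hits_size) for count in buckets)


# Hypothetical match counts for 12 groups.
counts = [7, 5, 3, 1, 9, 2, 6, 5, 1, 4, 8, 5]

# 10 buckets * 5 hits: fewer than 50 because some groups are small.
print(total_hits(counts, terms_size=10, top_hits_size=5))

# Worst case with terms size 50: every one of 50 groups has 5 or more matches.
print(total_hits([5] * 50, terms_size=50, top_hits_size=5))
```

With these made-up counts the first call gives 36 rather than 50, and the second gives 250, which is the over-fetching I'd like to avoid.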

Thanks,
Matt

