Grouping by a field and getting documents with lowest value for a given property

Hello!

I am trying to find in an index that contains multiple documents related to courses, having a structure like:

[
   {
      "course_id":"ADE778",
      "subscriptions":91,
      "source":"page1.com"
   },
   {
      "course_id":"ADE778",
      "subscriptions":32,
      "source":"page2.com"
   },
   {
      "course_id":"ADE778",
      "subscriptions":14,
      "source":"page3.com"
   },
   {
      "course_id":"ADE778",
      "subscriptions":77,
      "source":"page4.com"
   },
   {
      "course_id":"ADE778",
      "subscriptions":92,
      "source":"page5.com"
   },
   {
      "course_id":"ADE778",
      "subscriptions":33,
      "source":"page6.com"
   },
   {
      "course_id":"45KFF2",
      "subscriptions":12,
      "source":"page7.com"
   },
   {
      "course_id":"45KFF2",
      "subscriptions":41,
      "source":"page8.com"
   },
   {
      "course_id":"45KFF2",
      "subscriptions":16,
      "source":"page9.com"
   },
   {
      "course_id":"45KFF2",
      "subscriptions":18,
      "source":"page10.com"
   },
   {
      "course_id":"45KFF2",
      "subscriptions":79,
      "source":"page11.com"
   },
   {
      "course_id":"45KFF2",
      "subscriptions":44,
      "source":"page12.com"
   }
]

What I was trying to get is getting for each course_id, the document with the highest subscriptions value, something like:

[
   {
      "course_id":"ADE778",
      "subscriptions":92,
      "source":"page5.com"
   },
   {
      "course_id":"45KFF2",
      "subscriptions":79,
      "source":"page11.com"
   }
]

I was trying with something like this query:

{
	"aggs": {
		"group_by_course_id": {
			"terms": {
				"field": "course_id.keyword"
			},
			"aggs": {
				"group_by_course_id": {
					"terms": {
						"field": "course_id.keyword"
					},
					"aggs": {
						"best_subscriptions": {
							"top_hits": {
								"size": 1,
								"sort": [
									{
										"subscriptions": {
											"order": "desc"
										}
									}
								]
							}
						}
					}
				}
			}
		}
	}
}

But I have almost 100K different course_id values, so I do not know what kind of aggregation should I be using.

Thank you!

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.