Limit results by field + aggregations over results

In our index, a document may exist multiple times under the same id, because we store every version. For our search, we want to return only one result per id. I know we can use aggregations to run a search that finds the top hit for each id:

"aggs": {
  "latest": {
    "terms": {
      "field": "id",
      "size" : 0 
    },
    "aggs":{
      "top_hits_id": {
        "top_hits": {
          "size":1
        }
      }
    }
  }
}

However, we have other aggregations we want to calculate and return. For example, our documents have fields such as name, date, and network, and we want to provide an aggregation on each of those: a count per name, a count per month for date, and a count per network.

As I understand it, the aggregation that gets the latest version creates one bucket per id. I'm not sure how to run the additional aggregations so that they are calculated once across all of the deduplicated results, rather than separately for each "latest" bucket.

For example, let's say we have the following documents:
1, Neesha, Nov 27, TW, "bla"
2, Neesha, Oct 31, TW, "yes"
3, Bob, Nov 23, FB, "bla"
4, Chris, Nov 3, TW, "test bla"
5, Chris, Sept 3, TW, "hi bla"
6, Chris, Sept 8, TW, "edited bla"
6, Chris, Sept 8, TW, "bla"

If the search was to find all that contain bla, max one per id, it would return the following documents:
1, Neesha, Nov 27, TW, "bla"
3, Bob, Nov 23, FB, "bla"
4, Chris, Nov 3, TW, "test bla"
5, Chris, Sept 3, TW, "hi bla"
6, Chris, Sept 8, TW, "edited bla"
Note that only one copy of id 6 is returned, because we want only the latest version.

And the following aggregations:
Names:
Neesha - 1
Bob - 1
Chris - 3
Dates:
Nov - 3
Sept - 2
Network:
TW - 4
FB - 1
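
To make the expected output concrete, here is a minimal in-memory sketch in Python that reproduces the hits and counts listed above from the sample documents. It is not Elasticsearch code. The function names `search_latest` and `facet_counts` are made up for illustration, and it assumes that the version to keep for an id is the one listed first, which matches the sample (id 6 keeps "edited bla").

```python
from collections import Counter

# Sample documents from the post: (id, name, month, network, text).
# Assumption: for a repeated id, the first row listed is the latest version.
DOCS = [
    (1, "Neesha", "Nov", "TW", "bla"),
    (2, "Neesha", "Oct", "TW", "yes"),
    (3, "Bob", "Nov", "FB", "bla"),
    (4, "Chris", "Nov", "TW", "test bla"),
    (5, "Chris", "Sept", "TW", "hi bla"),
    (6, "Chris", "Sept", "TW", "edited bla"),
    (6, "Chris", "Sept", "TW", "bla"),
]


def search_latest(docs, term):
    """Return at most one matching document per id (first match wins)."""
    hits = {}
    for doc in docs:
        doc_id, text = doc[0], doc[4]
        # Match on whole words and skip ids that already have a hit.
        if term in text.split() and doc_id not in hits:
            hits[doc_id] = doc
    return list(hits.values())


def facet_counts(hits):
    """Count the deduplicated hits per name, month, and network."""
    return {
        "names": Counter(doc[1] for doc in hits),
        "dates": Counter(doc[2] for doc in hits),
        "network": Counter(doc[3] for doc in hits),
    }


hits = search_latest(DOCS, "bla")
counts = facet_counts(hits)
```

The point of the sketch is the order of operations: deduplicate by id first, then count once over the deduplicated set. That is the behavior I am trying to get out of a single search request.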

If I list these aggregations next to the latest aggregation above, as siblings, they are not limited to the deduplicated results and count the duplicate versions too. If I nest them inside the latest aggregation, I get a separate set of aggregations for each id bucket.
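
For reference, the two placements I tried can be sketched as request bodies built from Python dicts. The aggregation names `names` and `networks` are placeholders I chose for this illustration, the date aggregation is left out for brevity, and the `latest` part is the same as the fragment above.

```python
import json


def latest_agg(extra_sub_aggs=None):
    """Build the "latest" aggregation from the post, optionally with extra sub-aggregations."""
    sub_aggs = {"top_hits_id": {"top_hits": {"size": 1}}}
    sub_aggs.update(extra_sub_aggs or {})
    return {"terms": {"field": "id", "size": 0}, "aggs": sub_aggs}


# Placeholder facet aggregations on fields named in the post.
facets = {
    "names": {"terms": {"field": "name"}},
    "networks": {"terms": {"field": "network"}},
}

# Placement 1: facets as siblings of "latest". They run over every matching
# document, so duplicate versions of an id are counted.
sibling_body = {"aggs": {"latest": latest_agg(), **facets}}

# Placement 2: facets nested inside "latest". They run once per id bucket,
# so each bucket gets its own set of counts.
nested_body = {"aggs": {"latest": latest_agg(facets)}}

print(json.dumps(sibling_body, indent=2))
print(json.dumps(nested_body, indent=2))
```

Neither body produces a single set of counts over one document per id, which is what I am after.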

Is there a trick I'm missing? Or is this impossible?
