Terms aggregation on an inner object and retrieving bucket metadata

We index the following products:

{
    "id": "1",
    "name": "the-name",
    "categories": [
        {
            "id" : 10,
            "name" : "cat-1"
        },
        {
            "id" : 20,
            "name" : "cat-2"
        }
    ]
}

We are doing an aggregation on categories.id using :

REQUEST:
//...
"aggs": {
    "by_cat": {
        "terms": {
            "field": "categories.id",
            "size": 10
        }
    }
}

---
RESPONSE:
// ...
"by_cat" : {
    "buckets" : [
    {
        "key" : 10,
        "doc_count" : 804
    },
    {
        "key" : 20,
        "doc_count" : 327
    },

It works well, however, each bucket contains only the categories.id in the key field. What we would like is to be able to have the name of the category in the bucket, for example :

// ...
      "buckets" : [
        {
          "key" : 10,
          "metadata": {
              "name": "cat-1"
          },
          "doc_count" : 804
        },
        {
          "key" : 20,
          "metadata": {
              "name": "cat-1"
          },
          "doc_count" : 327
        },

What is the good way to do that ? We found two to get this information but they both looks "hackish" :

  • Using top_hits with size 1 and source limited to categories, it will retrieve one document per bucket containing the information we need. This first solution doesn't look performance-wise and the more aggregation we have, the more bloated is the response.
  • Adding a new column id_name which concatenate id and name and doing the term aggregation on it. It looks more like a hack, and may be complicated if many fields.

We also tried by mixing field and script in terms but it doesn't help.

metadata looked exactly what we wanted but it is global for all the buckets and not dynamic.

Do we have other way to retrieve this information ?

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.