Sub aggregating top_hits

Hi, I've the below Query

{
    "query": {
        "bool": {
            "filter": [
                {
                    "term": {
                        "is_deleted": 0
                    }
                },
                {
                    "term": {
                        "project.is_deleted": 0
                    }
                },
                {
                    "terms": {
                        "project.pid": [
                            1831
                        ]
                    }
                }
            ]
        }
    },
    "size": 0,
    "aggs": {
        "aq_agg_test_case": {
            "nested": {
                "path": "test_case"
            },
            "aggs": {
                "aq_agg_name": {
                    "terms": {
                        "field": "test_case.pid"
                    },
                    "aggs": {
                        "aq_agg_top_hits": {
                            "top_hits": {
                                "sort": [
                                    {
                                        "test_case.start_timestamp": "desc"
                                    }
                                ],
                                "size": 1
                            },
                            "aggs": {
                                "aq_agg_reverse_2": {
                                    "reverse_nested": {},
                                    "aggs": {
                                        "aq_agg_app_env": {
                                            "terms": {
                                                "field": "app_environment.name"
                                            }
                                        }
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}

Getting error response:

{
    "error": {
        "root_cause": [
            {
                "type": "aggregation_initialization_exception",
                "reason": "Aggregator [aq_agg_top_hits] of type [top_hits] cannot accept sub-aggregations"
            }
        ],
        "type": "aggregation_initialization_exception",
        "reason": "Aggregator [aq_agg_top_hits] of type [top_hits] cannot accept sub-aggregations"
    },
    "status": 500
}

The requirement is: Group the docs by ID's, then take the latest doc based on start_timestamp[Date field] in each group. Further aggregate those single docs based on "app_env" field.

Can you help me understanding how take on this problem

Hi @Vivek_Burman

Did you try like this:

{
  "size": 0,
  "aggs": {
    "aq_agg_test_case": {
      "nested": {
        "path": "test_case"
      },
      "aggs": {
        "aq_agg_name": {
          "terms": {
            "field": "test_case.pid"
          },
          "aggs": {
            "aq_agg_top_hits": {
              "top_hits": {
                "sort": [
                  {
                    "test_case.start_timestamp": "desc"
                  }
                ],
                "size": 1
              }
            },
            "test_case_to_app_env": {
              "reverse_nested": {},
              "aggs": {
                "aq_agg_app_env": {
                  "terms": {
                    "field": "app_environment.name"
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}

I think this solution, maintains two parallel aggregations. One for top_hits and other for app_env.
While I've a slight different requirement here which is collect all those first top_hits docs and restart agg them based on app_env.

For the exception we know that it is not possible to have a sub aggs after the top_hits. I don't know, but maybe you add it first via app_env.

I tried pipeline aggregation as well but looks like the freedom is restricted on pipeline aggregation to re-aggregate the previous. But anyways I would love to know the reason behind this restriction in first place and hopefully it can be removed in future. We have couple of cases where we are only interested in the latest execution of jobs as we have couple of executions each day.

Doing analytics on the last-known-state of an entity is a classic use case for the transform api. See How to use transforms to track your most recent customer orders | Elastic Blog

Hi @Mark_Harwood1 thanks for sharing the article, I'm trying the pivot with painless script route, as it fits my needs. But have issue debugging it. Here is my query

{
    "query": {
        "bool": {
            "filter": [
                {
                    "term": {
                        "is_deleted": 0
                    }
                },
                {
                    "term": {
                        "project.is_deleted": 0
                    }
                },
                {
                    "terms": {
                        "project.pid": [
                            1831
                        ]
                    }
                }
            ]
        }
    },
    "size": 0,
    "aggs": {
        "aq_agg_by_nested": {
            "nested": {
                "path": "test_case"
            },
            "aggs": {
                "aq_agg_by_pid": {
                    "terms": {
                        "field": "test_case.pid"
                    },
                    "aggregations": {
                        "latest_doc": {
                            "scripted_metric": {
                                "init_script": "state.highest_pid=0;state.last_doc=null",
                                "map_script": {
                                    "source": "if(doc['pid'].size() > 0 && doc['pid'].value>state.highest_pid){state.highest_pid=doc['pid'].value;state.last_doc=doc;}"
                                },
                                "combine_script": {
                                    "source": "return state"
                                },
                                "reduce_script": {
                                    "source": "return states"
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}

Two issues:
First seems like doc['pid'].size() > 0 is always false. Although there are results in the query.
Second, I've a nested field named "test_case" which is an array of objects I wish to iterate over properties inside the array tried using for(t in doc['test_case'){...} getting error test_case is field has no mapping, although /_mapping api shows it as nested_property

Adding the response of above query here:

{
    "took": 5,
    "timed_out": false,
    "_shards": {
        "total": 1,
        "successful": 1,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": {
            "value": 10000,
            "relation": "gte"
        },
        "max_score": null,
        "hits": []
    },
    "aggregations": {
        "aq_agg_by_nested": {
            "doc_count": 454,
            "aq_agg_by_pid": {
                "doc_count_error_upper_bound": 0,
                "sum_other_doc_count": 68,
                "buckets": [
                    {
                        "key": 12895,
                        "doc_count": 126,
                        "latest_doc": {
                            "value": [
                                {
                                    "highest_pid": 0,
                                    "last_doc": null
                                }
                            ]
                        }
                    },
                    {
                        "key": 13417,
                        "doc_count": 53,
                        "latest_doc": {
                            "value": [
                                {
                                    "highest_pid": 0,
                                    "last_doc": null
                                }
                            ]
                        }
                    },
                    {
                        "key": 13101,
                        "doc_count": 50,
                        "latest_doc": {
                            "value": [
                                {
                                    "highest_pid": 0,
                                    "last_doc": null
                                }
                            ]
                        }
                    },
                    {
                        "key": 13102,
                        "doc_count": 33,
                        "latest_doc": {
                            "value": [
                                {
                                    "highest_pid": 0,
                                    "last_doc": null
                                }
                            ]
                        }
                    },
                    {
                        "key": 13426,
                        "doc_count": 25,
                        "latest_doc": {
                            "value": [
                                {
                                    "highest_pid": 0,
                                    "last_doc": null
                                }
                            ]
                        }
                    },
                    {
                        "key": 13394,
                        "doc_count": 23,
                        "latest_doc": {
                            "value": [
                                {
                                    "highest_pid": 0,
                                    "last_doc": null
                                }
                            ]
                        }
                    },
                    {
                        "key": 13405,
                        "doc_count": 23,
                        "latest_doc": {
                            "value": [
                                {
                                    "highest_pid": 0,
                                    "last_doc": null
                                }
                            ]
                        }
                    },
                    {
                        "key": 13395,
                        "doc_count": 22,
                        "latest_doc": {
                            "value": [
                                {
                                    "highest_pid": 0,
                                    "last_doc": null
                                }
                            ]
                        }
                    },
                    {
                        "key": 13408,
                        "doc_count": 16,
                        "latest_doc": {
                            "value": [
                                {
                                    "highest_pid": 0,
                                    "last_doc": null
                                }
                            ]
                        }
                    },
                    {
                        "key": 13164,
                        "doc_count": 15,
                        "latest_doc": {
                            "value": [
                                {
                                    "highest_pid": 0,
                                    "last_doc": null
                                }
                            ]
                        }
                    }
                ]
            }
        }
    }
}

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.