Can we use sub aggregation after top metric aggregation

Fiza · April 27, 2023, 1:34pm

I want to get the sum of the latest value of a field, with several conditions applied over other fields.
I am working with time series data and want to get information from the last polled value.
Example data:

A	B	C
q1	1	2343
q1	2	53
q1	3	-1
q1	4	-1
q2	1	-1
q2	2	435
q2	3	543
q2	4	-1

The condition is C > -1. For each unique A, get each unique B.
The result should be q1-1, q1-2, q2-2, q2-3.
Also, only the data from the last poll should be used, and the result shouldn't change with the time filter.

So the query I am writing is:

{
  "query": {
    "bool": {
      "filter": [
        {
          "range": {
            "C": {
              "gt": -1
            }
          }
        }
      ]
    }
  },
  "aggs": {
    "time": {
      "top_metrics": {
        "metrics": {
          "field": "@timestamp"
        },
        "sort": {
          "@timestamp": "desc"
        }
      },
      "aggs": {
        "uniqueA": {
          "terms": {
            "field": "A"
          },
          "aggs": {
            "uniqueB": {
              "terms": {
                "field": "B"
              }
            }
          }
        }
      }
    }
  },
  "size": 0
}

This query returns the latest timestamp, but none of the sub-aggregations run.

Can anyone help with this? Can we use sub-aggregations under top_metrics? If not, is there another way to solve this problem?

Ignacio_Vera · April 28, 2023, 10:53am

You cannot have sub-aggregations under a top metrics aggregation. It is a bug that you don't get an error in this case; I have opened: top_hits aggregation should fail if it contains sub aggregations · Issue #95663 · elastic/elasticsearch · GitHub
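
A minimal sketch of the usual restructuring, assuming the goal is the latest C per (A, B) pair: the two terms aggregations become the parents and top_metrics sits at the leaf, since a metric aggregation cannot have children. The aggregation names (uniqueA, uniqueB, latest) are illustrative, and this assumes A is mapped as keyword and @timestamp as date. It is an untested request body for the _search endpoint, not a definitive query.

```json
{
  "size": 0,
  "aggs": {
    "uniqueA": {
      "terms": { "field": "A" },
      "aggs": {
        "uniqueB": {
          "terms": { "field": "B" },
          "aggs": {
            "latest": {
              "top_metrics": {
                "metrics": { "field": "C" },
                "sort": { "@timestamp": "desc" }
              }
            }
          }
        }
      }
    }
  }
}
```

Each uniqueB bucket should then carry the C value of its most recent document.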

Fiza · April 28, 2023, 12:47pm

Thanks @Ignacio_Vera for replying.

Sunile_Manjee · April 28, 2023, 3:09pm

Strictly based on what you provided, I ran a quick test (unless I misunderstood):

PUT /sunman/_mapping
{
    "properties": {
      "a": {
        "type": "keyword"
      },
      "b": {
        "type": "integer"
      },
      "c": {
        "type": "integer"
      }
    }
}

Then I added your docs to the index and ran the following:

GET /sunman/_search
{
  "size": 0,
  "query": {
    "bool": {
      "filter": {
        "range": {
          "c": {
            "gt": -1
          }
        }
      }
    }
  },
  "aggs": {
    "group_by_a_and_b": {
      "terms": {
        "script": "doc['a'].value + '_' + doc['b'].value"
      }
    }
  }
}

and it seems to return exactly what you wanted:

{
  "took": 321,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 4,
      "relation": "eq"
    },
    "max_score": null,
    "hits": []
  },
  "aggregations": {
    "group_by_a_and_b": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "q1_1",
          "doc_count": 1
        },
        {
          "key": "q1_2",
          "doc_count": 1
        },
        {
          "key": "q2_2",
          "doc_count": 1
        },
        {
          "key": "q2_3",
          "doc_count": 1
        }
      ]
    }
  }
}
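
If a script is undesirable, the same grouping can probably be expressed with a multi_terms aggregation, which exists in more recent Elasticsearch versions (check that yours supports it). A sketch against the same sunman index and lowercase field names as above; the bucket keys come back as arrays such as ["q1", 1] rather than concatenated strings:

```json
{
  "size": 0,
  "query": {
    "bool": {
      "filter": {
        "range": {
          "c": {
            "gt": -1
          }
        }
      }
    }
  },
  "aggs": {
    "group_by_a_and_b": {
      "multi_terms": {
        "terms": [
          { "field": "a" },
          { "field": "b" }
        ]
      }
    }
  }
}
```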

Fiza · May 1, 2023, 7:13am

Thanks @Sunile_Manjee. I tried it with my code and it worked.
I had been approaching it the other way round.
It helped a lot.

Fiza · May 1, 2023, 7:35am

@Sunile_Manjee there is still one issue. I am getting the result, but my data is time series data. I want only the last poll to be taken into consideration for the aggregation. Right now the result changes if I change the time interval.

Sunile_Manjee · May 2, 2023, 7:31pm

@Fiza please provide me an example of the output you are expecting based on the inputs you provided earlier.

Fiza · May 3, 2023, 8:53am

A	B	C
q1	1	2343
q1	2	53
q1	3	-1
q1	4	-1
q2	1	-1
q2	2	435
q2	3	543
q2	4	-1

Suppose this is the data received from the last poll, at @timestamp A.

There is also previous data (in the same format) stored in my database (e.g. at timestamps A-2m, A-4m, etc.).
Example of data from the previous poll:

A	B	C
q1	1	2343
q1	2	23
q1	3	56
q1	4	-1
q2	1	-1
q2	2	435
q2	3	543
q2	4	-1

So here the previously polled data differs from the latest poll's data.

I want my visualization to look only at the latest poll's data and then apply the conditions.
The condition is C > -1. For each unique A, get each unique B.
The result should be q1-1, q1-2, q2-2, q2-3.
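
One way this could be done in a single request, sketched under assumptions: group by A and B, take the latest C per pair with top_metrics, then drop the pairs whose latest C is not above -1 using a bucket_selector. The range filter on C is deliberately left out of the query, because filtering first would make top_metrics pick the latest document with C > -1 instead of the document from the latest poll. The aggregation names (byA, byB, latest, onlyValid) are illustrative, and the buckets_path form latest[C] is an assumption about how top_metrics values are referenced from pipeline aggregations; it should be verified against the top_metrics documentation for the version in use.

```json
{
  "size": 0,
  "aggs": {
    "byA": {
      "terms": { "field": "A" },
      "aggs": {
        "byB": {
          "terms": { "field": "B" },
          "aggs": {
            "latest": {
              "top_metrics": {
                "metrics": { "field": "C" },
                "sort": { "@timestamp": "desc" }
              }
            },
            "onlyValid": {
              "bucket_selector": {
                "buckets_path": { "latestC": "latest[C]" },
                "script": "params.latestC > -1"
              }
            }
          }
        }
      }
    }
  }
}
```

With the two polls shown above, the q1/3 pair should be dropped because its latest C is -1, even though the previous poll had 56. Any time range applied to the request still limits which documents count as "latest", so the range has to include the most recent poll.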
