Return _source fields within aggregation

Kornelia_Watson · November 7, 2017, 2:15pm

Hi all,

In my ES indices I have an order type which, as you can guess, stores various data about orders.
What I need is a query that splits the data using a scripted field, sums the order values within the resulting buckets, and also returns information such as the currency, account id and the channel the order came from, all of which is in _source.

I can build this query through a Kibana visualisation, but it ends up with a lot of unnecessary aggregations. I tried to use top_hits, but with no success. Are there limitations on which types of aggregations top_hits can be used with, or is it more likely that I am writing the query incorrectly?

Thanks,
Kornelia

jimczi · November 8, 2017, 11:33am

Can you share the aggregation that you tried? The top_hits aggregation can be used as a root aggregation or under a multi-bucket aggregation (terms, for instance).
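
For illustration, here is a minimal sketch of what top_hits nested under a terms aggregation could look like, written as a Python dict that serializes to a search request body. The helper name build_top_hits_request and the aggregation names by_bucket, total and sample are made up for this example, and the field names are borrowed from the query shown later in this thread; adjust them to your own mapping.

```python
import json


def build_top_hits_request(bucket_field, source_fields, sum_field):
    """Build a search body: terms buckets, each with a sum and one sample hit.

    Hypothetical helper for illustration only; it just assembles a dict.
    """
    return {
        "size": 0,  # no top-level hits, only aggregation results
        "aggs": {
            "by_bucket": {
                "terms": {"field": bucket_field, "size": 10},
                "aggs": {
                    # single-valued metric computed per bucket
                    "total": {"sum": {"field": sum_field}},
                    # one representative document per bucket,
                    # restricted to the requested _source fields
                    "sample": {
                        "top_hits": {
                            "size": 1,
                            "_source": {"includes": source_fields},
                        }
                    },
                },
            }
        },
    }


body = build_top_hits_request(
    "account_id", ["channel_description", "base_currency"], "value_base"
)
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of a _search request. Note that top_hits returns whole documents (here trimmed to two _source fields), so it only makes sense when one sample document is representative of the whole bucket.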

Kornelia_Watson · November 9, 2017, 9:08am

It might be easier to show the query that Kibana generates, which is:

      {
        "query": {
          "bool": {
            "must": [
              {
                "query_string": {
                  "analyze_wildcard": true,
                  "query": "some query"
                }
              },
              {
                "range": {
                  "date": {
                    "gte": 1447058731436,
                    "lte": 1510217131436,
                    "format": "epoch_millis"
                  }
                }
              }
            ]
          }
        },
        "size": 0,
        "aggs": {
          "id": {
            "terms": {
              "script": {
                "inline": "some scripted field",
                "lang": "painless"
              },
              "size": 500000,
              "order": {
                "_count": "desc"
              },
              "value_type": "string"
            },
            "aggs": {
              "account_id": {
                "terms": {
                  "field": "account_id",
                  "size": 1,
                  "order": {
                    "_count": "desc"
                  }
                },
                "aggs": {
                  "channel": {
                    "terms": {
                      "field": "channel_description",
                      "size": 1,
                      "order": {
                        "_count": "desc"
                      }
                    },
                    "aggs": {
                      "base_currency": {
                        "terms": {
                          "field": "base_currency",
                          "size": 1,
                          "order": {
                            "_count": "desc"
                          }
                        },
                        "aggs": {
                          "gmv": {
                            "sum": {
                              "field": "value_base"
                            }
                          }
                        }
                      }
                    }
                  }
                }
              }
            }
          }
        }
      }

And the output is:

      {
        "aggregations": {
          "id": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
              {
                "key": "acc-channel-10-2017",
                "doc_count": 2670,
                "account_id": {
                  "doc_count_error_upper_bound": 0,
                  "sum_other_doc_count": 0,
                  "buckets": [
                    {
                      "key": "some account",
                      "doc_count": 2670,
                      "channel": {
                        "doc_count_error_upper_bound": 0,
                        "sum_other_doc_count": 0,
                        "buckets": [
                          {
                            "key": "some channel",
                            "doc_count": 2670,
                            "base_currency": {
                              "doc_count_error_upper_bound": 0,
                              "sum_other_doc_count": 0,
                              "buckets": [
                                {
                                  "key": "some currency",
                                  "doc_count": 2670,
                                  "gmv": {
                                    "value": some gmv value
                                  }
                                }
                              ]
                            }
                          }
                        ]
                      }
                    }
                  ]
                }
              }
            ]
          }
        }
      }

So there are a lot of sub-buckets. What I wondered is whether, after performing two aggregations (to retrieve id and gmv), I can somehow access other values like account_id, channel and base_currency without having to do all those sub-aggregations.

jimczi · November 9, 2017, 10:47pm

The aggregation named "gmv" in your example is a sum aggregation. It is a metric aggregation which, in your case, computes the sum of the field value_base for every id.account_id.channel.base_currency bucket created by the tree of terms aggregations at the upper levels. It is a single value computed from the documents contained in the bucket. The values for account_id, channel and base_currency are already in the response: they are the keys of the parent buckets above each gmv result.
The top_hits aggregation returns the top N documents per bucket. It is not a single-valued metric like the sum aggregation and is not needed in your case.
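
As a rough sketch of how those parent keys can be read back out on the client side, the following Python walks the nested buckets and emits one flat row per innermost bucket. The function name flatten_buckets is made up for this example, the response is abbreviated from the output posted above, and 1234.5 is a stand-in for the real gmv value.

```python
def flatten_buckets(aggs, levels, metric, row=None):
    """Yield one dict per innermost bucket, carrying the parent bucket keys.

    aggs   -- a dict holding the named aggregation results at this level
    levels -- names of the nested terms aggregations, outermost first
    metric -- name of the single-valued metric at the deepest level
    """
    row = row or {}
    if not levels:
        # Deepest level reached: attach the metric value to the collected keys.
        yield {**row, metric: aggs[metric]["value"]}
        return
    name, rest = levels[0], levels[1:]
    for bucket in aggs[name]["buckets"]:
        yield from flatten_buckets(bucket, rest, metric, {**row, name: bucket["key"]})


# Abbreviated response; the gmv value is a placeholder, not real data.
response = {
    "aggregations": {
        "id": {"buckets": [{
            "key": "acc-channel-10-2017", "doc_count": 2670,
            "account_id": {"buckets": [{
                "key": "some account", "doc_count": 2670,
                "channel": {"buckets": [{
                    "key": "some channel", "doc_count": 2670,
                    "base_currency": {"buckets": [{
                        "key": "some currency", "doc_count": 2670,
                        "gmv": {"value": 1234.5},
                    }]},
                }]},
            }]},
        }]}
    }
}

rows = list(flatten_buckets(
    response["aggregations"],
    ["id", "account_id", "channel", "base_currency"],
    "gmv",
))
for r in rows:
    print(r)
```

Each row then holds the id, account_id, channel and base_currency keys next to the gmv sum, so no extra aggregation is needed to recover them.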

system · December 7, 2017, 10:47pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
