Sum aggregated values

Hello. I'm trying to collect some statistics data from my docs, but I'm struggling a bit.
My docs looks like this:

{
  "productId": "156"
  "price": "101.23"
  "discount": "2.23"
  "marketplace": "some_marketplace"
  "categoryId": "256"
  "brandId": "356"
  "quantity": "10"
  "timestamp": "1622435597"
}

Several docs for each product per day.

I want to get date histogram (per day or per month) for given category and calculate avg product price in this category and get the sum of all product's avg quantities in category.
For example, if I have two products (several docs for each) and their avg price and avg quantity are - 100, 10 for first and 150, 20 for second, then i want to get avg price equals to 125 and sum quantity equals to 30.
My query looks like this

POST products/_search
{
  "size": 0, 
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "categoryId": "256"
          }
        }
      ], 
      "filter": [
        {
          "range": {
            "timestamp": {
              "gte": "1622037600",
              "lte": "1622246399"
            }
          }
        }
      ]
    } 
  },
  "aggs": {
    "group_day": {
      "date_histogram": {
        "field": "timestamp",
        "calendar_interval": "day",
        "format": "yyyy-MM-dd"
      },
      "aggs": {
        "avg_weight_price": {
          "weighted_avg": {
            "value": {
              "field": "price"
            },
            "weight": {
              "field": "quantity"
            }
          }
        },
        "avg_weight_discount": {
          "weighted_avg": {
            "value": {
              "field": "discount"
            },
            "weight": {
              "field": "quantity"
            }
          }
        },
        "group_by_product": {
          "terms": {
            "field": "productId"
          },
          "aggs": {
            "avg_quantity": {
              "avg": {
                "field": "quantity"
              }
            }
          }
        }
      }
    }
  }
}

The result is:

"aggregations" : {
    "group_day" : {
      "buckets" : [
        {
          "key_as_string" : "2021-05-27",
          "key" : 1622073600000,
          "doc_count" : 18,
          "avg_weight_discount" : {
            "value" : 20.23
          },
          "group_by_product" : {
            "doc_count_error_upper_bound" : 0,
            "sum_other_doc_count" : 0,
            "buckets" : [
              {
                "key" : "100",
                "doc_count" : 6,
                "avg_quantity" : {
                  "value" : 9.5
                }
              },
              {
                "key" : "101",
                "doc_count" : 6,
                "avg_quantity" : {
                  "value" : 9.5
                }
              },
              {
                "key" : "102",
                "doc_count" : 6,
                "avg_quantity" : {
                  "value" : 9.5
                }
              }
            ]
          },
          "avg_weight_price" : {
            "value" : 130.20309941520466
          }
        },
        {
          "key_as_string" : "2021-05-28",
          "key" : 1622160000000,
          "doc_count" : 6,
          "avg_weight_discount" : {
            "value" : 2.23
          },
          "group_by_product" : {
            "doc_count_error_upper_bound" : 0,
            "sum_other_doc_count" : 0,
            "buckets" : [
              {
                "key" : "100",
                "doc_count" : 6,
                "avg_quantity" : {
                  "value" : 9.5
                }
              }
            ]
          },
          "avg_weight_price" : {
            "value" : 101.23
          }
        }
      ]
    }
  }

So I get here avg product price for given category as I want, but I'm struggling with getting sum of avg products quantities. All I can do now - get avg quantity for each product in category per day. I don't want that info in result, instead of that I want just one value - sum of all avg products quantity.

Thank you for reading this. Any help would be very appreciated

Have a look at the sum bucket pipeline aggregation.

You basically need 2 passes for this, if the data is small enough pipeline aggregations can be used, for larger amounts of data you can use a transform to pre-aggregate the data and run 2nd level analysis. This might also be useful, if you want to do the analysis more than once.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.