Calculating metric based on certain buckets in a histogram aggregation

A simplified version of our price index is as follows:

PUT price-example

{

  "mappings": {

    "properties": {

      "skuId": {

        "type": "keyword"

      },

      "supplierId": {

        "type": "keyword"

      },

      "price": {

        "type": "float"

      },

      "pricedate": {

        "type": "date"

      }

    }

  }

}

The UI supports the following table definition

Supplier Name | Total Items | Mean | Low | High | Median Price Growth

The query we wrote initially is the following which I think meets all but the ‘% median price growth` in terms of the aggregate values:

GET price-example/_search

{

  "size": 0,

  "query": {

    "bool": {

      "filter": {

        "range": {

          "pricedate": {

            "gte": "2019-11-01",

            "lte": "2020-11-30"

          }

        }

      }

    }

  },

  "aggs": {

    "total-buckets": {

      "cardinality": {

        "field": "supplierid"

      }

    },

    "supplier-bucket": {

      "terms": {

        "field": "supplierid",

        "size": 10

      },

      "aggs": {

        "total-skus": {

          "cardinality": {

            "field": "skuId"

          }

        },

        "price-percentiles": {

          "percentiles": {

            "field": "price",

            "percents": [

              0,

              25,

              50,

              75,

              100

            ]

          }

        },

        "dates": {

          "date_histogram": {

            "field": "pricedate",

            "calendar_interval": "day"

          },

          "aggs": {

            "day-median": {

              "percentiles": {

                "field": "price",

                "percents": [

                  50

                ]

              }

            }

          }

        }

      }

    }

  }

}

The first challenge is how to calculate the ‘% median price growth’. It is calculated as a percent growth rate, taking the difference between the median prices at the start and end dates and then dividing by the median price for the start date.

So in this example we need the median price in the earliest and latest buckets in the ‘dates’ aggregation and then use these values to calculate the metric. I’m not sure how, or even if it is possible to do this. Another thing to bear in mind is that we have to be able to sort on this value too.

Any suggestions/ideas would be greatly appreciated

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.