Good day,
I have a (simplified) mapping that stores throughput metrics every second:
{ "sample": {
"properties": {
"timestamp": { "type": "date" },
"throughput": { "type": "long" }
}
} }
I would like to calculate average throughput in megabytes per second over 1-minute buckets.
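Concretely, the per-bucket number I'm after is just the sum of the 60 per-second samples converted to MB/s (throughput here is a byte count per second). A toy sketch of the arithmetic, in Python:

# Toy illustration of the per-bucket calculation I want: 60 one-second
# samples (bytes/sec) averaged over a minute and converted to MB/s.
samples = [1200000] * 60                            # one minute of fake per-second samples, in bytes
avg_mb_per_sec = sum(samples) / 60.0 / 1024 / 1024  # average bytes/sec -> MB/s
print(round(avg_mb_per_sec, 2))                     # -> 1.14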
So far I found two ways of doing this:
1. Sum aggregation with post-processing:
{ "aggs": {
"date_agg": {
"date_range": {
"field": "timestamp",
"ranges": [
{ "from": "2015-05-25T14:50:00.000Z", "to": "2015-05-25T14:51:00.000Z" }
.... <several hundred more buckets>
]
},
"aggs": {
"total_throughput": {
"sum": { "field": "timestamp" }
} } } } }
And then divide the value of each bucket by (60 * 1024 * 1024) on the client side after fetching the data from ES (a rough sketch of this step is shown in Python after the second option below).
2. With a scripted metric:
{ "aggs": {
"date_agg": {
"date_range": {
"field": "timestamp",
"ranges": [
{ "from": "2015-05-25T14:50:00.000Z", "to": "2015-05-25T14:51:00.000Z" }
.... <several hundred more buckets>
]
},
"aggs": {
"thoughput_per_sec": {
"scripted_metric": {
"init_script": "_agg['tp'] = 0",
"map_script": "_agg.tp += doc['throughput'].value",
"reduce_script": "tps = 0; for (a in _aggs) { tps += a['tp'] }; return Math.round(tps/60/1024/1024 * 100)/100"
} } } } } }
The scripted metric works great, except that it's about 4 times slower than just doing the sum and dividing on the client side.
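For reference, the client-side step in option 1 is nothing fancy; roughly this (the Python client is assumed, and the index name and request variable are placeholders for my real ones):

# Rough sketch of the client-side division from option 1.
# "metrics" and sum_query are placeholders for my real index and the
# sum-aggregation request shown above.
from elasticsearch import Elasticsearch

es = Elasticsearch()
resp = es.search(index="metrics", body=sum_query)

mb_per_sec = {}
for bucket in resp["aggregations"]["date_agg"]["buckets"]:
    total_bytes = bucket["total_throughput"]["value"]   # sum of bytes over the minute
    mb_per_sec[bucket["from_as_string"]] = round(total_bytes / 60.0 / 1024 / 1024, 2)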
I'm wondering if there is a way to define a script that runs on the results of an aggregation. That way I could still use the fast built-in sum aggregation and then do the final calculations in a script (which would only run over a few hundred buckets, so it's no big deal).
P.S. No, I can't just use avg, because I have slightly more complex things to calculate that have no built-in aggregation I can use.