Post-processing aggregations data


(Zaar Hai) #1

Good day,

I have a (simplified) mapping that stores throughput metrics every second:

{  "sample": {
        "properties": {
            "timestamp": {  "type": "date" },
            "throughput": {  "type": "long" }
        }
}   }

I would like to calculate average throughput in megabytes per second over 1-minute buckets.

So far I have found two ways of doing this:

1. Sum aggregation with post-processing:

{  "aggs": {
    "date_agg": {
      "date_range": {
        "field": "timestamp",
        "ranges": [
          { "from": "2015-05-25T14:50:00.000Z", "to": "2015-05-25T14:51:00.000Z" }
          .... <several hundred more buckets>
        ]
      },
      "aggs": {
        "total_throughput": {
          "sum": { "field": "throughput" }
} } } } }

Then I divide the value of each bucket by (60 * 1024 * 1024) on the client side after fetching the data from ES.
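
For illustration, the client-side step could look roughly like the Python sketch below. It is not tied to any particular client library: it only assumes the search response has already been parsed into a dict, that the aggregation names match the request above (`date_agg`, `total_throughput`), that each bucket carries a `from_as_string` key, and that `throughput` is stored in bytes. The function name `mb_per_second` is made up for this example.

```python
BYTES_PER_MB = 1024 * 1024
BUCKET_SECONDS = 60  # each date_range bucket above spans one minute


def mb_per_second(response, bucket_seconds=BUCKET_SECONDS):
    """Turn per-bucket byte sums into average MB/s, rounded to 2 decimals.

    `response` is assumed to be the parsed JSON body of the search response.
    """
    results = []
    for bucket in response["aggregations"]["date_agg"]["buckets"]:
        total_bytes = bucket["total_throughput"]["value"]
        rate = total_bytes / (bucket_seconds * BYTES_PER_MB)
        # "from_as_string" is assumed to be present on each bucket
        results.append((bucket.get("from_as_string"), round(rate, 2)))
    return results
```

Since this loop only touches a few hundred buckets, its cost is negligible compared to the query itself.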

2. With a scripted metric:

{  "aggs": {
    "date_agg": {
      "date_range": {
        "field": "timestamp",
        "ranges": [
          { "from": "2015-05-25T14:50:00.000Z", "to": "2015-05-25T14:51:00.000Z" }
          .... <several hundred more buckets>
        ]
      },
      "aggs": {
        "throughput_per_sec": {
          "scripted_metric": {
            "init_script": "_agg['tp'] = 0",
            "map_script": "_agg.tp += doc['throughput'].value",
            "reduce_script": "tps = 0; for (a in _aggs) { tps += a['tp'] }; return Math.round(tps/60/1024/1024 * 100)/100"
} } } } } }

The scripted metric works great, except that it's about 4 times slower than just doing the sum and then dividing on the client side.

I am wondering if there is a way to define a script that runs on the results of an aggregation. That way I could still use the fast built-in sum aggregation and then do the final calculations in the script (which would run on only a few hundred buckets, so it's no biggie).

P.S. No, I can't just use avg, because I have slightly more complex things to calculate that obviously have no built-in aggregation I can use.

