Query to calculate successrate


(Mathias) #1

Hi,

I am struggling with elasticsearch aggregation queries.
I would like to calculate successrate per hour between two different events.
I have one event with a field "name" with the value "attempt" each time an attempt is done.
Then I have two other events indicating failures and successes. Field is still "name" and value is either "Failure" or "Success". So for each attempt there is either a failure event or a success event.
So success rate is "number of successes / number of attempts".
How do I write a query that aggregates all attempts, failures and successes over an hour and then return the calculated success rate?

Kinds Regards
Mathias


(Abdon Pijpelink) #2

You could use two filter aggregations to get the number of successes and the number of attempts. Using a bucket_script aggregation, you can then calculate the ratio between these two numbers.

So, given this index definition:

PUT test
{
  "settings": {
    "number_of_shards": 1
  },
  "mappings": {
    "doc": {
      "properties": {
        "name": {
          "type": "keyword"
        },
        "timestamp": {
          "type": "date"
        }
      }
    }
  }
}

And these documents:

PUT test/doc/1
{
  "name": "attempt",
  "timestamp" : "2017-08-31T13:15:30Z"
}

PUT test/doc/2
{
  "name": "Failure",
  "timestamp" : "2017-08-31T13:16:30Z"
}

PUT test/doc/3
{
  "name": "attempt",
  "timestamp" : "2017-08-31T13:17:30Z"
}

PUT test/doc/4
{
  "name": "Failure",
  "timestamp" : "2017-08-31T13:18:30Z"
}

PUT test/doc/5
{
  "name": "attempt",
  "timestamp" : "2017-08-31T13:19:30Z"
}

PUT test/doc/6
{
  "name": "Success",
  "timestamp" : "2017-08-31T13:20:30Z"
}

You could use this aggregation to get the hourly success rate:

GET test/_search
{
  "size": 0,
  "aggs": {
    "all": {
      "date_histogram": {
        "field": "timestamp",
        "interval": "hour"
      },
      "aggs": {
        "attempts": {
          "filter": {
            "term": {
              "name": "attempt"
            }
          }
        },
        "successes": {
          "filter": {
            "term": {
              "name": "Success"
            }
          }
        },
        "succesrate": {
          "bucket_script": {
            "buckets_path": {
              "attempts": "attempts._count",
              "successes": "successes._count"
            },
            "script": "params.successes / params.attempts"
          }
        }
      }
    }
  }
}

Which would tell you the success rate in our one hour of data is 0.33 (33%):

buckets": [
        {
          "key_as_string": "2017-08-31T13:00:00.000Z",
          "key": 1504184400000,
          "doc_count": 6,
          "successes": {
            "doc_count": 1
          },
          "attempts": {
            "doc_count": 3
          },
          "succesrate": {
            "value": 0.3333333333333333
          }
        }
      ]

(Mathias) #3

Many thanks!

I love complete examples.

I have a follow up question?

What do I need to do to make es return only successrate?
I tried:

    "succesrate": {
      "bucket_script": {
        "buckets_path": {
          "attempts": "attempts._count",
          "successes": "successes._count"
        },
        "script": "params.successes / params.attempts"
      }
    },
  }
}

},
"size": 2,
"aggregations": "successrate"
}

but that does not work:

"error": {
"root_cause": [
{
"type": "parsing_exception",
"reason": "Unknown key for a VALUE_STRING in [aggregations].",
"line": 53,
"col": 19
}
],
"type": "parsing_exception",
"reason": "Unknown key for a VALUE_STRING in [aggregations].",
"line": 53,
"col": 19
},
"status": 400
}

Is it possible to return only one value?

Thanks
Mathias


(Abdon Pijpelink) #4

Do you mean you only want to return the succesrate and not the successes and attempts values?

You could filter the response to only return you that one value. Instead of hitting GET test/_search in the last example, you could filter the response to only show you the success rate:

GET test/_search?filter_path=aggregations.all.buckets.succesrate

This would return just:

{
  "aggregations": {
    "all": {
      "buckets": [
        {
          "succesrate": {
            "value": 0.3333333333333333
          }
        }
      ]
    }
  }
}

If you don't want to use response filtering and really want to have Elasticsearch return you a single value you could use a scripted metric aggregation: https://www.elastic.co/guide/en/elasticsearch/reference/5.5/search-aggregations-metrics-scripted-metric-aggregation.html

Scripted metric aggregations are much harder to write though, and may not scale that well if you have a lot of data.


(Mathias) #5

Thanks,
I think that would be what I need but I do not get the same result as you.
I jut get an empty string.

{}

Here is my output as gfx.


(Abdon Pijpelink) #6

I don't understand why you are getting that empty response. Did you maybe rename the aggregations? I notice I had a typo in succesrate. If you fixed that typo in the aggregation name, you should also fix it in the filter_path.

If not, could you please copy&paste the exact request you're executing? (The text, not as an image).


(Mathias) #7

Hi,

I do not understand either.
I have copied your query but it response still empty:

GET test/_search?filter_path=aggregation.all.buckets.succesrate
{
"size": 0,
"aggs": {
"all": {
"date_histogram": {
"field": "timestamp",
"interval": "hour"
},
"aggs": {
"attempts": {
"filter": {
"term": {
"name": "attempt"
}
}
},
"successes": {
"filter": {
"term": {
"name": "Success"
}
}
},
"succesrate": {
"bucket_script": {
"buckets_path": {
"attempts": "attempts._count",
"successes": "successes._count"
},
"script": "params.successes / params.attempts"
}
}
}
}
}
}

Br Mathias


(Mathias) #8

I missed an "s" in aggregations

Now it is working:

GET test/_search?filter_path=aggregations.all.buckets.succesrate

Many tanks
Mathias


(Mathias) #9

One more thing,

Is this kind of query possible to visualize in kibana?
If so how?

Thanks
Mathias


(Abdon Pijpelink) #10

Yes, but you wouldn't use this bucket script aggregation for that. Take a look at the Time Series Visual Builder, a relatively new way to build visualizations in Kibana. It allows you to use a "Filter Ratio" aggregation, which you could use to plot the ratio of name:Success vs name:attempt over time:

The Visual Builder has several ways of visualizing the data. You could use for example a gauge to display the current success rate:


(system) #11

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.