95th percentile aggregation for time series like documents

nroccolsw · May 6, 2020, 11:36am

Hi there,

I am struggling to get Elasticsearch (and ultimately Grafana) to display the 95th percentile value for a given time period.

Consider the following setup:

I have an index called traffic:

POST /traffic/_search

This index stores regular time series with exactly 5 minute intervals.

Each document contains the following fields (others left out for brevity):

    {
        "@timestamp": "2020-04-02T00:00:00Z",
        ....
        "bytesInPerSecond": 1237832,
        "bytesOutPerSecond": 1232922,
        "interface": "eth0",
        "server": "my-db-server",
        ....
    },
    ....
    {
        "@timestamp": "2020-04-02T00:05:00Z",
        ....
        "bytesInPerSecond": 898239,
        "bytesOutPerSecond": 892,
        "interface": "eth1",
        ....
        "server": "my-db-server"
    }

I would like Elasticsearch to give me the 95th percentile for a given month (April in my example).

    {
      "size": 0,
      "query": {
        "bool": {
          "filter": [
            {
              "range": {
                "@timestamp": {
                  "gte": 1585699200000,
                  "lte": 1588291120000,
                  "format": "epoch_millis"
                }
              }
            },
            {
              "query_string": {
                "analyze_wildcard": true,
                "query": "server:my-db-server"
              }
            }
          ]
        }
      },
      "aggs": {
        "prepare_the_data_aggregation": {
          "date_histogram": {
            "interval": "5m",
            "field": "@timestamp",
            "min_doc_count": 0,
            "extended_bounds": {
              "min": 1585699200000,
              "max": 1588291120000
            },
            "format": "epoch_millis"
          },
          "aggs": {
            "in": {
              "sum": {
                "field": "bytesInPerSecond"
              }
            },
            "out": {
              "sum": {
                "field": "bytesOutPerSecond"
              }
            }
          }
        },
        "95th_in": {
          "percentiles_bucket" : {
            "buckets_path": "prepare_the_data_aggregation>in",
            "percents": [95]
          }
        },
        "95th_out": {
          "percentiles_bucket" : {
            "buckets_path": "prepare_the_data_aggregation>out",
            "percents": [95]
          }
        }
      }
    }

The above query works but returns all the data for the 3 aggregations: prepare_the_data_aggregation, 95th_in and 95th_out.

The data for the first aggregation, prepare_the_data_aggregation, is especially large because it contains all the 5-minute data points for the entire month.

The only information I need is the result of 95th_in and 95th_out. Is there a way for me to tell Elasticsearch that I only want those, and not the results of prepare_the_data_aggregation?
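One possible approach, offered as a sketch rather than a confirmed answer from this thread, is Elasticsearch's `filter_path` response-filtering parameter, which trims the JSON response down to the listed keys. The bucket aggregation still runs on the server; only the returned payload shrinks. In the Python sketch below, the function names `build_percentile_search`, `extract_p95` and `run` are hypothetical helpers, and the base URL passed to `run` is an assumption about the cluster address. The request body mirrors the query above.

```python
import json
from urllib import request

# Keep only the two pipeline results in the response (response filtering).
FILTER_PATH = "aggregations.95th_in,aggregations.95th_out"


def build_percentile_search(index, server, gte_ms, lte_ms):
    """Return (path, body) for the search shown in the post, with filter_path added."""
    path = f"/{index}/_search?filter_path={FILTER_PATH}"
    body = {
        "size": 0,
        "query": {"bool": {"filter": [
            {"range": {"@timestamp": {
                "gte": gte_ms, "lte": lte_ms, "format": "epoch_millis"}}},
            {"query_string": {"analyze_wildcard": True,
                              "query": f"server:{server}"}},
        ]}},
        "aggs": {
            "prepare_the_data_aggregation": {
                "date_histogram": {
                    # "interval" is copied from the post; newer Elasticsearch
                    # versions expect "fixed_interval" instead.
                    "interval": "5m",
                    "field": "@timestamp",
                    "min_doc_count": 0,
                    "extended_bounds": {"min": gte_ms, "max": lte_ms},
                    "format": "epoch_millis",
                },
                "aggs": {
                    "in": {"sum": {"field": "bytesInPerSecond"}},
                    "out": {"sum": {"field": "bytesOutPerSecond"}},
                },
            },
            "95th_in": {"percentiles_bucket": {
                "buckets_path": "prepare_the_data_aggregation>in",
                "percents": [95]}},
            "95th_out": {"percentiles_bucket": {
                "buckets_path": "prepare_the_data_aggregation>out",
                "percents": [95]}},
        },
    }
    return path, body


def extract_p95(response):
    """Pull the two 95th percentile values out of a (filtered) search response."""
    aggs = response["aggregations"]
    return aggs["95th_in"]["values"]["95.0"], aggs["95th_out"]["values"]["95.0"]


def run(base_url, path, body):
    """POST the search to the cluster; base_url is an assumption, e.g. http://localhost:9200."""
    req = request.Request(
        base_url + path,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with request.urlopen(req) as resp:
        return json.load(resp)
```

From application code this would be called as `extract_p95(run(base_url, *build_percentile_search("traffic", "my-db-server", 1585699200000, 1588291120000)))`, assuming no authentication is required on the cluster.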

Since this relies on the Percentiles Bucket Aggregation, which is a form of Pipeline Aggregation, do you know if this kind of query is also supported via Grafana?

Thanks a lot for this amazing product.

A_B · May 6, 2020, 2:23pm

Hi @nroccolsw,

Do you just want to visualize this in any tool, or do you need to be able to query the 95th percentile from Elasticsearch to use it somewhere else?

If you just need to see this then a Timelion visualization in Kibana could probably show this from the raw data.

I know nothing about Pipeline Aggregations, so I can't comment on that.

nroccolsw · May 6, 2020, 4:39pm

Hi A_B,

I primarily need this via the HTTP API to use it from our application code. So it is not ‘just visualizing’.

A_B · May 6, 2020, 4:51pm

OK, then I don't know of anything that might help.

nroccolsw · May 13, 2020, 2:22pm

Is there maybe someone else here that can help me with my question?

system · June 10, 2020, 2:22pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
