Hi there,
I am struggling to get Elasticsearch (and ultimately Grafana) to display the 95th percentile value for a given time period.
Consider the following setup:
I have an index called traffic:
POST /traffic/_search
This index stores regular time series at exact 5-minute intervals.
Each document contains the following fields (others left out for brevity):
{
  "@timestamp": "2020-04-02T00:00:00Z",
  ....
  "bytesInPerSecond": 1237832,
  "bytesOutPerSecond": 1232922,
  "interface": "eth0",
  "server": "my-db-server",
  ....
},
....
{
  "@timestamp": "2020-04-02T00:05:00Z",
  ....
  "bytesInPerSecond": 898239,
  "bytesOutPerSecond": 892,
  "interface": "eth1",
  ....
  "server": "my-db-server"
}
I would like Elasticsearch to give me the 95th percentile for a given month (April in my example). This is the query I use:
{
  "size": 0,
  "query": {
    "bool": {
      "filter": [
        {
          "range": {
            "@timestamp": {
              "gte": 1585699200000,
              "lte": 1588291120000,
              "format": "epoch_millis"
            }
          }
        },
        {
          "query_string": {
            "analyze_wildcard": true,
            "query": "server:my-db-server"
          }
        }
      ]
    }
  },
  "aggs": {
    "prepare_the_data_aggregation": {
      "date_histogram": {
        "interval": "5m",
        "field": "@timestamp",
        "min_doc_count": 0,
        "extended_bounds": {
          "min": 1585699200000,
          "max": 1588291120000
        },
        "format": "epoch_millis"
      },
      "aggs": {
        "in": {
          "sum": {
            "field": "bytesInPerSecond"
          }
        },
        "out": {
          "sum": {
            "field": "bytesOutPerSecond"
          }
        }
      }
    },
    "95th_in": {
      "percentiles_bucket": {
        "buckets_path": "prepare_the_data_aggregation>in",
        "percents": [95]
      }
    },
    "95th_out": {
      "percentiles_bucket": {
        "buckets_path": "prepare_the_data_aggregation>out",
        "percents": [95]
      }
    }
  }
}
The above query works, but it returns all the data for all 3 aggregations: prepare_the_data_aggregation, 95th_in and 95th_out.
The result of the first aggregation, prepare_the_data_aggregation, is especially large, as it contains every 5-minute data point for the entire month.
The only information I need is the result of 95th_in and 95th_out. Is there a way to tell Elasticsearch that I only want those, and not the results of prepare_the_data_aggregation?
Since this relies on the Percentiles Bucket Aggregation, which is a type of Pipeline Aggregation, do you know whether this kind of query is also supported in Grafana?
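On the question of returning only 95th_in and 95th_out: one option that may fit is Elasticsearch's response filtering via the filter_path URL parameter, which trims the JSON response to the listed paths while the request body stays unchanged. The sketch below only builds the request URL and does not contact a cluster. The host http://localhost:9200 and the helper name build_search_url are assumptions made for illustration; the index and aggregation names come from the query above.

```python
from urllib.parse import urlencode

# Aggregation names taken from the query in this post.
WANTED_AGGS = ["95th_in", "95th_out"]


def build_search_url(base_url, index, agg_names):
    """Hypothetical helper: build a _search URL whose filter_path
    keeps only the named aggregations in the response."""
    filter_path = ",".join(f"aggregations.{name}" for name in agg_names)
    # safe="," leaves the comma-separated path list unescaped in the URL.
    query = urlencode({"filter_path": filter_path}, safe=",")
    return f"{base_url}/{index}/_search?{query}"


# Assumed local cluster address; replace with the real one.
url = build_search_url("http://localhost:9200", "traffic", WANTED_AGGS)
print(url)
```

The same query body would then be POSTed to this URL. The date_histogram buckets are still computed on the server, since the pipeline aggregations depend on them, but they should no longer be included in the response that is sent back.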
Thanks a lot for this amazing product.