Hello, and thanks in advance for any help.
I'm working on a bandwidth usage charting tool where I need to get a monthly percentile calculation, and am most of the way there - Here's what I have so far:
I have an Elastic Stack receiving Netflow data from a pfSense router. The first calculation I need is done; here's a screenshot of the result. To help explain my problem, I've zoomed way in to just 5 points on my graph:
I have two plots on that graph - The blue one is exactly what I want - It's the sum of all data usage in 5 minute intervals.
For the percentile calculation, the red line is as close as I've gotten, and it helps highlight the issue - It's taking the 95th percentile of the individual netflow entries - What I need is the 95th percentile of the 5-minute "buckets", or the points that make up the blue line.
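To spell out the difference in plain code (not Timelion), here's a minimal Python sketch of what the blue line does. The flow records, the `bucket_sums` helper, and the epoch-second timestamps are all made up for illustration; they are not my real data or anything Kibana provides.

```python
from collections import defaultdict

BUCKET_SECONDS = 300  # 5-minute buckets


def bucket_sums(flows, bucket_seconds=BUCKET_SECONDS):
    """Sum bytes per fixed-width time bucket.

    flows: iterable of (epoch_seconds, byte_count) pairs (hypothetical format).
    Returns a dict mapping bucket start time -> total bytes, ordered by time.
    """
    sums = defaultdict(int)
    for ts, nbytes in flows:
        bucket_start = ts - ts % bucket_seconds  # floor to the bucket boundary
        sums[bucket_start] += nbytes
    return dict(sorted(sums.items()))


# Invented flow records, only to show the shape of the data.
flows = [(0, 100), (60, 200), (299, 50), (300, 1000),
         (450, 10), (600, 5), (900, 700), (1000, 40)]

print(bucket_sums(flows))
```

The red line is a percentile over the individual byte counts (the second element of each record). What I need is a percentile over the values of the dict that `bucket_sums` returns.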
95th percentile gets a bit weird for a plot with only 5 points, so let's say I was looking for the 75th percentile - In this example, I'd be looking for "384387", which is the 2nd highest value out of five.
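Here's that 75th-percentile example as a small Python sketch. I'm assuming the nearest-rank definition of a percentile, which is what makes the answer the 2nd highest of five; Elasticsearch's percentiles aggregation is approximate and may interpolate, so its numbers could differ. Only 384387 comes from my graph; the other four bucket values are invented placeholders.

```python
import math


def nearest_rank_percentile(values, pct):
    """Nearest-rank percentile: the smallest value such that at least
    pct percent of the values are less than or equal to it."""
    if not values:
        raise ValueError("need at least one value")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))  # 1-based rank
    return ordered[rank - 1]


# Five 5-minute bucket sums. Only 384387 is a real value from my graph;
# the rest are hypothetical.
five_buckets = [120000, 251000, 384387, 97000, 910000]

print(nearest_rank_percentile(five_buckets, 75))
```

With five values, the 75th percentile lands on rank 4 (the 2nd highest), and the 95th lands on rank 5 (the maximum), which is why 95 looks odd on such a small sample.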
While continuing to work on this, I created a new visualization (Data Table type) that also makes me feel like I'm really close. It's basically the same thing as my Timelion graph, but as a table. It also highlights the issue in a similar way: it's very easy to get a percentile of the individual values that make up each bucket, but getting a percentile of the buckets themselves is proving challenging:
So far, I haven't found any way to accomplish this with Kibana and Timelion. Any help would be greatly appreciated!
EDIT: One more clarification: where I'm going with this is one fixed percentile value per month (down the road it will be one percentile per month per source IP, but I think I already know how to break it out that way). That percentile will be calculated from all of the 5-minute buckets for that month, and ideally I can just put that single value on the chart as a horizontal line.
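In case it helps, here's a rough Python sketch of the end result I'm describing: one percentile per calendar month, computed over that month's 5-minute bucket sums. It's only meant to pin down the calculation, not to suggest how Kibana should do it. The `monthly_bucket_percentile` name, the `(epoch_seconds, byte_count)` record format, the UTC month boundaries, and the nearest-rank percentile are all my own assumptions. Note that intervals with no flows at all are skipped here rather than counted as zero.

```python
import math
from collections import defaultdict
from datetime import datetime, timezone


def monthly_bucket_percentile(flows, pct=95, bucket_seconds=300):
    """Return {"YYYY-MM": percentile of that month's bucket sums}.

    flows: iterable of (epoch_seconds, byte_count) pairs (hypothetical format).
    """
    # Step 1: sum bytes into fixed-width buckets (the blue line).
    buckets = defaultdict(int)
    for ts, nbytes in flows:
        buckets[ts - ts % bucket_seconds] += nbytes

    # Step 2: group the bucket sums by calendar month (UTC assumed).
    per_month = defaultdict(list)
    for start, total in buckets.items():
        month = datetime.fromtimestamp(start, tz=timezone.utc).strftime("%Y-%m")
        per_month[month].append(total)

    # Step 3: nearest-rank percentile of the bucket sums within each month.
    result = {}
    for month, totals in per_month.items():
        ordered = sorted(totals)
        rank = max(1, math.ceil(pct / 100 * len(ordered)))
        result[month] = ordered[rank - 1]
    return result


# Invented records: three buckets in January 2021, one in February 2021.
JAN = 1609459200  # 2021-01-01T00:00:00Z
FEB = 1612137600  # 2021-02-01T00:00:00Z
sample = [(JAN, 10), (JAN + 10, 5), (JAN + 300, 100), (JAN + 600, 50), (FEB, 7)]

print(monthly_bucket_percentile(sample, pct=50))
```

Each month's single value is what I'd like to draw as a horizontal line across that month on the chart.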
Here's the Timelion expression from the first screenshot:
.elasticsearch(
    index="netflow-*",
    metric="sum:netflow.bytes",
    split="netflow.src_addr:1",
    kibana=true)
  .label(
    regex="^.* netflow.src_addr:(.+) > .*$",
    label="$1")
  .yaxis(
    label="bytes / sec",
    min=0)
,
.elasticsearch(
    index="netflow-*",
    metric="percentiles:netflow.bytes:95",
    split="netflow.src_addr:1",
    kibana=true)
  .label(
    regex="^.* netflow.src_addr:(.+) > .*$",
    label="$1")