TimeLion's MovingSTD() vs Kibana's Standard Deviation


(Forewarned) #1

What's the difference between how TimeLion's moving standard deviation works vs Kibana's standard deviations?

I thought the movingstd() would act similarly to the movingAve() and smooth out the average values. In the case below, the window for movingstd() is 2, and I multiplied the standard deviation by 2 then added it to the .es(index="someIndex", q='*', metric='avg:field'). When I compare the standard deviation graphs side-by-side, they don't really look much alike and the values aren't even close to each other (the Kibana graph is roughly 7000 units higher).

Am I misunderstanding something or missing something here?


(Matt Bargar) #2

@Rashid_Khan?


(Marco Bertani-Økland) #3

If I understand things correctly, you cannot compare these calculations "out of the box".

Kibana's standard deviation uses the buckets in the aggregation to calculate the standard deviation. That is, you have all your data in your bucket and you calculate the standard deviation; no extra parameter is necessary to make the calculation. Here I'm a bit unsure whether they calculate the uncorrected sample standard deviation or the corrected sample standard deviation (see Wikipedia's article on standard deviation), that is, whether you divide by the total number of data points or by the total number of data points minus 1. One would have to take a closer look at the implementation. But that is irrelevant for this discussion.
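For reference, these are the two textbook definitions I mean, written for a bucket containing N values x_1, ..., x_N with mean \bar{x}. The symbols are my own notation, not something taken from the Kibana or Elasticsearch code:

```latex
% Uncorrected sample standard deviation (divide by N)
s_N = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left(x_i - \bar{x}\right)^2}

% Corrected sample standard deviation (divide by N - 1)
s = \sqrt{\frac{1}{N-1} \sum_{i=1}^{N} \left(x_i - \bar{x}\right)^2},
\qquad \bar{x} = \frac{1}{N} \sum_{i=1}^{N} x_i
```

For buckets with many documents the two values are nearly identical, which is why the choice does not explain a large gap between the graphs.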

TimeLion's standard deviation doesn't use the same data. You pick a time window of a certain size, and the corrected sample standard deviation is calculated within that window. That is, your bucket of data is now defined by the time window you specify. So this is something like a pipeline aggregation (an aggregation over already aggregated data).

In your example above, when you use Kibana you calculate the standard deviation of each bucket of data. But in TimeLion, you first calculate the average of each bucket and then calculate the moving standard deviation of those averages (not of your original data in the buckets).
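To make the difference concrete, here is a minimal Python sketch of the two calculations. It is not the actual Kibana or TimeLion code: the data, the function names, the use of the uncorrected formula on the Kibana side, and the way the window trails the current point are all assumptions made for illustration.

```python
import statistics

# Hypothetical raw field values, grouped into three time buckets.
buckets = [
    [10.0, 20.0, 30.0],
    [100.0, 110.0, 120.0],
    [200.0, 400.0, 600.0],
]


def per_bucket_std(buckets):
    """Kibana-style: one standard deviation per bucket, from the raw values.

    Assumption: uncorrected (divide by N) formula via statistics.pstdev.
    """
    return [statistics.pstdev(values) for values in buckets]


def moving_std_of_averages(buckets, window):
    """TimeLion-style: average each bucket first, then take the corrected
    sample standard deviation over a trailing window of those averages.

    Assumption: points without a full window behind them yield None.
    """
    averages = [statistics.mean(values) for values in buckets]
    result = []
    for i in range(len(averages)):
        if i + 1 < window:
            result.append(None)  # not enough averages yet to fill the window
        else:
            window_values = averages[i + 1 - window : i + 1]
            result.append(statistics.stdev(window_values))
    return result


kibana_like = per_bucket_std(buckets)
timelion_like = moving_std_of_averages(buckets, window=2)
print(kibana_like)
print(timelion_like)
```

The first list describes the spread of the raw values inside each bucket, while the second describes how much the bucket averages change from one bucket to the next. The two series measure different things, so there is no reason for them to have similar values.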

Having said that, yes, there is an issue with how TimeLion calculates the standard deviation. Take a look at these GitHub issues:
https://github.com/elastic/kibana/issues/9792
https://github.com/elastic/timelion/issues/177

