Standard Deviations, Revisited

I'm going down one rabbit hole after another. I wonder if someone would offer some input. I have hourly data that I want to aggregate by Date and Count. Based on this data, I want to establish another column showing how many Standard Deviations the Count is away from the average. The average of the data below is 99.4, and the standard deviation is 17.4.

I've explored Data Table, Timelion, and Visual Builder. It seems that Visual Builder allows some sort of sibling pipeline standard deviation... but I am at a loss as to how to get it to work for my data, though at this point I don't think it will! Any insight would be greatly appreciated.

Date       Count   SDs
2/1/2020   119      1.1
2/2/2020   105      0.3
2/3/2020   109      0.6
2/4/2020    89     -0.6
2/5/2020    75     -1.4
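
For reference, the arithmetic behind the SDs column can be reproduced outside of Elasticsearch. This is only a minimal Python sketch using the five daily counts from the table; the variable names are made up for illustration. It assumes the 17.4 figure is the sample standard deviation (dividing by n - 1), which is what reproduces the numbers above; the population form (dividing by n) would give roughly 15.6 instead.

```python
from statistics import mean, stdev

# Daily counts copied from the table above.
counts = {
    "2/1/2020": 119,
    "2/2/2020": 105,
    "2/3/2020": 109,
    "2/4/2020": 89,
    "2/5/2020": 75,
}

values = list(counts.values())
avg = mean(values)   # average daily count
sd = stdev(values)   # sample standard deviation (n - 1 in the denominator)

# Number of standard deviations each day's count sits from the average.
sds = {day: round((count - avg) / sd, 1) for day, count in counts.items()}

for day, count in counts.items():
    print(day, count, sds[day])
```

Each SDs value is simply (Count - average) / standard deviation, so 2/1/2020 works out to (119 - 99.4) / 17.4, or about 1.1.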

And perhaps an easier question... Could this be done if I had a data set that consisted of exactly one record per day - so that the initial aggregation by day wouldn't be needed?

I have hourly data that I want to aggregate by Date and Count

What does your document structure look like? Do your documents already contain aggregated data? Why not just store all data in ES?

Here is some reference material on pipeline aggregations

The documents contain labels, timestamps and volumes... pretty much as shown in the table. The data is hourly. That is all stored in ES. The standard deviation would need to be calculated: it pertains to the average Count over the selected timeframe. And the desired "SDs" column would show how many standard deviations a particular day's Count was from the average for that period.
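
To make the two steps concrete (roll hourly volumes up to one total per day, then score each day against the average for the period), here is a rough Python sketch. The `daily_sds` function and the hourly rows are invented for illustration; the rows are made-up values chosen so that the daily totals match the table, and the real documents may well use different field names and timestamp formats.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean, stdev


def daily_sds(hourly_records):
    """Sum hourly volumes per calendar day, then score each day.

    hourly_records: iterable of (ISO-8601 timestamp string, volume) pairs.
    Returns a list of (date, daily total, SDs from the average) tuples.
    """
    totals = defaultdict(int)
    for timestamp, volume in hourly_records:
        day = datetime.fromisoformat(timestamp).date()
        totals[day] += volume

    days = sorted(totals)
    values = [totals[d] for d in days]
    avg = mean(values)
    sd = stdev(values)  # sample SD; needs at least two days and some variation

    return [
        (d.isoformat(), totals[d], round((totals[d] - avg) / sd, 1))
        for d in days
    ]


# Made-up hourly rows (two per day) whose daily totals match the table.
sample = [
    ("2020-02-01T00:00:00", 60), ("2020-02-01T01:00:00", 59),
    ("2020-02-02T00:00:00", 50), ("2020-02-02T01:00:00", 55),
    ("2020-02-03T00:00:00", 54), ("2020-02-03T01:00:00", 55),
    ("2020-02-04T00:00:00", 44), ("2020-02-04T01:00:00", 45),
    ("2020-02-05T00:00:00", 40), ("2020-02-05T01:00:00", 35),
]

for row in daily_sds(sample):
    print(*row)
```

With one record per day already in hand, the first loop becomes a no-op and only the scoring step is left.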

I suggested that a separate "daily" data set might be established if calculating standard deviations based on aggregating counts per day was not plausible.

Does this answer your question? I may be missing some nuance of what you're asking - this is all very new to me.

Many thanks for your interest!

I looked at the reference info on pipeline aggregations, and came to the following conclusion...
What I want to do can't be done through the graphical user interface, but might be doable via Elasticsearch scripting. However, the code in the aggregation reference you pointed to is beyond my comprehension.
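
For anyone landing here with the same question, one possible shape for this is sketched below in Python. It is a sketch under assumptions, not a tested recipe: the field names `@timestamp` and `volume`, the aggregation names `per_day`, `daily_count` and `count_stats`, and both helper functions are hypothetical. The request body combines a `date_histogram` with a `sum` sub-aggregation and an `extended_stats_bucket` sibling pipeline aggregation; `calendar_interval` assumes a 7.x cluster (older versions use `interval`). As far as I can tell, a per-bucket script cannot read the sibling pipeline's result, so the final division is done client-side here. The SD is also recomputed client-side with the sample formula, because the `std_deviation` returned by `extended_stats_bucket` is, to my knowledge, the population form.

```python
from statistics import mean, stdev

# Hypothetical field names; adjust to the real index mapping.
TIMESTAMP_FIELD = "@timestamp"
VOLUME_FIELD = "volume"


def build_request_body():
    """Search body: one bucket per day, summed volume, plus overall stats."""
    return {
        "size": 0,  # only aggregations are needed, not the hits themselves
        "aggs": {
            "per_day": {
                "date_histogram": {
                    "field": TIMESTAMP_FIELD,
                    "calendar_interval": "day",
                },
                "aggs": {"daily_count": {"sum": {"field": VOLUME_FIELD}}},
            },
            # Sibling pipeline aggregation over the daily sums.
            "count_stats": {
                "extended_stats_bucket": {
                    "buckets_path": "per_day>daily_count"
                }
            },
        },
    }


def sds_from_response(response):
    """Turn the per-day buckets of a search response into (date, count, SDs)."""
    buckets = response["aggregations"]["per_day"]["buckets"]
    values = [b["daily_count"]["value"] for b in buckets]
    avg = mean(values)
    sd = stdev(values)  # sample SD, matching the 17.4 quoted in this thread
    return [
        (b["key_as_string"], v, round((v - avg) / sd, 1))
        for b, v in zip(buckets, values)
    ]
```

The body returned by `build_request_body()` would be sent to the index's `_search` endpoint (for example from Kibana Dev Tools or any client), and the JSON that comes back would be passed to `sds_from_response()`.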

I've scoured the web, and am coming up blank... Do you know if there is noob-friendly reference material on Aggregations 101 that includes discussion and examples? To a massive degree, the documentation I've seen on the elastic.co site is super-technical, and short on the sort of examples/discussion/explanations that might allow me to advance my understanding.

Thanks again.
