I'm going down one rabbit hole after another. I wonder if someone would offer some input. I have hourly data that I want to aggregate by Date and Count. Based on this data, I want to establish another column showing how many Standard Deviations the Count is away from the average. The average of the data below is 99.4, and the standard deviation is 17.4.
I've explored Data Table, Timelion, and Visual Builder. It seems that Visual Builder allows some sort of sibling pipeline standard deviation... but I am at a loss as to how to get it to work for my data, and at this point I don't think it will! Any insight would be greatly appreciated.
And perhaps an easier question... Could this be done if I had a data set that consisted of exactly one record per day - so that the initial aggregation by day wouldn't be needed?
The documents contain labels, timestamps and volumes... pretty much as shown in the table. The data is hourly, and all of it is stored in ES. The standard deviation would need to be calculated: it is the standard deviation of the daily Counts over the selected timeframe. The desired "SDs" column would then show how many standard deviations a particular day's Count is from the average daily Count over that timeframe.
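To make the calculation concrete, here is a minimal sketch of it in plain Python, outside of Elasticsearch and Kibana. The record layout (an ISO timestamp string plus a volume), the function name `daily_sds`, and the sample values are all made up for illustration; they are not my real data. It also assumes the population standard deviation is the one wanted; swap in `statistics.stdev` for the sample version.

```python
from collections import defaultdict
from statistics import mean, pstdev


def daily_sds(hourly_records):
    """Return {day: (count, sds)} from (iso_timestamp, volume) pairs.

    'sds' is how many standard deviations the day's count lies from
    the average daily count over the whole set of records.
    """
    # Step 1: roll the hourly volumes up into one count per day.
    totals = defaultdict(int)
    for timestamp, volume in hourly_records:
        day = timestamp[:10]  # "YYYY-MM-DD" prefix of an ISO timestamp
        totals[day] += volume

    # Step 2: average and (population) standard deviation of the daily counts.
    counts = list(totals.values())
    avg = mean(counts)
    sd = pstdev(counts)

    # Step 3: distance of each day from the average, in units of sd.
    return {
        day: (count, (count - avg) / sd if sd else 0.0)
        for day, count in sorted(totals.items())
    }


if __name__ == "__main__":
    # Hypothetical hourly records, for illustration only.
    sample = [
        ("2018-01-01T00:00:00", 40),
        ("2018-01-01T01:00:00", 60),
        ("2018-01-02T00:00:00", 80),
        ("2018-01-03T00:00:00", 120),
    ]
    for day, (count, sds) in daily_sds(sample).items():
        print(day, count, round(sds, 2))
```

If the data already had exactly one record per day, step 1 would simply pass each record through unchanged, and steps 2 and 3 would stay the same.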
I suggested that a separate "daily" data set might be established if calculating standard deviations based on aggregating counts per day was not plausible.
Does this answer your question? I may be missing some nuance of what you're asking - this is all very new to me.
I looked at the reference info on pipeline aggregations, and came to the following conclusions...
What I want to do can't be done through the graphical user interface, but might be doable via Elasticsearch scripting. However, the code in the aggregation reference you pointed to is beyond my comprehension.
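As far as I can piece it together, one possible approach is to let Elasticsearch do the daily roll-up and the statistics, and then compute the "SDs" column on the client side. The sketch below is only that, a sketch: the field names `timestamp` and `volume` and the aggregation names `per_day`, `daily_count` and `count_stats` are my assumptions, not anything taken from my index. It uses a `date_histogram` with a `sum` sub-aggregation plus the `extended_stats_bucket` sibling pipeline aggregation. Older Elasticsearch versions use `interval` where newer ones use `calendar_interval`, so that key may need changing.

```python
def build_query(timestamp_field="timestamp", volume_field="volume"):
    """Request body: daily sums plus stats across those daily sums.

    Field names are assumptions; replace them with the real mapping.
    """
    return {
        "size": 0,  # only the aggregations are needed, not the hits
        "aggs": {
            "per_day": {
                # On older versions, use "interval" instead.
                "date_histogram": {
                    "field": timestamp_field,
                    "calendar_interval": "day",
                },
                "aggs": {
                    "daily_count": {"sum": {"field": volume_field}}
                },
            },
            # Sibling pipeline aggregation over the per-day sums.
            "count_stats": {
                "extended_stats_bucket": {
                    "buckets_path": "per_day>daily_count"
                }
            },
        },
    }


def sds_from_response(response):
    """Turn a search response (as a dict) into (day, count, sds) rows."""
    aggs = response["aggregations"]
    avg = aggs["count_stats"]["avg"]
    sd = aggs["count_stats"]["std_deviation"]
    rows = []
    for bucket in aggs["per_day"]["buckets"]:
        count = bucket["daily_count"]["value"]
        sds = (count - avg) / sd if sd else 0.0
        rows.append((bucket["key_as_string"], count, sds))
    return rows
```

The body from `build_query()` would be sent to the index's `_search` endpoint, and the parsed JSON reply passed to `sds_from_response()`. This still leaves the "SDs" column outside of Kibana, which is the part I have not found a way around.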
I've scoured the web, and am coming up blank... Do you know if there is noob-friendly reference material on Aggregations 101 that includes discussion and examples? To a massive degree, the documentation I've seen on the elastic.co site is super-technical, and short on the sort of examples/discussion/explanations that might allow me to advance my understanding.