Hi everyone! I'm a relative newcomer to a project that has a big Elasticsearch cluster with netflow data in it. We've been tasked with coming up with analytics, and one that has been suggested is some way of comparing a total value (for example, packets sent) for a day to previous days or an average of 30 days to find a degree of change.
Now I've worked a bit with aggregations in Kibana, so I know there are ways to sum up all the values in hits that have the correct date. The question is how to take that sum and compare it to another sum to produce something that represents the change. So the process I'm looking to produce is like this:
1. Do a search for all the hits that relate to the current day
2. Add up all the values in those hits for packets
3. Store that value (somewhere)
4. Do the same with the previous day
5. Compare the two totals to get a new value representing the change "since yesterday"
6. Display the change (maybe in Kibana?)
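To make the comparison step concrete, here is a rough Python sketch of the arithmetic, assuming the per-day packet totals have already been pulled out of Elasticsearch into a dict. The function names and the sample numbers are made up for illustration and are not from the cluster.

```python
from datetime import date, timedelta


def change_since_yesterday(daily_totals, today):
    """Return (absolute change, percent change) of today's total vs. yesterday's."""
    current = daily_totals[today]
    previous = daily_totals[today - timedelta(days=1)]
    diff = current - previous
    # Percent change is undefined when yesterday's total is zero.
    pct = diff / previous * 100 if previous else None
    return diff, pct


def change_vs_average(daily_totals, today, window=30):
    """Compare today's total with the mean of up to `window` preceding days."""
    prior_days = (today - timedelta(days=i) for i in range(1, window + 1))
    prior = [daily_totals[d] for d in prior_days if d in daily_totals]
    if not prior:
        return None
    baseline = sum(prior) / len(prior)
    return daily_totals[today] - baseline


# Hypothetical daily packet totals (illustrative values only).
totals = {
    date(2024, 1, 1): 1000,
    date(2024, 1, 2): 1200,
    date(2024, 1, 3): 900,
}

diff, pct = change_since_yesterday(totals, date(2024, 1, 3))
avg_diff = change_vs_average(totals, date(2024, 1, 3))
print(diff, pct, avg_diff)
```

In practice the "store that value" step may not be needed at all if the aggregation framework can compute the totals and the comparison in one request.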
One person on our team recommended the use of Nested Aggregations, but I'd love to hear how some experienced users might approach this. What method would you use to try to work this out? Have you done something similar before?
I'm sorry the request isn't very specific; as I mentioned at the beginning, I'm pretty new to this, so I'm not sure what information would be needed. If there's something else that would be useful, let me know and I'll try to post it. I appreciate any advice. Thanks!
You could use a nested pipeline aggregation. The derivative aggregation will give you exactly what you're looking for: the change from one bucket (yesterday) to another (today). The docs have a nice example.
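As a minimal sketch of what such a request body could look like, the snippet below builds it as a Python dict and prints it as JSON. The field names `@timestamp` and `packets` and the aggregation names `per_day`, `total_packets`, and `change` are assumptions, so swap in whatever your netflow mapping actually uses.

```python
import json


def build_daily_derivative_query(timestamp_field="@timestamp", value_field="packets"):
    """Build a search body: one bucket per day, a sum per bucket, and the day-over-day change."""
    return {
        "size": 0,  # only the aggregation results are needed, not the hits
        "aggs": {
            "per_day": {
                # Recent Elasticsearch versions use "calendar_interval";
                # older releases use "interval" instead.
                "date_histogram": {
                    "field": timestamp_field,
                    "calendar_interval": "day",
                },
                "aggs": {
                    # Total packets within each daily bucket.
                    "total_packets": {"sum": {"field": value_field}},
                    # Difference between this bucket's sum and the previous bucket's sum.
                    "change": {"derivative": {"buckets_path": "total_packets"}},
                },
            }
        },
    }


query = build_daily_derivative_query()
print(json.dumps(query, indent=2))
```

The first daily bucket in the response will not carry a `change` value, since there is no earlier bucket to compare against.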
The derivative pipeline aggregation is also exposed in the Kibana user interface when building a visualization.
Okay, it looks like abdon's suggestion to use the Derivative function was successful at showing the difference from one moment to the next (thank you again!). Now my question is if there's a way I can open up that function to more complex calculations. So whereas the function for derivative is a basic t(n+1) - t(n) (I THINK I wrote that right...), is there a way to access the function so that I could calculate other things (like a percentage change, for example)? In other words, right now Derivative aggregations are a fixed operation. I want to find some kind of editable function. Does that exist?
While I was awaiting a response, I found out about the bucket script aggregation. I've been playing around with it, and I might be able to work out SOMETHING that meets my goals. But it's a bit disheartening to hear it's not available in Kibana, since that was our ultimate goal. Perhaps we'll have to reconsider the goal. Hm.
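For anyone following along, here is a hedged sketch of how a bucket script could turn the derivative into a percentage change. It assumes a daily date histogram whose sub-aggregations are named `total_packets` (a sum) and `change` (a derivative of that sum); those names and the `packets` field are my own placeholders. Since a bucket script only sees values from its own bucket, the previous day's total is recovered as total minus change.

```python
def add_percent_change(per_day_aggs):
    """Return a copy of the daily sub-aggregations with a percent_change bucket_script added."""
    aggs = dict(per_day_aggs)
    aggs["percent_change"] = {
        "bucket_script": {
            # Map script parameter names to sibling aggregations in the same bucket.
            "buckets_path": {"total": "total_packets", "diff": "change"},
            # previous = total - diff, so percent change = diff / previous * 100.
            # Note: this script does not guard against a previous total of zero.
            "script": "params.diff / (params.total - params.diff) * 100",
        }
    }
    return aggs


def percent_change(total, diff):
    """Plain-Python version of the same formula, handy for checking results by hand."""
    previous = total - diff
    return diff / previous * 100 if previous else None


# Assumed sub-aggregations (names are placeholders, not from the original cluster).
sub_aggs = {
    "total_packets": {"sum": {"field": "packets"}},
    "change": {"derivative": {"buckets_path": "total_packets"}},
}

with_pct = add_percent_change(sub_aggs)
check = percent_change(900, -300)
print(with_pct["percent_change"], check)
```

The resulting dict would go under the `aggs` key of the daily date histogram, next to the sum and the derivative.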