New to ES/Compare daily data to previous date's data

Hi everyone! I'm a relative newcomer to a project that has a big Elasticsearch cluster with netflow data in it. We've been tasked with coming up with analytics, and one that has been suggested is some way of comparing a total value (for example, packets sent) for a day to previous days or an average of 30 days to find a degree of change.

Now I've worked a bit with aggregations in Kibana, so I know there are ways to sum up all the values in hits that have the correct date. The question is how to take that sum and compare it to another sum to produce something that represents the change. So the process I'm looking to produce is like this:

  • Do a search for all the hits that relate to the current day
  • Add up all the values in those hits for packets
  • Store that value (somewhere)
  • Do the same with the previous day
  • Compare the two totals to get a new value representing the change "since yesterday"
  • Display the change (maybe in Kibana?)
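To make the steps above concrete, here is a minimal plain-Python sketch of the arithmetic only, with no Elasticsearch involved. The hit structure, the `day` and `packets` keys, and the sample numbers are all made up for illustration and are not taken from the actual netflow index.

```python
from datetime import date
from statistics import mean

def daily_totals(hits):
    """Sum the packet counts of all hits, grouped by calendar day."""
    totals = {}
    for hit in hits:
        totals[hit["day"]] = totals.get(hit["day"], 0) + hit["packets"]
    return totals

# Hypothetical sample hits; real netflow documents will look different.
hits = [
    {"day": date(2018, 1, 1), "packets": 100},
    {"day": date(2018, 1, 1), "packets": 50},
    {"day": date(2018, 1, 2), "packets": 180},
    {"day": date(2018, 1, 3), "packets": 210},
]

totals = daily_totals(hits)
today = date(2018, 1, 3)
yesterday = date(2018, 1, 2)

# Change "since yesterday": today's total minus yesterday's total.
change_since_yesterday = totals[today] - totals[yesterday]

# Change against the average of all earlier days (a stand-in for a 30-day average).
earlier_totals = [total for day, total in totals.items() if day < today]
change_vs_average = totals[today] - mean(earlier_totals)
```

In practice the summing would be done by Elasticsearch aggregations rather than in client code; the sketch only shows which numbers are being compared.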

One person on our team recommended the use of Nested Aggregations, but I'd love to hear how some experienced users might approach this. What method would you use to try to work this out? Have you done something similar before?

I'm sorry the request isn't very specific; as I mentioned at the beginning, I'm pretty new to this, so I'm not sure what information would be needed. If there's something else that would be useful, let me know and I'll try to post it. I appreciate any advice. Thanks!

You could use a nested pipeline aggregation. The derivative aggregation will give you exactly what you're looking for: the change from one bucket (yesterday) to another (today). The docs have a nice example.
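As a rough sketch of what such a request body might look like, the Python snippet below builds one as a plain dictionary: a daily `date_histogram`, a `sum` per day, and a `derivative` over that sum. The field names `@timestamp` and `packets` and the aggregation names are assumptions, not taken from the actual index. The `calendar_interval` parameter applies to newer Elasticsearch versions; older versions use `interval` instead.

```python
import json

def build_daily_derivative_query(timestamp_field="@timestamp", packets_field="packets"):
    """Build a search body: one bucket per day, total packets, and day-over-day change."""
    return {
        "size": 0,  # only the aggregation results are needed, not the hits
        "aggs": {
            "per_day": {
                "date_histogram": {
                    "field": timestamp_field,      # assumed timestamp field name
                    "calendar_interval": "1d",     # "interval" on older versions
                },
                "aggs": {
                    # Total packets for each daily bucket (assumed field name).
                    "total_packets": {"sum": {"field": packets_field}},
                    # Difference between this bucket's total and the previous one.
                    "packets_change": {
                        "derivative": {"buckets_path": "total_packets"}
                    },
                },
            }
        },
    }

query = build_daily_derivative_query()
print(json.dumps(query, indent=2))
```

The printed JSON could be sent to the `_search` endpoint of the relevant index. The first daily bucket will have no `packets_change` value, since there is no previous bucket to compare against.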

The derivative pipeline aggregation is also exposed in the Kibana user interface when building a visualization.

I'll take a look at that, abdon. Thank you for your advice!

Okay, it looks like abdon's suggestion to use the Derivative function was successful at showing the difference from one moment to the next (thank you again!). Now my question is if there's a way I can open up that function to more complex calculations. So whereas the function for derivative is a basic t(n+1) - t(n) (I THINK I wrote that right...), is there a way to access the function so that I could calculate other things (like a percentage change, for example)? In other words, right now Derivative aggregations are a fixed operation. I want to find some kind of editable function. Does that exist?
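For reference, the two calculations being contrasted here can be written out in a few lines of plain Python. This is only an illustration of the arithmetic over a list of daily totals, with made-up numbers; it is not how Elasticsearch implements the derivative internally.

```python
def derivative(values):
    """Difference between each value and the one before it: t(n) - t(n-1)."""
    return [curr - prev for prev, curr in zip(values, values[1:])]

def percent_change(values):
    """The same difference, expressed as a percentage of the previous value."""
    return [(curr - prev) * 100 / prev for prev, curr in zip(values, values[1:])]

# Hypothetical daily packet totals for three consecutive days.
daily_packets = [200, 250, 200]

diffs = derivative(daily_packets)
pcts = percent_change(daily_packets)
```

Note that `percent_change` divides by the previous day's total, so a day with zero packets would need special handling.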

Some of the things you're trying to achieve could perhaps be done with a bucket script aggregation. The docs have an example where a script is used to calculate monthly sales as a percentage of the total sales. However, you may not be able to do everything you want to do with a script (your example of a percentage change would be hard to implement, I think). Also, bucket script aggregations are not exposed in the Kibana UI.
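One possible way to approach a percentage change, offered as an untested sketch rather than a confirmed solution, is to chain a `bucket_script` onto a `derivative`: the previous day's total can be recovered as today's total minus the difference. The field names, aggregation names, and script below are assumptions for illustration, and the script does not handle a previous total of zero.

```python
import json

def build_percent_change_query(timestamp_field="@timestamp", packets_field="packets"):
    """Build a search body with a daily sum, its derivative, and a percent-change script."""
    return {
        "size": 0,
        "aggs": {
            "per_day": {
                "date_histogram": {
                    "field": timestamp_field,      # assumed timestamp field name
                    "calendar_interval": "1d",     # "interval" on older versions
                },
                "aggs": {
                    "total_packets": {"sum": {"field": packets_field}},
                    "packets_change": {
                        "derivative": {"buckets_path": "total_packets"}
                    },
                    # previous total = total - diff, so percent change = diff / previous * 100
                    "packets_percent_change": {
                        "bucket_script": {
                            "buckets_path": {
                                "total": "total_packets",
                                "diff": "packets_change",
                            },
                            "script": "params.diff / (params.total - params.diff) * 100",
                        }
                    },
                },
            }
        },
    }

percent_query = build_percent_change_query()
print(json.dumps(percent_query, indent=2))
```

Because the first bucket has no derivative, the script would have nothing to compute there, so that bucket would be expected to come back without a `packets_percent_change` value.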

Thank you again, abdon!

While I was awaiting a response, I found out about the bucket script aggregation. I've been playing around with it, and I might be able to work out SOMETHING that meets my goals. But it's a bit disheartening to hear it's not available in Kibana, since that was our ultimate goal. Perhaps we'll have to reconsider the goal. Hm.

Either way, thanks again for the input!

You may want to check out the Kibana Time Series Visual Builder: https://www.elastic.co/guide/en/kibana/current/time-series-visual-builder.html. This does expose some scripting options.
