Timelion: how-to [1] derivative [2] sum [3] groupby

Hi guys,

I am seeking your help and advice on this one.

This is a Timelion query (with the group-by interval set to 5 min):
$q='peer_src_ip:x.x.x.x AND ifindex_in:905', .es($q, metric='sum:bytes').multiply(8).divide(300).derivative().color(grey).lines(fill=3)
It calculates the bitrate of many (NetFlow) sessions.

Some background info about the dataset:
At random intervals (every 1 min, 2 min, 5 min, etc.) a measurement is collected that carries the specified peer_src_ip and ifindex_in fields and values.
The duration of the measurement is random, so it can last anywhere from 1 min to 1 day, but what all measurements have in common is peer_src_ip and ifindex_in.
In other words, a session consists of many measurements collected at varying intervals, and each session has a different lifetime; one may exist for 1 h, another for 1 day. But at time t, say, both are present, and the goal is to calculate the total bits/sec metric.

What I would like to do is:
[1] calculate the derivative of each measurement
[2] then sum up all the derivatives

Instead, as you can see in the screenshot, I first sum all the measurements and then take the derivative, which distorts the rate value.
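To illustrate why the order of the two steps can matter, here is a minimal Python sketch with made-up numbers (it is not Timelion itself). It assumes two sessions that report cumulative byte counters, where the second session only appears one bucket later. The helpers `diff` and `add` are hypothetical stand-ins for a point-to-point difference and a point-wise sum.

```python
def diff(series):
    """Point-to-point difference; None where either neighbour is missing."""
    out = [None]
    for prev, cur in zip(series, series[1:]):
        out.append(None if prev is None or cur is None else cur - prev)
    return out


def add(*series):
    """Sum series point by point, skipping missing values."""
    out = []
    for vals in zip(*series):
        present = [v for v in vals if v is not None]
        out.append(sum(present) if present else None)
    return out


session_a = [100, 200, 300]     # cumulative bytes, present in all buckets
session_b = [None, 1000, 1100]  # cumulative bytes, starts one bucket later

sum_then_diff = diff(add(session_a, session_b))        # [None, 1100, 200]
diff_then_sum = add(diff(session_a), diff(session_b))  # [None, 100, 200]
```

In this toy case the bucket where the second session first appears shows a jump of 1100 when summing first, even though only 100 bytes actually flowed during that interval.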

The closest I could get is:


The first value, at 11:30, is the correct one (I have double-checked it), but when more than 2 points in time are plotted, the graph is not what I expect.

Could you please advise on how to graph the above info properly?

Hi Nikos,

Are you sure you want the derivative function in this case? That will calculate the rate of change. So if your flow of bytes decreases from one measurement to the next, that will be shown as a negative value on your chart. Your first value would look OK because the derivative is the change over time, so no time has elapsed on your first data point.

Maybe you could paste a few lines of your data here?

Thanks,
Lee

Thanks @LeeDr.

There are 2 cases:

  1. when the bytes counter describes the number of bytes that passed through during the time interval of the collection

  2. when the bytes counter is an incremental counter.

In case [1] the derivative is not needed.
In case [2] the derivative is needed.
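The two cases can be sketched in Python as follows. This is only an illustration with invented numbers; the function names are hypothetical, and the 300-second interval and the factor of 8 are taken from the `.multiply(8).divide(300)` part of the query in the first post.

```python
INTERVAL_S = 300  # 5-minute buckets, matching divide(300) in the query


def bitrate_from_deltas(bytes_per_bucket, interval_s=INTERVAL_S):
    """Case 1: each value is already the bytes seen during the bucket."""
    return [b * 8 / interval_s for b in bytes_per_bucket]


def bitrate_from_counter(counter, interval_s=INTERVAL_S):
    """Case 2: values are a running total, so difference them first."""
    deltas = [cur - prev for prev, cur in zip(counter, counter[1:])]
    return [d * 8 / interval_s for d in deltas]


deltas_example = bitrate_from_deltas([37500, 75000])         # bits/sec
counter_example = bitrate_from_counter([0, 37500, 112500])   # bits/sec
```

Both calls describe the same traffic, so they yield the same rates; the counter form simply needs one extra sample to produce the first difference.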

I wish it were possible to put a derivative function where the metric definition currently says sum, and then sum all the derivatives, but I think that is not possible, right?

Unfortunately, I cannot share any data.

Ok, doing my best to understand here, but I'm a bit lost. Do you have several IPs you're trying to sum? Summing derivatives works like this:

(.es(IP:1.1.1.1).derivative(), .es(IP:2.2.2.2).derivative(), .es(IP:3.3.3.3).derivative()).sum()

Although you can write a shorter form as:

(.es(IP:1.1.1.1), .es(IP:2.2.2.2), .es(IP:3.3.3.3)).derivative().sum()

And in versions of timelion installed within the last week or so:

.es(q=IP:1.1.1.1, q=IP:2.2.2.2, q=IP:3.3.3.3).derivative().sum()

If this isn't what you're looking for it might help to provide what you'd expect to see as well as some sample data.

Thanks @Rashid_Khan for looking into this.

I have used the first 2 versions of the Timelion statement and they are correct; even when I do sum(@1,@2) across different charts, I get the same correct result.

But when are the above statements valid?
When the query and metric concern a single time series.
For example, when there is 1 time series that corresponds to q=IP:1.1.1.1 and metric='avg:bytes'.

What happens, though, when you remove the specifier IP:1.1.1.1? That effectively translates to *, which in turn means you have many time series, each of which needs the derivative applied before the sum. In that case the produced graph is not as expected.

I'll try to share a reduced set of data to demonstrate the above.

Oh, I see what you're saying. You need to create a timeseries for every possible IP, then take the derivative of each of those, then sum them at every point, resulting in a single series.

You can try the following, but it's going to be expensive, might not actually succeed, and could cause server-side instability, especially if you have lots of IPs:

.es(q='ifindex_in:905', split='IP:0', metric="avg:bytes").derivative().sum()

The split='IP:0' is the expensive part. This is going to pull back a series for every IP, run the derivative of each of them, then sum them all up to produce the single series you're looking for. If you don't need every IP, you could change it to split='IP:5' to get, say, the top 5 IPs.
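The split, derivative, sum pipeline described above can be mimicked outside Timelion with a short Python sketch. This is an approximation under stated assumptions, not how Timelion or Elasticsearch implements it: `split_derivative_sum` is a hypothetical helper, records are assumed to be `(timestamp_s, ip, counter_bytes)` tuples, and the derivative is taken as the difference between consecutive buckets that have data for a given IP.

```python
from collections import defaultdict


def split_derivative_sum(records, bucket_s=300):
    """records: iterable of (timestamp_s, ip, counter_bytes) tuples."""
    # 1. split: one series per IP, collecting counter values per time bucket
    per_ip = defaultdict(lambda: defaultdict(list))
    for ts, ip, value in records:
        per_ip[ip][ts // bucket_s * bucket_s].append(value)

    total = defaultdict(float)
    for buckets in per_ip.values():
        times = sorted(buckets)
        # average within each bucket, like metric="avg:bytes"
        avg = {t: sum(buckets[t]) / len(buckets[t]) for t in times}
        # 2. derivative: difference between consecutive buckets of this IP
        for prev, cur in zip(times, times[1:]):
            # 3. sum: accumulate every IP's difference at the same bucket
            total[cur] += avg[cur] - avg[prev]
    return dict(sorted(total.items()))


records = [
    (0, "1.1.1.1", 100), (300, "1.1.1.1", 200), (600, "1.1.1.1", 300),
    (300, "2.2.2.2", 1000), (600, "2.2.2.2", 1100),
]
result = split_derivative_sum(records)  # {300: 100.0, 600: 200.0}
```

Because every IP keeps its own series in memory until the final sum, the cost grows with the number of distinct IPs, which is the same reason the unbounded split is described as expensive.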

Thanks @Rashid_Khan. Indeed, I did notice the split (terms) when it was announced on Twitter :slight_smile:
As I am still figuring out the relationships in the data I have, could you possibly advise:

Is the top-5 specifier applied from the metric point of view? In other words, does it return the 5 largest values of the bytes metric, in order?

I know that you are super busy, but I just wanted to ask whether there are any plans to allow a callback to be applied across many time series (as specified by the query) rather than just one, thus avoiding the use of the terms filter. Or is it out of scope?

*Something like "sum by" in Prometheus.

It will return the top 5 by number of times the term occurs in the set. Elasticsearch is removing ordering by a sub-aggregation (#17588), so I'm not going to implement it in Timelion only to have it removed.
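As a rough illustration of "top N by number of occurrences" (as opposed to top N by the value of a metric), here is a small Python sketch. The helper `top_terms` and the sample IPs are invented for the example; it only mirrors the ordering idea, not the Elasticsearch terms aggregation itself.

```python
from collections import Counter


def top_terms(values, n):
    """Return the n most frequent terms, ranked by number of occurrences."""
    return [term for term, _ in Counter(values).most_common(n)]


# One entry per document; the bytes value of each document plays no role here.
ips = ["1.1.1.1", "2.2.2.2", "1.1.1.1", "3.3.3.3", "1.1.1.1", "2.2.2.2"]
top2 = top_terms(ips, 2)  # ["1.1.1.1", "2.2.2.2"]
```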

I'm not sure what you mean by call back in many time series, but no, no plans for any sort of callback functionality immediately.

That said, some examples of how it functions would be useful, I haven't used Prometheus. My goal however is to keep things simple and avoid any programmery type stuff :slight_smile:

Yes. Indeed you are right. Thanks a lot @Rashid_Khan