Graphing Interface Utilization with Timelion or TSVB

Hi,

I'm trying to follow this blog post to normalize network traffic collected by the SNMP filter (ifHCInOctets/ifHCOutOctets) on a 7.12 stack. I'm sure I had this working on a previous version 6 months or so ago.

At the first step

.es(index='logstash-snmp-*', '_exists_:router.interface.out.bytes').if(eq, 0, null, .es(index='logstash-snmp-*', metric='sum:router.interface.out.bytes'))

I'm getting the following error:
[Screenshot: Timelion Error]

These interface counters have been running for a while. I've tried preserving the number by converting it to an integer in Logstash and by formatting it as integer/Bytes/bitd in the Kibana index pattern, but whatever I do I see values such as 570.8GB in the Discover panel. I'm concerned that there won't be enough difference to graph a derivative between samples when the values are summarized as GB.

I've tried visualizing with TSVB using Avg/Derivative/Positive Only at a 1s interval but get dots against an unrealistic scale.

Any help appreciated,

Nick

Hi @nickr

In TSVB there is a new Counter Rate aggregation specifically for counters like network bytes in/out. It does the max, derivative, and positive-only steps all in one aggregation.
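
For anyone wondering what that chain of steps amounts to, here is a minimal Python sketch of max, then derivative, then positive only. It is not Kibana's implementation: the function name `counter_rate`, the bucket handling, and the choice to clamp a counter reset to 0 are all assumptions made for illustration.

```python
def counter_rate(samples, bucket_seconds):
    """samples: iterable of (epoch_seconds, counter_value) pairs.

    Returns {bucket_start: units_per_second}.
    """
    # Step 1 (max): keep the highest counter value seen in each time bucket.
    bucket_max = {}
    for ts, value in samples:
        start = int(ts // bucket_seconds) * bucket_seconds
        bucket_max[start] = max(value, bucket_max.get(start, value))

    # Step 2 (derivative) and step 3 (positive only).
    rates = {}
    ordered = sorted(bucket_max.items())
    for (prev_start, prev_val), (start, val) in zip(ordered, ordered[1:]):
        delta = val - prev_val
        if delta < 0:
            delta = 0  # assumption: treat a counter reset as "no data" rather than a negative rate
        rates[start] = delta / (start - prev_start)
    return rates


# Hypothetical 30-second polls at a steady 100 bytes/s, with a counter reset at t=120.
samples = [(0, 1000), (30, 4000), (60, 7000), (90, 10000),
           (120, 200), (150, 3200), (180, 6200), (210, 9200)]
print(counter_rate(samples, 60))  # {60: 100.0, 120: 0.0, 180: 100.0}
```

The reset at t=120 would show up as a large negative spike with a plain derivative, which is why the positive-only step matters for interface counters.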

You should be able to do these network bytes in and out in Lens as well; there's a counter rate in Lens now.

With respect to the gigabyte (GB) number format, that's the automatic formatting that Kibana applies for display. I think if you look at the raw underlying number you will see that it is a long holding the actual number of bytes. If not, you will have a problem.

I would try TSVB or Lens with the Counter Rate before Timelion, which is a bit more complicated.

One gotcha: if there is more than one interface, for example, you will need to add the following below the Counter Rate in TSVB.
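
To illustrate why display formatting should not affect a derivative, here is a small Python sketch. The formatter `format_bytes` and the two counter values are hypothetical (they are not Kibana's formatter or real data); the point is only that two raw values can render identically while their difference is fully preserved.

```python
def format_bytes(n):
    """Human-readable rendering with binary units and one decimal (illustrative only)."""
    units = ["B", "KB", "MB", "GB", "TB"]
    value = float(n)
    for unit in units:
        if value < 1024 or unit == units[-1]:
            return f"{value:.1f}{unit}"
        value /= 1024


# Two hypothetical raw counter readings taken 30 seconds apart.
first = 612_890_000_000
second = 612_893_750_000

print(format_bytes(first), format_bytes(second))  # 570.8GB 570.8GB
print((second - first) / 30 * 8)                  # 1000000.0 (bits per second)
```

Both readings display as 570.8GB, yet the stored longs still differ by 3,750,000 bytes, which is a steady 1 Mbit/s over 30 seconds.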

Series Agg : Sum
Group by Term : Interface Name
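
The reason for grouping by interface can be sketched in Python: a counter delta only makes sense between two readings of the same interface, so rates are worked out per interface first and summed afterwards if a single total is wanted. The interface names and record layout below are hypothetical, and this is only an illustration of the idea, not what TSVB runs internally.

```python
from collections import defaultdict


def per_interface_rates(records):
    """records: (epoch_seconds, interface_name, counter) tuples.

    Returns {interface_name: [(timestamp, units_per_second), ...]}.
    """
    by_iface = defaultdict(list)
    for ts, iface, counter in sorted(records):
        by_iface[iface].append((ts, counter))

    rates = {}
    for iface, points in by_iface.items():
        rates[iface] = [
            (t1, max(c1 - c0, 0) / (t1 - t0))  # positive only, per interface
            for (t0, c0), (t1, c1) in zip(points, points[1:])
        ]
    return rates


def total_rate(rates):
    """Sum the per-interface rates that share a timestamp."""
    totals = defaultdict(float)
    for series in rates.values():
        for ts, rate in series:
            totals[ts] += rate
    return dict(totals)


records = [
    (0, "Gi0/0", 1_000), (60, "Gi0/0", 7_000),
    (0, "Gi0/1", 500_000), (60, "Gi0/1", 503_000),
]
rates = per_interface_rates(records)
print(rates["Gi0/0"], rates["Gi0/1"])  # [(60, 100.0)] [(60, 50.0)]
print(total_rate(rates))               # {60: 150.0}
```

Taking a derivative across the mixed readings instead (1,000 then 500,000 then 7,000) would produce meaningless jumps, which is the problem the group-by avoids.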

Thanks for such a quick and helpful response Stephen.
I'll let you know how I go.
Cheers
Nick

Thanks Stephen, I've not got anywhere with the counter rates.

What I am trying to achieve is to have separate dashboards for each router, linked to a dashboard giving details of each of its interesting interfaces. Obviously the bits in/bits out is a fairly fundamental visualization. The SNMP records do not arrive in neat 1 min intervals; the arrival rate varies.

At the Kibana index pattern level the in/out fields are currently both set to Bytes (default format). I've tried both a Lens visualization and a TSVB Metric set to numbers and Bytes. It would appear that my only option in Lens is to set the formatting at the Kibana level, whilst TSVB allows formatting at the Kibana level and at a custom level to convert to 'bitd'.

TSVB config is below. The IN is a clone with the 3 fields changed to 'in.bytes'

Lens config as 'Line' is showing dots or nothing

Lens is normalized at 1s whilst TSVB is set to 1s at panel level and auto at the metric level. It appears that setting the interval at the metric level merely widens the base of a vertical line to become a steep-sided triangle. There is a huge discrepancy in shape and scale between the Lens and TSVB graphs, but neither renders a graph that extrapolates between any 2 records. Either way I would expect the in and out bits for the filtered interface to be around 1 and 3 Mbps respectively.

I am sure bits in/out of a filtered interface shouldn't be this hard. What am I doing wrong?

Hi @nickr

Whole lot going on here... Let's start from the easy and work towards the harder (this part assumes the data is good).

  1. Lens: If you select a larger time interval in Lens I think you will see what you want to see. (Insert long discussion on histogram bucket sizes.) In short, most visualizations want about 30+ buckets to draw lines in a graph, otherwise they draw points. Your sample interval appears to be 2+ mins, so you may need to zoom much farther out. I can guess why you got that little section of a line, but I think that is an artifact.

I think this will work for you if you just zoom out.

In short, with Lens and TSVB you want to set the interval to >= your collection interval. So if your data is coming in about every 2 mins, the interval should be at least 2 mins. This is super important, especially with counters, derivatives, etc.

  1. TSVB: First, I stand corrected: you don't need to use the Series Agg : Sum with the new Counter Rate.

(For the record, you just used plain old Aggregation : Sum, not Aggregation : Series Agg with Function : Sum, and I was thinking wrong on that. That method is how to show the SUM of ALL the interfaces as a single value, not each interface on its own line. Apologies on that.)

  1. Insert the same discussion on histogram buckets. (we will come back to your actual data in a bit)

In short, with TSVB you want to set the interval to >= your collection interval. So if your data is coming in about every 2 mins, the interval should be at least 2 mins. This is super important, especially with counters, derivatives, etc.
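
A rough way to see why the interval matters is to count how many neighbouring buckets both contain a sample. This Python sketch assumes samples every 120 seconds over one hour, and assumes (as a simplification of how the charts behave) that a line segment or a derivative needs two adjacent buckets with data. The function name is made up for the example.

```python
def adjacent_filled_pairs(sample_times, bucket_seconds, window_seconds):
    """Count neighbouring bucket pairs that both contain at least one sample."""
    n_buckets = window_seconds // bucket_seconds
    filled = [False] * n_buckets
    for ts in sample_times:
        idx = int(ts // bucket_seconds)
        if 0 <= idx < n_buckets:
            filled[idx] = True
    return sum(1 for a, b in zip(filled, filled[1:]) if a and b)


sample_times = range(0, 3600, 120)  # one sample every 2 minutes for an hour

print(adjacent_filled_pairs(sample_times, 1, 3600))    # 0
print(adjacent_filled_pairs(sample_times, 120, 3600))  # 29
```

With 1-second buckets every sample sits alone among empty buckets, so there is nothing to connect and nothing to difference against. With 120-second buckets every bucket has a neighbour with data.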

So here is what mine looks like....

And I made sure the labels were correct...

Finally and perhaps most important...

I am not an SNMP expert, and you say that the data is coming in at uneven intervals. I am not sure how you are ingesting the data. I think it should be OK as long as the difference between the timestamps equals the time over which the counter actually changed.

I.e. if one record comes in after 2.5 mins, its value should represent that counter over 2.5 minutes. BUT if you are telling me each SNMP record represents 2 mins of data yet may arrive after 45 sec, then 3 mins, then 2 mins, that may not work.
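
The two cases can be compared with a short Python sketch. The traffic level (125,000 bytes/s) and the poll times are invented for illustration; the only thing the sketch shows is that dividing the counter delta by the timestamp delta gives the right answer when the timestamps match the moment the counter was read, and a distorted one when they do not.

```python
def rates(samples):
    """samples: (timestamp_seconds, counter) pairs in time order."""
    return [(c1 - c0) / (t1 - t0)
            for (t0, c0), (t1, c1) in zip(samples, samples[1:])]


BYTES_PER_SECOND = 125_000  # hypothetical steady traffic

# Case 1: uneven polls, but each timestamp is when the counter was actually read.
poll_times = [0, 45, 225, 345]
accurate = [(t, BYTES_PER_SECOND * t) for t in poll_times]

# Case 2: counter read every 120 s, but records stamped at uneven arrival times.
read_times = [0, 120, 240, 360]
skewed = [(stamp, BYTES_PER_SECOND * read)
          for stamp, read in zip(poll_times, read_times)]

print(rates(accurate))                  # [125000.0, 125000.0, 125000.0]
print([round(r) for r in rates(skewed)])  # [333333, 83333, 125000]
```

In the first case the uneven spacing is harmless. In the second, the same steady traffic appears to swing between roughly 83 KB/s and 333 KB/s.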

Let me know what you find out / think.

Hi @stephenb

Sorry for the delay, I've been playing. To answer your question, I'm using the SNMP input with a default polling interval of 30 sec. I've created the top visualizations with the counter rate you suggested. The metrics are repurposed copies of the graph, so all the settings are the same. The second row uses the longer method of 95th percentile, derivative, positive only, with the metrics similarly copied.

For the top row the panel settings are set to '>=1m' and the data options to 1 second. The second row data is still at 1 second, but I'm playing with the panel settings to smooth out the 95th percentile graph.

I've 2 questions. Firstly, what should I set the metrics to so that the max covers the full time range selected? I've tried changing the panel options from 'Last Value' to 'Entire Time Range' and I get no value ('-/s'). As things stand, my Max metrics are less than the graph peaks. For example, bits in is 541KB/s whilst the graph has a peak > 1MB.

Secondly, I have formatted the Byte fields at the Kibana pattern level as 0,0 bits as the interfaces are decimal (1000 rather than 1024 bits) and it makes no sense to end up with a fraction of a bit/byte.

Despite this, the values are still in kilobytes, not kilobits. The data options are set to 'Bytes'; if I create both as custom fields with '0,0 bitd' the graph fails ....

but it works for the metrics?? Which is a copy of the graph??

TSVB metric

Thanks in advance
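
On the bytes versus bits question above, the arithmetic being asked for can be sketched in Python. This assumes the counters hold octets (bytes), so the rate has to be multiplied by 8 at some point, and that decimal (1000-based) prefixes are wanted. The function name and unit labels are made up and are not meant to mirror what Kibana's 'bitd' pattern prints.

```python
def format_bits_decimal(bytes_per_second):
    """Convert a byte rate to a whole-number bit rate with 1000-based prefixes."""
    bits = bytes_per_second * 8  # octets to bits
    for unit in ("bit/s", "kbit/s", "Mbit/s", "Gbit/s"):
        if bits < 1000 or unit == "Gbit/s":
            return f"{bits:,.0f} {unit}"
        bits /= 1000


print(format_bits_decimal(125_000))  # 1 Mbit/s
print(format_bits_decimal(375_000))  # 3 Mbit/s
```

If a formatter only changes the unit label without the multiplication by 8, a byte rate will keep showing byte-sized numbers, which may be worth ruling out.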
