However, I can't get it working. The screenshots don't show all the settings and the documentation is, let's call it, a little sparse as well.
Basically, filling in the data as per the metric screenshot gives me completely incorrect data. I get negative bytes, and even something as simple as a sum gives incorrect results. E.g. the netflow sum is only 5KB over a 24-hour period, while a normal metric with sum of netflow shows the correct value, which is a lot higher. Also, I don't know how they managed to show KB/s. Working with derivatives results in negative graphs from time to time.
Any hints or tips? I was really excited about the 5.4 update as it looked like it would finally allow events per second to be handled properly. I'm sure it's just me being an idiot but the user guide isn't really much help here.
@Sjaak01 we actually have a blog being published next week with an instructive video to build the network traffic visualization (pink/blue in your screenshot). This configuration can equally be applied to a metric visualization type. Please find the configuration necessary in the screenshots below.
Make sure the following configurations are set:
The unit for each metric is set to 1s to provide the most recent result
You can apply a positive only aggregation to ensure the value is positive
For outbound traffic, a Painless script can be used to transform the metric to a negative value
Under the metric options, set the data formatter for bytes to get KB/s by default
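To make the settings above concrete, here is a minimal Python sketch of what they amount to on a cumulative byte counter. It is an illustration only, not Kibana's implementation: the function names and sample values are made up, and clamping negative values to zero is an assumption about what the positive only aggregation does.

```python
def per_second_rates(samples, positive_only=True, negate=False):
    """Derivative of a cumulative counter, scaled to a 1s unit.

    samples: list of (timestamp_seconds, counter_value), sorted by time.
    All names here are hypothetical and for illustration only.
    """
    rates = []
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        rate = (v1 - v0) / (t1 - t0)  # bytes per second between two samples
        if positive_only and rate < 0:
            rate = 0.0  # assumption: a drop (e.g. counter reset) is clamped to zero
        if negate and rate != 0:
            rate = -rate  # outbound traffic is drawn below the axis
        rates.append((t1, rate))
    return rates


def format_kb_per_s(rate):
    """Format a bytes-per-second value as KB/s (1 KB = 1024 bytes)."""
    return f"{rate / 1024:.1f}KB/s"


samples = [(0, 0), (10, 10240), (20, 5120), (30, 25600)]
inbound = per_second_rates(samples)
outbound = per_second_rates(samples, negate=True)
print([format_kb_per_s(r) for _, r in inbound])
```

Note that this only makes sense if the field really is a cumulative counter. If each document already carries the bytes for one flow, a derivative is the wrong tool.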
Gaps in the graph even though there is constant data usage. Netflow data isn't coming in at a steady rate. There are netflow packets listing large amounts of data usage and small amounts, all within one minute. Kibana should take the sum over e.g. one minute and divide it by the template value /s, but it's not doing that. I had the same problem with Timelion until the person maintaining it was nice enough to write the mvavg plugin.
Data usage is not calculated as an average over x amount of time. It's basically the same problem as above. The timeseries isn't calculating the sum over whatever it picked as an interval and then dividing it by value /s.
Slightly unrelated, but I noticed you cannot use scripted fields? I've got a scripted field that sums up my netflow.in and netflow.out fields. I can use my scripted field in other visualizations.
No data on absolute search. E.g. if I select past 30 minutes it shows data but if I then go to absolute and leave the start time as is (1pm) but change the end time to 5 minutes earlier (1:30PM -> 1:25pm) no data is displayed.
As you can see there are gaps in the graph. The noted speed is also way too high because I know for a fact it shouldn't have been much more than 1mbit.
Same timeframe as the time series above, but this time the data is displayed correctly because Timelion sums the netflow fields with an automated interval. Scale_interval is set to 1s and mvavg to 1m to avoid gaps in the graphs and get the correct results.
If you zoom out far enough the time series looks correct but once you get close up (in a quick test it appears to be under 4 hours) graphs and values are incorrect.
@alexf, @simianhacker, did you have time to look at this issue? I've looked at the blog entry but I don't think that will work when working with netflow data.
Another very strange issue I suddenly have is that only the time series visual is not showing any data. Not finding indexes or anything. All other visualizations are working fine.
@Sjaak01 Based on the data above with the big spikes, it looks like your data doesn't need a derivative because the out_bytes seem to be a rate already and not a cumulative number. For that chart I would just use average instead of derivative of max.
As for the missing data, it might be that your data is too sparse for the resolution and time range you're viewing. A bar chart might be a more appropriate visualization since it will display the sparse buckets accurately. In my opinion bar charts are actually more appropriate for sparse data; line charts are aesthetically pleasing when your data is complete.
@simianhacker I've tried max, average, etc., but all lead to incorrect results.
The issue is that Time series, and Timelion as well without the mvavg plugin, don't appear to calculate averages over time.
Example:
A firewall is sending out netflow packets every three minutes, or whatever you configure. When you start a big download, for example, netflow will output a big packet every three minutes that might include 1GB of bytes.out. Elastic will log this and in a visualization it will show as a single 1GB event.
This makes sense but in reality that 1GB figure is the total over a three minute time span.
If Time Series wants to display /s values correctly there should be a way to tell it to take the SUM of value X over Y period of time and calculate /s based on that.
In the case of netflow there will also be much smaller packets coming in between the big ones, and periods with no packets at all. That creates graphs with very high peaks and gaps unless you can tell the system to do its calculations over a longer period of time rather than on a per-event basis.
The MVAVG plugin for Timelion does that and it solves the issue of strange looking graphs.
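The calculation described above (sum value X over period Y, then divide to get a per-second figure) can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not how Time Series or the mvavg plugin is implemented; the function name and the sample events are made up.

```python
from collections import defaultdict


def rate_per_second(events, window_seconds):
    """Sum byte counts per fixed window and divide by the window length.

    events: iterable of (timestamp_seconds, byte_count), one per flow record.
    Returns {window_start_seconds: bytes_per_second}.
    Hypothetical helper, for illustration only.
    """
    totals = defaultdict(int)
    for ts, nbytes in events:
        window_start = int(ts // window_seconds) * window_seconds
        totals[window_start] += nbytes
    return {start: total / window_seconds
            for start, total in sorted(totals.items())}


# Three flow records; the first two fall in the same 180-second window.
events = [(0, 900), (60, 900), (200, 360)]
print(rate_per_second(events, 180))
```

With a 180-second window, a single 1GB record (1,000,000,000 bytes) works out to roughly 5.6MB/s for that window instead of showing up as a 1GB spike. Windows without any records are simply absent from the result, so they would still show as gaps.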
If your incoming data rate is high enough this might not be an issue, but judging by a quick search, a lot of people want to do bytes per second in Kibana and run into trouble because of the above.
It got fixed on Timelion and I'm hoping there will be a similar fix for Time Series. I definitely think a lot of people would be happy with such a function.
Whether I pick sum or average as the first aggregation with moving average as the second, the results are much higher than they should be. This is also the case if I set the interval to 3m (my netflow output setting).
With sum and the interval set to automatic I get graphs reporting 34MB/s. With the interval set to 3m I get over 100MB/s.
With average I get something like 15MB/s, and 4MB/s with the interval set to 3m.
It also would be nice if Time series would draw a line between data points instead of leaving it empty, causing gaps. This does not make sense if you set value /s.
The real speed was 1MB/s and the Timelion graph using mvavg does display that correctly.
Making gauges is even harder. I would expect that having the sum of out.bytes with an average of out.bytes (would be overall average and sum of out bytes I suppose) would give me the average speed but it doesn't. Instead it gives a totally unbelievable number that makes no sense.
I don't want to sound negative, because I'm sure it's mostly me that is the problem and the product and support are great, but better documentation is needed. I have no clue what most options do, and explanations and examples are thin and spread out.
Every visualization should have an explanation of all options, even if that means you're going to end up with the same explanation 20 times.
I suggest setting up a device that exports netflow and see for yourself. It's the standard in networking. I don't think you'll get good graphs or gauges, at least not in a way that the average user would think of when creating visualizations.
Edit: I don't think it's possible to use scripted fields that do work with other visualizations.
Line charts are probably not the best choice for sparse data like this; I would change to a bar chart. As for connecting the lines, it's not as simple as just adding lines between the gaps; check out this GitHub issue: https://github.com/elastic/kibana/issues/11793
If your data is already a rate coming from Netflow then you just need to take the average (which is what it looks like). Without seeing your actual data it's hard to give you any advice on what the data is vs. how Elasticsearch is aggregating it. Can you paste in the Timelion expression you're using?
Thanks for the link, I understand the issue a bit better now.
Being able to "connect the dots" would still be very helpful, as was mentioned by you and others. In my case I just cannot get data into Elastic fast enough to avoid gaps. But if I show a user a bandwidth chart with gaps in it and tell them to just ignore those unless it's a very long gap (indicating downtime), they will never accept that. They simply won't understand.
Given how netflow works you're never going to catch 100% of the details anyway, so for my use case drawing lines between two data points, or calculating the average over x amount of time, is acceptable.
In my case this would result in a more "correct" display of data compared to having gaps.
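For what it's worth, "connecting the dots" amounts to linear interpolation between the nearest known buckets. A minimal Python sketch of the idea, assuming empty buckets are represented as None (the function name and sample values are made up, and this is not something Time Series offers as described in this thread):

```python
def fill_gaps(values):
    """Linearly interpolate None entries that sit between two known values.

    Leading and trailing None entries are left alone, since there is
    nothing on one side to interpolate from.
    Hypothetical helper, for illustration only.
    """
    filled = list(values)
    known = [i for i, v in enumerate(filled) if v is not None]
    for left, right in zip(known, known[1:]):
        span = right - left
        for i in range(left + 1, right):
            frac = (i - left) / span
            filled[i] = filled[left] + (filled[right] - filled[left]) * frac
    return filled


buckets = [2.0, None, 4.0, None, None]
print(fill_gaps(buckets))
```

The trailing empty buckets stay empty, which is the case where a long gap (for example real downtime) should remain visible rather than be papered over.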